The migration of the Neotoma database and software is nearing completion. It’s been a complex process and we apologize for any delays or disruptions caused by it. Here’s an update on where things stand. See also this summary table of individual components, their current status, and locations.
Overview: This has been a complete, two-stage migration of the Neotoma database and all primary software services (APIs, DOIs, Explorer, neotoma R package, Tilia). In the first stage, we moved everything from an older Windows server to a newer Windows server at Penn State. This stage began in mid-July and was completed in late August, and was triggered by a reorganization of IT services at Penn State. In the second stage, we moved everything from Windows (Windows server, SQL Server implementation of the database) to Linux (Linux server, PostgreSQL implementation of the database). This second-stage migration, in the works for several years, is motivated by sustainability goals. By shifting to an open-source software framework (Windows software is proprietary; Linux software is open-source), we lower software licensing costs and broaden the pool of developers and scientists able to support and enhance the code.
Current Status: Two versions of the Neotoma ‘stack’ (i.e. the database and accompanying software) are currently running (see this summary table for details about individual components, their current status, and their locations).
The original SQL Server version of NeotomaDB is still running on a Windows server and all components are active. However, all data uploads to this version ceased in late July 2020, so this is essentially a frozen version, maintained to support backwards compatibility. Individual software components are stable, but the new server seems to be slow, so larger data retrieval queries may take longer than before. This version is suitable for educational purposes and for comparative testing of research scripts against the old and new versions. We will maintain this version for as long as feasible and for at least the next several months; we will definitely keep it up while we wrap up the last fixes to the new version.
The new PostgreSQL version of NeotomaDB is now running on a Linux server and the backend database is fully migrated and operational. Eric Grimm has begun test uploads of real data to the new database, so the new PostgreSQL version has advanced slightly beyond the Windows version. The status of individual components is as follows:
APIs: Migrated and released, with V1.5 and V2.0 the current versions. Current efforts are focused on final testing and updating documentation.
Tilia: Now migrated and in testing with a few stewards. We anticipate a full release soon, i.e. in the next week or two.
Explorer: Running and in the last stages of bug fixing (e.g. making sure all links and pointers to APIs are updated correctly; removing case sensitivity). There are two current versions: Windows Version (Stable, FrozenDB) and Postgres Version (Beta Testing, LivingDB). Once we get past this migration, we will start releasing updates to Explorer that enhance its functionality; several are in development.
DOIs: The code for minting DOIs was written entirely for the PostgreSQL version of the database and primarily draws upon the API 2.0 services. So, simple DOI-minting capabilities should be ready soon after the API documentation is updated, with new functionality to be added over the next several months.
Neotoma R: A V2.0 package is at the design and development stage; this new version will both point to the PostgreSQL version of the database and include new features in response to user requests. The 1.0 package is functional and points to the Windows version of the database.
For Further Updates: You can track updates on the Neotoma Slack channel. We will also post quick updates to this summary table on an approximately weekly basis. The status of individual components is changing quickly, as we port over individual components, find bugs (or have them reported to us), and fix them. The written summary above is dated Oct 10, 2020. See also this technical documentation. Going forward we will post quick updates to the summary table and longer News updates as needed.
-Jack Williams, Jessica Blois, Simon Goring, Eric Grimm, Doug Miller, Jonathan Nelson, & Mike Stryker
Posted by Jack Williams on 10/13
For the last several years, the Neotoma IT team (Eric Grimm, Simon Goring, Mike Stryker, and Steve Crawford; special thanks to Anna George, Jack Williams, and Jessica Blois) has been migrating the Neotoma relational database from SQL Server (Windows-based) to PostgreSQL (Linux). Most of the effort associated with this migration has gone into transferring and testing all Tilia Data Steward services. The goal of this migration is to make Neotoma more sustainable and easily supported by a network of open-source developers.
All migration work so far has been done on a development version of the database, leaving the mainline production version (i.e. the one used by most people) unchanged. As the last stage of the migration process, we are now switching over the production version of the database. This process began Aug 10 and we hope to have the first wave of migration done by Friday Aug 21. Once this migration is complete, there will be two versions of the database: a primary living version in PostgreSQL, and a secondary static version, maintained for backwards compatibility, on SQL Server. The migration is anticipated to affect services as follows:
Explorer: Explorer services have been migrated to the new server and tested. There may be a few temporary disruptions, with a known bug to be fixed for Stratigraphic Diagrammer.
Tilia: Tilia has been migrated and tested against the PostgreSQL server and the updated API (currently at http://tilia-dev.neotomadb.org). Some Tilia-server interactions may be slower than in the past as we optimize the Postgres database, but should return to normal shortly.
API: There are new generations of APIs (v1.5, v2.0) that point to and work with the PostgreSQL version of the database, available temporarily here: http://api-dev.neotomadb.org/api-docs/ and a soon-to-be-established long-term URL here: http://api.neotomadb.org/api-docs/. The original APIs (v1.0) still point to the static SQLServer/Windows database. These V1.0 APIs are being renamed with a ‘wn’ prefix added, e.g. https://wnapi.neotomadb.org/v1/data/sites?sitename=Marion%
neotoma R package: The original neotoma R package is built on top of v1.0 APIs, and so points to the now-static SQLServer version of the database. The source code of the R package has been pointed to the new Windows server. These changes have been pushed to GitHub (http://github.com/ropensci/neotoma) and we are awaiting a full release on CRAN. A new Neotoma R package (neotoma2, breaking compatibility with the old package) will point to the living PostgreSQL version. V2 is expected to be released in late October, with pre-release expected soon. Development for this package can be found here: http://github.com/NeotomaDB/neotoma2
DOIs: As a newer feature, DOIs were designed to work with the PostgreSQL version. There may be a few bugs or temporary disruptions to DOI minting and accessibility through landing pages, but we anticipate that these will be brief.
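For scripts that still call the v1.0 APIs, the ‘wn’ renaming described under API above can be sketched in Python as follows. This is a minimal illustration rather than official Neotoma client code: the helper names (`build_v1_sites_url`, `add_wn_prefix`) are hypothetical, the site name is a placeholder, and the pre-rename host `api.neotomadb.org` is an assumption not stated in this post. The code only builds URL strings and makes no network requests.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit

# Host and path taken from the example URL given in the post.
WN_V1_BASE = "https://wnapi.neotomadb.org/v1/data"


def build_v1_sites_url(sitename: str) -> str:
    """Build a v1.0 'sites' query URL against the 'wn'-prefixed host."""
    query = urlencode({"sitename": sitename})  # percent-encodes the value
    return f"{WN_V1_BASE}/sites?{query}"


def add_wn_prefix(url: str) -> str:
    """Return the URL with 'wn' prepended to its host (no-op if already there)."""
    parts = urlsplit(url)
    host = parts.netloc
    if not host.startswith("wn"):
        host = "wn" + host
    return urlunsplit((parts.scheme, host, parts.path, parts.query, parts.fragment))


if __name__ == "__main__":
    # "Example Site" is a placeholder, not a real Neotoma site name.
    print(build_v1_sites_url("Example Site"))
    # Assumption: the old v1.0 host was api.neotomadb.org.
    print(add_wn_prefix("https://api.neotomadb.org/v1/data/sites?sitename=Example"))
```

Rewriting the host in one place, rather than editing each hard-coded URL, keeps older scripts pointed at the static SQL Server version without touching their query parameters.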
If you run into issues, please post queries to Slack (neotomadb.slack.com) or send an email to email@example.com
Posted by Jack Williams on 08/13
Nice reporting by Science magazine on a talk presented at the 2020 Ecological Society of America meeting by Allison Stegner and Trisha Spanbauer, showing evidence from Neotoma that the impacts of human land use on ecosystems were as large as or larger than the effects associated with the end of the last deglaciation: https://www.sciencemag.org/news/2020/08/humans-have-altered-north-america-s-ecosystems-more-melting-glaciers
Posted by Jack Williams on 08/13
Beginning Wednesday, July 1, Neotoma will be down for several days during a server upgrade at Penn State University. This downtime will affect the Neotoma database and all software services associated with it (e.g. Neotoma Explorer, APIs, R package, Tilia-to-Neotoma services). We do not yet know the exact duration of this downtime but hope it will be less than a week. All Neotoma services on the new server are being tested during this upgrade, to ensure that everything runs smoothly afterwards. However, there is a possibility that some Neotoma services will need patching after this server upgrade.
We apologize for the short advance notice of this upgrade and for any disruptions caused. We are monitoring this situation closely and will post updates as soon as we have them. If you run into any issues during or after the server upgrade, please contact the Neotoma Executive Committee at Neotomafirstname.lastname@example.org or post a note to the NeotomaDB Slack workspace (neotomadb.slack.com #it-dev).
-Jack Williams, Jessica Blois, Steve Crawford, Simon Goring, Eric Grimm, Doug Miller, Mike Stryker
PS Many of you know that we are also working on a backend migration of the Neotoma database from a SQL Server/Windows server to a PostgreSQL/Linux server. That migration is nearly complete but is separate from this PSU server upgrade. We likely will need to schedule a second downtime window to complete this migration. We will send out updates as soon as we establish the timeline for this database migration.
[Update 6/26/2020 3:13pm US CT: changed contact email to Neotomaemail@example.com]
[Update 8/5/2020 The server upgrade is underway and we are experiencing a few broken links as we repoint services from the old PSU server to the new one. Additionally, services are running slower than normal, which we are working on as well. We are working to get these issues resolved as fast as possible.]
Posted by Jack Williams on 06/26
The Leadership Council approved the new bylaws adding early-career representatives to the council as full voting members! 9 people voted, 9 yes to 0 no. In addition, after internal discussion the LC unanimously voted to elevate the ad hoc ECR reps to full voting reps rather than holding a new election, since they were just voted into their roles in late fall 2019. So welcome to Dana Reuter and Suzette Flantua to the Neotoma Leadership Council! They will both be on the council through the end of their original term, December 2021.
Posted by Jessica Blois on 05/16
The Leadership Council unanimously approved the addition of a new constituent database to the Neotoma Paleoecology Database! The new database is named PaVeLA (PaleoVertebrates of Latin America). Joaquín Arroyo-Cabrales (Instituto Nacional de Antropología e Historia, México) is the lead for the database, and is working with graduate student Deborah V. Espinosa-Martínez and overall vertebrates lead Jessica Blois on integrating the Quaternary Mammals of Mexico Database into PaVeLA. If you are interested in adding sites to the database, or learning more, please contact Joaquín Arroyo-Cabrales (firstname.lastname@example.org) or Jessica Blois (email@example.com).
Posted by Jessica Blois on 04/22
Neotoma and the University of Wisconsin will be hosting a workshop to lay the groundwork for a robust cyberinfrastructure dedicated to supporting open access to ancient environmental DNA data and integrating it with existing cyberinfrastructure in genomics and paleoecology. This workshop will be held May 18-20 in Madison, WI, USA. The workshop is supported by the National Science Foundation and standard travel costs will be covered for all participants.
This ca. 20-person workshop includes several open participant slots specifically dedicated to early career scientists and scientists from underrepresented populations. If you are interested in attending, please email your CV and a short statement of research interests to Jack Williams (firstname.lastname@example.org) by March 6.
Planned activities include: 1) review the current state of the art with respect to ancient DNA data generation, data handling, data archival, and analytical pipelines; 2) review the latest advances and capabilities of existing cyberinfrastructures in genomics and paleoecology; 3) identify paleogenomic-relevant gaps and semantic misalignments among existing resources; 4) establish priorities and initial standards for data and metadata reporting in community paleodata resources. The Steering Committee for this workshop is Inger Alsos, Jessica Blois, Mary Edwards, Eric Grimm, Laura Parducci, Beth Shapiro, and Jack Williams.
If you have any questions please reach out to Jack Williams.
Posted by Jack Williams on 02/17
The Neotoma Data Use Policy has been updated and posted. Neotoma data are provided under a CC-BY 4.0 license and are open for use with citation. The revised Data Use Policy clarifies that proper citation of Neotoma data operates at three levels: Neotoma itself, the Constituent Database(s) that curate these data, and the original data providers. Appendix 1 provides standard citation endpoint and acknowledgment text for Neotoma itself, while Appendix 2 provides preferred citation endpoints and acknowledgments for individual Constituent Databases.
Posted by Jack Williams on 01/15