ResearchEquals Archival strategy

When we publish research, we are committing to preserving that research content for as long as we can. At the same time, we need to be realistic and recognize that Liberate Science (“the business”) is still young and it remains uncertain how long we will be around. How do we make sure copies of the research are archived and available in futures that may or may not include the business?

To that end, we are today introducing the first iteration of our archival strategy for ResearchEquals. This first version is about ensuring that research published on ResearchEquals remains available beyond the immediate existence of Liberate Science as a business.

We call this the immediate archive strategy.

Right now, availability of ResearchEquals content is dependent on the business paying a set of bills (website services and content hosting, primarily). If tomorrow we stopped paying those bills for whatever reason, content would start disappearing from the scholarly record in a matter of weeks. We can and will keep paying those bills, but we are a small organization and we need to prepare for unlikely but high-risk events that could result in exactly such a situation. Even if no such situation arises, the immediate archive strategy creates a base layer to continue building on.

For the immediate archive to ensure availability of ResearchEquals content, we need…

  • …it to be readily available and up to date
  • …it to be fully automated
  • …its storage to come with minimal cost, promoting longevity
  • …to archive all content for which we register DOIs (i.e., modules and collections)
  • …a low effort way to switch to it when needed, without breaking existing links (e.g., DOI links)
  • …to increase the bus-factor of domain administration, such that switching to the archive remains possible in case of an incapacitating situation (e.g., fatal accident)

The ResearchEquals archive does not need to ensure new content can be published — only that the existing content remains available.
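To make the coverage and freshness requirements more concrete, here is a minimal sketch of an automated check. It is an illustration under stated assumptions, not the actual implementation: the record shapes (`Published`, `ArchivedCopy`), the function `audit_archive`, the one-day tolerance, and the DOIs are all hypothetical placeholders.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Published:
    """A module or collection with a registered DOI (hypothetical shape)."""
    doi: str
    updated_at: datetime


@dataclass(frozen=True)
class ArchivedCopy:
    """What the archive holds for one DOI (hypothetical shape)."""
    doi: str
    archived_at: datetime


def audit_archive(published, archived, max_lag=timedelta(days=1)):
    """Return (missing, stale) lists of DOIs.

    missing: DOIs that have no archived copy at all.
    stale:   DOIs whose archived copy lags the published version by more
             than max_lag (an assumed tolerance, not a stated requirement).
    """
    copies = {copy.doi: copy for copy in archived}
    missing, stale = [], []
    for item in published:
        copy = copies.get(item.doi)
        if copy is None:
            missing.append(item.doi)
        elif item.updated_at - copy.archived_at > max_lag:
            stale.append(item.doi)
    return sorted(missing), sorted(stale)


if __name__ == "__main__":
    now = datetime(2024, 1, 10, tzinfo=timezone.utc)
    # Placeholder DOIs, for illustration only.
    published = [Published("10.5555/a", now), Published("10.5555/b", now)]
    archived = [ArchivedCopy("10.5555/a", now - timedelta(days=3))]
    print(audit_archive(published, archived))
```

Run on a schedule, a check like this would flag gaps without manual effort, which is the point of the "fully automated" requirement.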

Such an immediate archive would expand our time horizon from several weeks to several years. We can also use the immediate archive as a redundancy measure: because it is readily available at any time, it can even improve access when the main site is temporarily down. This way, all content remains available even in case of an outage.
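One way to picture the redundancy idea is a fallback rule that keeps the same path on both sites, so existing links keep working whichever one serves the content. The sketch below is only an assumption about how this could look: both base URLs, the function `landing_url`, and the health check passed in as `main_is_up` are hypothetical.

```python
from typing import Callable

MAIN_BASE = "https://main.example.org"        # hypothetical main site
ARCHIVE_BASE = "https://archive.example.org"  # hypothetical archive mirror


def landing_url(doi_suffix: str, main_is_up: Callable[[], bool]) -> str:
    """Choose which site should serve the landing page for a DOI suffix.

    Both sites are assumed to use the same path layout, so switching only
    changes the host and never the path that a DOI link points to.
    """
    base = MAIN_BASE if main_is_up() else ARCHIVE_BASE
    return f"{base}/{doi_suffix.lstrip('/')}"
```

Passing the health check in as a function keeps the rule itself free of network calls, so it can be tested without either site being reachable.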

The first iteration of our archival strategy is clear: We need an immediate archive. This is not the standard (C)LOCKSS archive where copies are stored in many places, or another archival provider mechanism that requires depositing content elsewhere. First, we focus on what we can do locally to extend our timeline.

This strategy is only the first step. Nonetheless, this strategy provides concrete requirements that, once fulfilled, ensure ResearchEquals content stays available regardless of the business’ existence. We will implement this in a way that requires minimal operating costs and minimal intervention in case of need.

Upon implementation of this strategy, we can start thinking about securing the published content for even longer time horizons than several years (e.g., 10 years, 100 years, 1000 years). We'll follow up in the coming months about how we are going to implement the immediate archive.

Join us on our open journey!