Modular research with Octopus and Hypergraph

In this post, Alexandra Freeman and Chris Hartgerink (the founders of both projects) talk about what makes their projects similar and what sets them apart, in order to better understand future work that can be done cooperatively.

Octopus and Hypergraph are both projects trying to change how researchers communicate their work. By changing the unit of publication from an entire research project to individual steps, they both try to improve reproducibility, access, and other issues in research.

CHJH: Alex, I’m still amazed we came up with this around the same time. I published about it in April 2018, and you won an award from the Royal Society in November 2018. Why do you think ‘as you go’ research publishing has not happened before?

AF: I know - ‘micropublishing’ has been a bit of a niche interest until recently. I think that the world of academia (or at least academic science) tends to be rather conservative. Researchers are used to using the internet to its full capabilities in every other part of their lives, but when it comes to the most fundamentally entrenched part of it, I think it’s too easy to think that ‘this is the way it’s been done forever, so this is the way that it should be done’. It’s also easy for researchers to feel powerless, as if the ‘publishing system’ is something that is imposed upon them and not something they themselves can influence. I think there have been others who thought like us in the past, but I feel that only now is there a growing consensus that ‘something has to change’, and that is hopefully inspiring more people to think about and discuss increasingly radical models - like micropublishing.

AF: So my question to you now, Chris. We each have our ideas about how and why breaking up the ‘paper’ as the unit of publication is so important. What was the main factor that led you to this way of thinking?

CHJH: In 2017, the International Federation of Library Associations (IFLA) asked me to contribute to their conference session “Being open about open.” In preparation for that, I asked myself: what could the publishing system look like if we started from scratch today, knowing what we know and with the tools available to us? This thought exercise resulted in the first steps down this road. What would a system look like that included preregistration from the start, instead of as an afterthought? What would a system look like that could preempt selective publication based on results? I started thinking about this step-by-step process, and it crystallised that a lot of issues arise from the ‘after the fact’ nature of the article. When I found a 1998 article by an Elsevier researcher proposing something very similar, I felt like the time was more ripe now than back then.

CHJH: I notice you call it ‘micropublishing’, where I call it ‘as you go.’ When I think of micropublications, I think of micro papers containing one specific finding, which is different. Could you expand a bit on what you specifically understand by micropublishing and why you call it that?

AF: Oh, our journeys are so similar!

So, micropublishing is a term that other people have used, and I took it to mean publishing in smaller units than a paper rather than mini-papers - but maybe I’m misunderstanding!

Although, when explaining what you and I mean by publishing in ‘smaller units’ than a paper, I sometimes say things like ‘publishing as you go’, I also think that phrase can be misleading. I want the system to enable a different kind of working, and to encourage greater specialisation within scientific research. By that I don’t mean specialisation by field, but specialisation by skill-set. The most obvious example might be statistical analysis of data, which at the moment is often done by non-specialists (and sometimes dangerously badly as a result), but it is equally true of protocol design, or data collection (think of the very specialised laboratory or field skills required in many areas of science). I think we should be encouraging people to feel they can specialise and develop top-notch skills and get full credit for their work using those skills, without needing to do everything else included within a traditional paper.

So, it is not just ‘publishing as you go’ - doing the same as you would if you were writing a traditional paper and chopping it up. Sure, you can do that, but much more excitingly, you can avoid doing that: you can instead do the bits you enjoy/excel at/have the resources for at the moment and still get all the credit you deserve for doing that work excellently. That, I think, will open up science a lot more, really make it a ‘collaborative project’ - where you can add pieces to the grand jigsaw of scientific research, collaborating with others who may live in a different place and a different time and speak a different language from you.

AF: So, the big question! Why has the system not changed despite people having been talking about this for more than 30 years? And how do we make it change now?

CHJH: The language surrounding this is fascinating. Hearing your perspective makes me wonder whether ‘as you go’ is clear enough (I’d been reading it more like what you envision). Why the system hasn’t changed is a good question, and on a short timeframe it does feel like it hasn’t.

I like to think about the historic level of change for this. Just like the printing press took over 150 years to revolutionize how we shared information through books, the Internet and the Web are still young at around 40 years. So it is not that the system has not changed; it is changing. We are part of that change.

Part of why the change has been slow until now, I think, is the innovation dilemma. The established publishers have very little internal incentive to change their business model, and only respond to external pressures. We saw this with the Open Access movement putting external pressure on the paywall model, and I think we need something similar with the article. We need to push to do research differently, give people easy ways to change their research habits, and make it worthwhile. I like how Brian Nosek talks about change; he mentions making it possible as the first thing to do. I don’t think we’ve made micropublishing, as you go publishing, or modular publishing (whatever we want to call it) practical as of yet. In that sense, Octopus and Hypergraph are the first to bring this possibility into reality, even if some of the ideas are around forty years old.

CHJH: This is going to be a long process, which means we need resources to stay around to make this change. How does Octopus plan on staying around financially? Are you looking to receive contributions from institutions, researchers, publishers?

AF: Just to respond first to your comment above about ‘making it possible’ as the first step: exactly! Although I talked about Octopus a little when I first had the idea, I still feel I can’t really shout from the rooftops until, as you say, there’s something there that practically works and is a real alternative. Just bring on that day!

Now to the really problematic issue, as you say: money and sustainability.

I’ve been scraping Octopus together on a shoestring, just to be able to get a proof-of-principle. I’m not technically skilled myself, so I’ve had to pay professionals (and some have generously donated their time for free). That’s possible from awards, donations and my own resources in the short term. In the long term, what I’m planning is first of all to design the system to minimise running costs (expensive human elements). Then, I’d hope to be able to run it off ‘contributions’ - as you nicely put it - from institutions. I’m thinking of government and funders and/or academic institutions. When you think how much they currently pay to commercial publishers (hundreds and hundreds of millions of pounds per year), I am confident that Octopus would be an extremely worthwhile small investment as an alternative. What’s important to me is the principle that the primary research record is free to publish in and free to read.

But that’s only looking at it as a means of providing access to scientific research. Actually what Octopus aims to do is fundamentally improve the whole of scientific research - and to my mind that’s worth almost any amount of money.

AF: I’m interested to hear your thoughts around Hypergraph’s sustainability, and also what you think are the important differences between Octopus and Hypergraph. Do we have any? Should we merge?!

CHJH: The current system is indeed much too expensive for what it brings. I remember having to check a typeset manuscript for newly introduced mistakes! I think the value proposition is complex, but with much room for improvement.

At the moment, Hypergraph is 100% dependent on the Shuttleworth Foundation and we have an operating budget of circa $275,000 a year. With just this funding, we can run for at most another two years (until 2022-08-31). That is the scenario if we had $0 revenue. We’ve been thinking every day about ways to generate our own revenue, and how to do so ethically. We want to be sustainable but not become like Elsevier in the process.

To that end, we’re introducing supporting memberships for Liberate Science, to have a cooperative process of deciding what’s important to implement in Hypergraph or in other Liberate Science projects. Subsequently, we will also ask supporting members whether they consent to specific business models. We have a lot of ideas regarding text and data mining, research evaluation and planning, and look forward to building Liberate Science with our community. It is like a society membership, but with more opportunity for active participation.

As to the differences between Octopus and Hypergraph, the most crucial differences to me are (1) standardisation of the research process and (2) the nature of the infrastructure.

First, the standardisation of the research process. In Octopus, the order and length of a research journey are predefined, and when I tried to upload data (a CSV file), I couldn’t find a way. In Hypergraph, the researcher decides which steps, which files, which order, and the length of a research process. It is messier, but provides more flexibility.

Second, the infrastructure of Octopus is centralised, whereas Hypergraph is distributed and peer-to-peer. This means that the Octopus infrastructure depends on the entity that controls it, whereas we at Liberate Science actually cannot control Hypergraph. When Cambridge University Press was asked by the Chinese government to take down some content critical of it, they could. Octopus technically could as well, although I know you wouldn’t want to. In Hypergraph, we could not take content down even if we were asked to.

CHJH: Am I wrong? What do you think about when you think of Hypergraph and Octopus? Are these differences reconcilable (even if that conversation probably needs to take place outside a blogpost)?

AF: Wow - I dream of that kind of funding! Congratulations!

I think you’re right about those core differences. Defining the ‘order’ in which research is published is, for me, important because it stops people leaving stuff out. I can’t think of any good reason why someone should be able to publish, say, an analysis without the data/results being available too. The same goes for publishing data/results without the method/protocol being available, and so on.
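To make that ordering rule concrete, here is a minimal sketch. It is not how Octopus is implemented: the three step types are just the ones named above, and `REQUIRED_PARENT` and `can_publish` are hypothetical names chosen for illustration. The idea is simply that a publication is accepted only if it links to an already published item of the preceding type.

```python
# Hypothetical ordering, using only the step types named in the conversation.
# This is an assumption for illustration, not the actual Octopus schema.
REQUIRED_PARENT = {"protocol": None, "data": "protocol", "analysis": "data"}


def can_publish(step_type, parent_id, published):
    """Return True if a step of `step_type` may be published.

    `published` maps publication id -> step type for everything already public.
    """
    if step_type not in REQUIRED_PARENT:
        raise ValueError(f"unknown step type: {step_type}")
    required = REQUIRED_PARENT[step_type]
    if required is None:
        return True  # the first step in the chain needs no parent
    # The parent must exist and be of the type that precedes this step.
    return published.get(parent_id) == required


published = {"p1": "protocol", "d1": "data"}
print(can_publish("analysis", "d1", published))  # analysis linked to published data
print(can_publish("analysis", "p1", published))  # analysis linked straight to a protocol
```

Under these assumptions the first call is accepted and the second is rejected, because an analysis cannot skip over the data it depends on.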

It also helps findability. I have real issues with platforms - and there are several out there - which accept many ‘types of publication’ but which are really just big ‘bins’ into which these publications get thrown. Although search engines are increasingly good, there are just far too many publications ‘out there’ in the world (of very variable quality). The amount of overlap, and the resources wasted in constantly repeating both research and the ‘introductory’ bits reviewing what is already known at the start of every paper, are just mindblowing. So, I think having a structure to the publication platform, as well as a universality to it (everything in one place), could be hugely beneficial.

As for infrastructure - that’s an interesting one. I don’t think I really understand how Hypergraph’s ‘distributed’ nature works, then. Octopus also has a distributed back end design, but that’s just for practical reasons (using libraries’ servers as a system of mirrored back-end databases to ensure capacity and reliable back-ups). Octopus also only aims to be the central hub - not to be the source of storage for large amounts of data: we expect people to use specialised data repositories, code repositories, etc for most non-text publications. Octopus would merely host the DOI/URL and some description.

To that end, Hypergraph material could just as easily be part of the Octopus network: researchers would simply post the DOI/URL of their Hypergraph-published work into Octopus (along with the necessary metadata). Maybe that’s the best way for us to ‘merge’ in the short term: to develop tools that make it ‘one-click’ to add material from one to the other.

(Although I need to get my skates on and get Octopus ready for launch as I’ve fallen behind you now!)

CHJH: Thank you Alex 😊 The Shuttleworth Foundation is one of the few places with the right conditions for a project like this; venture capital funding would not have been a good fit!

Findability is key indeed, to prevent the infrastructure from becoming one big blob. We use Wikidata to structure the research steps and make it easier to find things of interest (data, code, theory, etc.). I agree about the big bins as well. I’ve had this issue when people link out to OSF projects and it takes me fifteen minutes to figure out which file I actually need to open. Everybody uses different project structures and it can be highly confusing. In Hypergraph, researchers have to indicate one main file per step, like the writing or the data file, and the rest automatically becomes supporting materials (e.g., codebooks).
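The ‘one main file per step’ rule can be sketched roughly as follows. This is an illustration only, not Hypergraph’s actual data model: the `Step` class, the `make_step` helper and the file names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    """One research step: a single main file plus supporting materials."""
    title: str
    main: str
    supporting: list = field(default_factory=list)


def make_step(title, files, main):
    """Build a step; every file other than `main` becomes supporting material."""
    if main not in files:
        raise ValueError(f"main file {main!r} is not among the step's files")
    supporting = [f for f in files if f != main]
    return Step(title=title, main=main, supporting=supporting)


# Hypothetical example: a data step with a codebook and a readme alongside it.
step = make_step(
    "Survey data",
    ["data.csv", "codebook.pdf", "README.txt"],
    main="data.csv",
)
print(step.main)
print(step.supporting)
```

A reader opening this step would be pointed at `data.csv` first, with the codebook and readme listed as supporting files, which is the findability benefit described above.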

It is interesting to see how much we agree and yet how different our perspectives are. It also highlights how excited I am to see both our projects mature and see change happen. There’s much work to be done and I am happy to see us on the same side of it!

This was a conversation between Alexandra Freeman and Chris Hartgerink. They are the founders of Octopus and Hypergraph, projects that reimagine research publishing to be more continuous and modular than the research article.

Join us on our open journey!