Chris Hartgerink, the founder of Liberate Science, discusses why and how they integrated ROR into the modular publishing platform ResearchEquals for author affiliations in user profiles and Crossref DOIs and explains why they live streamed all eight hours of the work.
“Integrating ROR just makes it very easy to match all these authors to the same institution. And it also lets you get all the metadata related to an institution so that you can display institutions visually and add more information, like when the institution was founded and where it’s located, all this information that you wouldn’t get from just a raw string.”
“So when people add an affiliation, they all use ROR. I have not heard any feedback that people can’t find their institution. Sometimes it might be under a slightly different name than they expect, especially if there’s a bilingual name or if it’s known under different names.”
“A nice thing about ROR is that it is living, breathing – with the continual releases, there’s new organizations being added, making it more worthwhile without us doing anything. As the data get richer and the usage gets richer, I think that’s really where the added value is.”
– Chris Hartgerink, Founder, Liberate Science
Hello, and thank you for participating in this case study interview. Can you tell us your name, title and organization?
Hi, everyone, I’m Chris Hartgerink from Liberate Science. I’m the CEO and founder there, and before I was a practicing researcher in meta-research.
And tell us about Liberate Science. When was it founded? What does it do? What makes it special?
Well, Liberate Science was founded back in 2019 when I left academia. I wanted to keep working on improving research, but not within the academic institution, because I tried that, and even though it might work for some people, it definitely didn’t work for me. I always felt like I was working towards not having a loss instead of towards wins. What’s different about Liberate Science really is that it tries to stay as independent as possible so that we can grow serious alternatives to all aspects of doing research. Right now our focus is on a serious alternative to the big publishers and an alternative to the article format. And so we built ResearchEquals, which was the first modular publishing platform. If you like that, it’s quite special.
Wonderful. What are these alternative formats in scholarly publishing?
Alternative formats are really nice, because they vary in both size and content. Of course, the book is the largest entity of publishing, and the article is a smaller one. One alternative format is the micropublication, which is a small article, but it is still an article. Then there’s modular publishing, which can be as big as a preprint, but it can be as small as a conference abstract. And then there’s also nanopublications, which are the smallest entity I know, where it’s just the statement that a certain variable positively predicts another variable, and that is then one publication.
Alternative formats are not necessarily paper-based. The nice thing about them is that they’re creating diversity in the publishing space. We’re used to reading text, but why not also have podcasts be part of the scholarly record? And videos, and study materials, and blogs? We’re really seeing much more diversity in the outputs and that increased diversity in what gets published will be good for the assessment of researchers as well.
Fascinating. So when and where did you personally first hear about ROR?
I heard about ROR for the first time when it was still in development. In following along over time, it just made sense to me: ROR is to organizations what ORCID is to individual researchers. Of course we want to know where people come from, what their affiliation is, and have that be entirely unambiguous. And so I followed along on GitHub a bit and saw some of the discussion happening there. I really like this community-based curation model. As with any open source work, it always takes a bit of time. But I would say I’ve known about ROR since 2019.
We calculate ROR’s official launch date as January 2019, which means that ROR’s fifth anniversary is coming up in January 2024.
Yes, there will be events! We’re beginning to plan. Any suggestions? What would you like to see happen at ROR’s fifth anniversary?
I would be very interested in some creative celebrations, something that really marks the moment, because I think you and the team have achieved a lot in those five years. And I think that’s worth literally celebrating. So maybe a virtual party?
I love it. Sounds amazing.
There are all these artists and drag shows and performances on Zoom. I don’t know whether they still do it.
At Crossref, we once had an origami lesson on Zoom at an all-staff meeting, which was wonderful. Although that was maybe more meditative than celebratory. And speaking of creativity, I so admired your live streaming of your integration of ROR into ResearchEquals! I thought that was something that more people should emulate. And it was so useful for us to see. Can you tell us about that, and about why you decided to integrate ROR into ResearchEquals?
ResearchEquals is a modular publishing platform, so we have a bit of a different situation from regular manuscript processing. With us, every researcher or author can make their own profile. And it’s not that they create an account and then go through a whole submission process and only then somebody else says, “Okay, now it gets published.” On ResearchEquals, you as the author or authors get to say, “Okay, now this becomes public.” It’s a bit more like a repository in that sense.
One of the things that came up was “What metadata do people want to add to their profile?” We started out with things like author names, their pronouns, their website, and a bio, but one of the things that was coming up was “How can we discover other people?” Authors also started saying, “We have a certain affiliation, and we want to be able to add that.” So then I started looking into that, and I remembered ROR! I was already inclined to integrate ROR anyway.
This way, we have the option in the future to also say, “Well, I’m not interested in specific authors, but I’m interested in specific organizations.” Integrating ROR just makes it very easy to match all these authors to the same institution. And it also lets you get all the metadata related to an institution so that you can display institutions visually and add more information, like when the institution was founded and where it’s located, all this information that you wouldn’t get from just a raw string.
Plus, there’s the risk, as you also know, that I might fill out my affiliation ever so slightly differently than someone else would. ROR was also a cost-effective way to implement affiliations. So all in all, there were many pluses in the pro column for ROR.
And why did you decide to live stream it?
The live streaming was actually very fun. I was thinking about integrating ROR, and then I thought, “Okay, but I have so many other things on my plate. How do I actually plan this properly?” It was a challenge to myself. It was four weeks, four two-hour slots, where I said, “Well, I’m just going to plan those two-hour slots for every week, and then I’m going to get it as far as I can.” Even if nobody watched, the live streaming would help me to plan and to have this experience of rubber duck programming, where you try to explain to your rubber ducky what you’re actually doing. It ended up being incredibly helpful. In the live stream itself I got some feedback that helped me along the way and made me look at things differently.
And so yeah, the idea was for self planning, but also to take the open source aspects of ResearchEquals to the next level. Or maybe not the next level, because other people have done this, but I thought this would be a fun exercise to try. And I was surprised how many people actually ended up watching the stream. I mean, it wasn’t thousands, of course, but I think one of the videos actually got around a hundred views over time. Not very long views, but still.
Right! I thought they were fascinating, of course. I learned a lot from them. And there were moments during the live streams where I kept wanting to help you – and then not doing it, to be honest! Because I thought, “No, I want to see if our documentation is good enough so that Chris can figure it out, and if they can’t, then I’m going to go and change it.” So I did that several times, actually, after watching you struggle a little. I apologize for not helping you when I could have, just because it was so useful to see how our documentation could be improved.
Well, you’re off the hook.
Thank you. And how has that integration been? Do you find organizations that are not in ROR that people are entering?
We ended up implementing it in such a way that people can only enter organizations from ROR. We include the ROR ID in our database and not the raw strings, because otherwise we’d have this issue still, which is the exact one we’re trying to solve, of needing to disambiguate the organizations. So when people add an affiliation, they all use ROR. I have not heard any feedback that people can’t find their institution. Sometimes it might be under a slightly different name than they expect, especially if there’s a bilingual name or if it’s known under different names.
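To illustrate why storing the ROR ID rather than a raw string helps, here is a minimal Python sketch. The author records and the `group_by_organization` helper are hypothetical and are not taken from the ResearchEquals codebase. The sketch only shows that authors who share an ID fall into the same group without any string matching.

```python
from collections import defaultdict

# Hypothetical author records: each stores a ROR ID, never a free-text name.
authors = [
    {"name": "Author A", "ror_id": "https://ror.org/02mhbdp94"},
    {"name": "Author B", "ror_id": "https://ror.org/03yrm5c26"},
    {"name": "Author C", "ror_id": "https://ror.org/02mhbdp94"},
]


def group_by_organization(records):
    """Map each ROR ID to the names of the authors affiliated with it."""
    groups = defaultdict(list)
    for record in records:
        groups[record["ror_id"]].append(record["name"])
    return dict(groups)


by_org = group_by_organization(authors)
```

Had the records stored typed-in names instead, two spellings of the same institution would have ended up in separate groups.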
One limitation we have heard about, which comes from having only eight hours to implement, is that at this point we only allow one affiliation at any given time. People have said, you know, “I have multiple affiliations, I would like to add them,” so we’ll be adding that at some point. It’s always a matter of resources. The other thing that we’re going to be adding is dates for each affiliation so people can also keep track of their past affiliations in their profile. That way it’s easy to see how many people were at these organizations over time, so if you’re not interested in the person per se, you can track how many people are using ResearchEquals at the organizational level. Of course, that doesn’t go into metadata, but that’s something we want to offer to institutions.
Great. And are you using the ROR API or the ROR data dump?
For ingestion, we use the ROR API, simply to make it as up to date as possible. We also store organizational information whenever people add it in, so it’s more efficient. I think that’s been very helpful. Of course we try to keep the queries to a minimum, so it’s only when people are searching that ROR gets asked “Hey, does this exist?” And then we store that information in our own database, and we may periodically double-check whether everything is still up to date.
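The pattern described here (ask ROR only when someone searches, then keep the chosen record locally) might look roughly like the Python sketch below. It assumes the version 2 search endpoint of the public ROR API and a response with an `items` list whose entries carry an `id` field. The `OrganizationStore` class is hypothetical, and an in-memory dict stands in for a real database.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint: version 2 of the public ROR API.
ROR_SEARCH_URL = "https://api.ror.org/v2/organizations?query="


def search_ror(term):
    """Ask the ROR API for organizations matching a search term."""
    url = ROR_SEARCH_URL + urllib.parse.quote(term)
    with urllib.request.urlopen(url, timeout=10) as response:
        return json.load(response).get("items", [])


class OrganizationStore:
    """Hypothetical local store of organization records, keyed by ROR ID."""

    def __init__(self, search=search_ror):
        self._search = search  # injectable, so tests need no network
        self._records = {}

    def find(self, term):
        """Query ROR only when a user searches; nothing is stored yet."""
        return self._search(term)

    def add(self, record):
        """Keep the record the user picked so later reads stay local."""
        self._records[record["id"]] = record
        return record["id"]

    def get(self, ror_id):
        """Return the stored record, or None if it was never added."""
        return self._records.get(ror_id)
```

A periodic refresh job, as mentioned above, could reuse the same store and overwrite stale records. That part is left out of the sketch.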
And do you send those ROR IDs out to any other systems, like DOI registration agencies such as Crossref? Or do they just stay in your system?
Whenever I have an affiliation in my account, if I publish a module on ResearchEquals, it immediately gets added to my DOI metadata at that time. So if I switched my affiliation, job-wise, and I switched it in ResearchEquals, then at that point all the new DOIs would also have the new affiliation in there. All the old ones, of course, would have the old one.
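The behavior described here, where each DOI keeps the affiliation that was current at publication time, can be sketched in Python as follows. The `Author` and `Module` classes and the `publish` function are hypothetical simplifications, and the dict stands in for the metadata actually deposited with the DOI.

```python
from dataclasses import dataclass, field


@dataclass
class Author:
    name: str
    ror_id: str  # current affiliation, stored as a full ROR ID


@dataclass
class Module:
    title: str
    doi_metadata: dict = field(default_factory=dict)


def publish(author, title):
    """Copy the author's affiliation at this moment into the DOI metadata."""
    return Module(title, {"author": author.name, "affiliation": author.ror_id})


author = Author("Author A", "https://ror.org/02mhbdp94")
first = publish(author, "Module 1")

author.ror_id = "https://ror.org/03yrm5c26"  # the author changes jobs
second = publish(author, "Module 2")
```

Because the affiliation is copied rather than looked up later, `first` keeps the old ROR ID and `second` carries the new one.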
What we don’t yet do is do the backlog. If you’ve authored with ResearchEquals before, pre-ROR integration, you don’t have this info yet. If somebody is reading this and they want that updated, then we can always regenerate the metadata for the DOI.
Gotcha. What challenges did you run into as you were integrating ROR into your system in four two-hour blocks?
I think the main thing was simply learning some of the quirks around how to interface with ROR directly. I think that’s true for any API integration. I write my documentation differently from the way somebody else does, so you have to sort of learn the language a bit to know where to find things. The main thing that was confusing or was a difficulty was this idea that the URL is the ID. I thought it was going to be just the unique string part.
That was a bit of a mindset shift. But beyond that, I think it was fairly straightforward. And most of the time, I was actually busy trying to fit all the design elements in the right place and make sure that when people click here and there that it works.
The interface design is a big issue, because there’s lots of different ways to implement ROR in terms of UI.
Yeah. And actually I think after the first two sessions, so after four hours, the integration was pretty much complete. And the remaining four hours were spent simply, you know, moving stuff one pixel to the right to make sure that it displays properly.
I had actually forgotten this until you mentioned it, but one of the things that I changed after watching your live stream was the part of our documentation about the preferred form of the ROR ID. There’s actually a lot of discussion on the ROR GitHub about why our preferred form is the entire URL, and one of the changes I made was to add that reasoning into the documentation. Essentially, our sense is that when the URL is used as the ID in metadata, it’s more likely to continue to resolve. And that’s the key thing that you want from a persistent identifier.
I also put in the documentation that while we think the entire URL is the best form for the ROR identifier, the ROR API does also recognize just the unique string. A ROR ID is https://ror.org plus nine characters, and if you just use that nine-character string, that will work in the API.
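For integrators who want to accept either form but always store the full URL, a small normalizing helper is one option. The Python sketch below is an illustration only: the `to_full_ror_id` name is hypothetical, and the character pattern (a leading 0, six letters or digits, two trailing digits) is an assumption about the ID format. It does not verify the checksum.

```python
import re

ROR_PREFIX = "https://ror.org/"
# Assumed pattern: a leading 0, six base32-style characters, two check digits.
ROR_SUFFIX = re.compile(r"0[a-hj-km-np-tv-z0-9]{6}[0-9]{2}")


def to_full_ror_id(value):
    """Return the full-URL form of a ROR ID given the URL or the bare string."""
    candidate = value.strip().lower()
    if candidate.startswith(ROR_PREFIX):
        candidate = candidate[len(ROR_PREFIX):]
    if not ROR_SUFFIX.fullmatch(candidate):
        raise ValueError(f"not a ROR ID: {value!r}")
    return ROR_PREFIX + candidate
```

With this helper, `to_full_ror_id("02mhbdp94")` and `to_full_ror_id("https://ror.org/02mhbdp94")` both yield the full URL, and anything that does not match the assumed pattern raises a `ValueError`.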
I mean, I think it makes sense, because then ultimately, in the code, that means whenever we want to link out, we don’t need to add all of that. Yeah, I think in that sense, it’s very worthwhile.
In terms of difficulties, I think for me, it’s also always that there’s just so many systems to integrate with, and they all have their own quirks. If you have a big team, you could just say “You’re the Integration Specialist for this, and you for this.” But we’re a very small team, and actually I’m the one responsible for these integrations, so I need to remember the quirks of ROR, remember the quirks of ORCID, remember the quirks of Crossref and whatever other integrations we have. That’s always a challenge to keep space for all that in my head.
But, you know, if the documentation is clear, that improvement is also going to be helpful to me in the future, because I Google everything. Somebody once told me that the documentation is a reflection of the state of the community. Because if feedback doesn’t get taken up into the documentation, then that’s an indication that they’re not really listening.
Oh, you’re so right. I think I got up at 5am or something to watch your livestream, and the minute it ended, I thought, “Okay, this documentation is changing.” I absolutely believe that, that there really needs to be a virtuous circle of getting feedback from the community and immediately putting that into the documentation. Whenever we get a support request, I think, “If one person has this question, and they’ve overcome all the barriers to finding out how to ask it, have gone to the trouble of starting an email or posting in the Slack or whatever, then clearly other people have that question, too.” Any time one person has an issue, it’s almost certainly the case that there are fifteen other people who have that issue who just left the page and never communicated with you about it. So yes, I completely agree about that. And again, thank you so much for all of the free user testing you gave us.
My pleasure. I was doing it anyway. So yes, that’s good. Another benefit of live streaming.
Yep. So what do you hope ROR does in the future? Do you have any feature requests or suggestions for future direction?
Well, I think for me, one of the things that I find very undervalued and that I’ve also come to enjoy a lot is maintenance. So in that sense, I don’t have any shiny new feature requests except for stability and reliability. I think that very often when something is maintained, a lot of people feel like it stagnates if there is not anything new. But I very much like the idea of ROR being a very reliable, stable service where I don’t have to worry about the integration breaking anytime soon, or if it were to break, that there’s sufficient time to handle it. So in that way, my biggest feature request is stability.
A nice thing about ROR is that it is living, breathing – with the continual releases, there’s new organizations being added, making it more worthwhile without us doing anything. As the data get richer and the usage gets richer, I think that’s really where the added value is. I’ve seen other infrastructure providers try to create too many projects, which sometimes is a disservice to the core infrastructure. So I think that would be the primary thing. If anything, I guess the ease with which people can submit for community curation or participate in assessing requests is something to continuously evaluate.
Yep. Good feedback. I think one of the issues is that we’re specifically looking for people from Africa and Asia at the moment. I think if we recruited only from the US and Europe, we would get lots and lots of curators, but so much of what we need from our community curators right now is regional expertise. We need to recruit from Africa and Asia so that we can understand the organizations in that area. But yes, I agree, maybe we should work on clearer guidelines about that.
And if there’s anything we can do, if we can add a link out to say, “Hey, are you missing an institution? Go here to add it.” Or, you know, if people are interested in curating, so that they can see how to do that, I’m always happy to discuss that. But I can imagine that having not only European and North American universities, departments, etcetera, included is going to be really what’s going to be setting ROR apart.
What else would you like to say about ROR? Or about ResearchEquals? Or about Liberate Science?
I think my main thing is I want to have a big festival when you turn five, because it’s a momentous time. You turn five first, and then Liberate Science turns five later next year. So that’s gonna be fun. And to whoever is reading this, if you’re interested in modular publishing, check out ResearchEquals. If it’s too intimidating, we also do cohort trainings where we get a group of people together to just learn over several weeks, in predetermined hour-long slots, where people can also then, you know, take the space and just say, “Okay, that’s when I’m going to learn this.” And never outside of it.
“Time-boxing” is the phrase that I’ve heard for that. But for me, time-boxing only works when there’s an external constraint.
That’s the nice thing with live streaming.
Exactly! But that also takes a lot of courage. So maybe some time I will emulate your courage in doing that. Anyway, thank you so much for speaking with us!
Enjoy the rest of your day, and then have a nice rest of your week.