The scholarly community depends on a network of open identifier and metadata infrastructure. Content identifiers and contributor identifiers are foundational components of this network. But an additional component has long been missing from this picture: open, stakeholder-governed infrastructure for research organization identifiers and their associated metadata.
ROR launched in January 2019 with the specific aim of filling this gap. Our work is the culmination of several years of planning and collaboration across multiple organizations to develop a shared vision for an open registry of research organization identifiers.
ROR is now fully operational as a community-driven registry of open, sustainable, usable, and unique identifiers for every research organization in the world. ROR is a cross-organizational and multi-stakeholder initiative run by a small steering group in collaboration with a broad network of community advisors and supporters. The governing organizations are California Digital Library, Crossref, and DataCite. During ROR’s startup phase, these organizations have worked closely with additional Steering Committee members from around the globe, including Digital Science, who were instrumental in kickstarting the registry with seed data from their GRID database.
Reflecting and reinforcing the core principles of openness, transparency, and sustainability for research infrastructure has been a central aim for ROR since the very beginning. As we approach the two-year anniversary of ROR’s official launch, and as adoption of ROR and of open affiliation identifiers grows across the scholarly landscape, this is a key moment to evaluate how well ROR has remained aligned with these principles and where it might need to focus and improve in the years ahead. The Principles of Open Scholarly Infrastructure (POSI) provide a useful framework for this evaluation.
POSI’s set of sixteen principles was proposed in 2015 with the aim of encouraging the scholarly community to more critically assess the infrastructure they rely on and hold each other to account. POSI has since been built upon and discussed by others in the community; last month, the principles were codified and now live on an independent website: https://openscholarlyinfrastructure.org.
In addition to codifying the principles, the website also aims to guide and encourage infrastructure organizations and initiatives to make a commitment to POSI. In recent weeks, both Crossref and Dryad have responded to this by posting self-assessments and a public statement of intent to comply with POSI. ROR is following their lead in preparing its own evaluation, which appears below.
The POSI framework focuses on three key areas for open infrastructure organizations and initiatives to garner the trust of the broader scholarly community: accountability (governance), funding (sustainability), and protection of community interests (insurance). As part of the self-assessment process, activities in these three areas are categorized as green (meets goal), yellow (partially meets goal, and/or effort to meet goal is in process), and red (does not meet goal) depending on how closely they align with POSI. To understand what this means in the context of ROR’s evaluation, it is important to bear a few considerations in mind:
- If an item is marked as green, this doesn’t mean ROR does this perfectly. It simply means that we have internal processes that focus on this commitment and we have evidence that these processes have been working thus far. It also does not mean that ROR’s work in this area is done.
- The principles must be considered as a whole; no one principle is more important than another. It would be counterproductive to focus specifically on reaching one item’s “green” status if doing so would have a detrimental impact on another commitment.
- Adherence to the principles is an ongoing process. Since ROR has taken the innovative approach of offering open infrastructure through a multi-organization collaboration instead of a standalone organization, some of these principles are less applicable than others. Furthermore, ROR is newly implemented infrastructure and it will take time to meet all of the principles in an ideal way. This is to be expected.
Coverage across the research enterprise – it is increasingly clear that research transcends disciplines, geography, institutions and stakeholders. The infrastructure that supports it needs to do the same.
As a global registry, ROR includes identifiers and metadata for close to 99,000 organizations in 217 countries. ROR covers many types of research organizations, including universities, government bodies, research facilities, medical facilities, funders, publishers, and research-producing companies. In this way, ROR IDs can be used to identify affiliations for many types of research activities.
ROR stakeholders also reflect a diversity of institutions and geographies. The organizations in ROR’s Steering Group represent five different countries (United States, United Kingdom, Germany, South Africa, and Japan) and a range of organization types, including academic library, infrastructure provider, research policy, and government funder. The ROR Community Advisory Group includes members from 16 countries and a similarly heterogeneous mix of organization types, including libraries, research institutes, infrastructure providers, funders, and government agencies.
Stakeholder Governed – a board-governed organisation drawn from the stakeholder community builds more confidence that the organisation will take decisions driven by community consensus and consideration of different interests.
As a multi-organization collaboration, ROR is by design not a standalone organization and consequently has no official mechanism for board governance. However, the ROR Steering Group, which advises on the strategic direction of ROR, is drawn from the wider stakeholder community. This group will continue to evolve as adoption of ROR grows.
Non-discriminatory membership – we see the best option as an “opt-in” approach with a principle of non-discrimination where any stakeholder group may express an interest and should be welcome. The process of representation in day to day governance must also be inclusive with governance that reflects the demographics of the membership.
As an open, community-based initiative, ROR has no membership model by design but rather operates on the basis of self-selected participation. Anyone is welcome to join the ROR Community Advisory Group and take part in meetings and discussions about ROR’s progress and future directions. Participation is not determined by an individual or organization’s location, resources, industry, or beliefs.
Transparent operations – achieving trust in the selection of representatives to governance groups will be best achieved through transparent processes and operations in general (within the constraints of privacy laws).
ROR’s day-to-day operations are overseen by its governing organizations: California Digital Library, Crossref, and DataCite. These organizations have signed a memorandum of agreement that states their long-term resource commitments to ROR. This agreement has been discussed and approved by the leadership at each governing organization. As ROR approaches the start of its third year, we will continue to develop a framework to reflect the unique nature of ROR as a collaborative initiative, not an organization.
This is therefore an area for improvement. As ROR solidifies its operations, it plans to formalize and publicize operational documentation, such as financial and governance documents.
Cannot lobby – the community, not infrastructure organisations, should collectively drive regulatory change. An infrastructure organisation’s role is to provide a base for others to work on and should depend on its community to support the creation of a legislative environment that affects it.
ROR began as a collaborative effort across multiple organizations in the scholarly research and research infrastructure community and continues to be run as a multi-stakeholder collaboration. ROR’s community advisory groups meet regularly and provide a central focus and channel for ROR’s strategic directions. As a registry of open identifiers for affiliations, ROR provides infrastructure that can be implemented and iterated upon in a variety of settings. ROR does not lobby, nor does it include regulatory change as part of its remit. ROR’s sole purpose is to offer data and tools to help its community of users solve problems and achieve their goals.
Living will – a powerful way to create trust is to publicly describe a plan addressing the condition under which an organisation would be wound down, how this would happen, and how any ongoing assets could be archived and preserved when passed to a successor organisation. Any such organisation would need to honour this same set of principles.
A signed agreement between ROR’s governing organizations describes how ROR would be shut down and how responsibilities for coordinating ROR could be transferred to successors, if applicable. This plan includes how the ROR team would handle remaining assets to ensure long-term preservation, access, and stewardship of the registry. These assets would include code repositories, public data dumps, cash funds, and technical infrastructure. ROR data will always be open and available under a CC0 waiver. Transfer of responsibilities for management of the registry will not impact the openness and availability of ROR data.
Formal incentives to fulfil mission & wind-down – infrastructures exist for a specific purpose and that purpose can be radically simplified or even rendered unnecessary by technological or social change. If it is possible the organisation (and staff) should have direct incentives to deliver on the mission and wind down.
As a relatively small and focused effort, ROR is already lean by design. By deliberately operating as an initiative and not forming an organization, ROR is more nimble and won’t face a scenario in which an organization has to be dissolved. ROR identifiers will continue to be needed for as long as affiliation data is a core aspect of research infrastructure. The mission of ROR will only go away when this need goes away.
Time-limited funds are used only for time-limited activities – day to day operations should be supported by day to day sustainable revenue sources. Grant dependency for funding operations makes them fragile and more easily distracted from building core infrastructure.
ROR is designed to be operated and sustained through a mix of funding sources. These include:
- In-kind contributions by ROR’s governing organizations
- Contributory investments by community stakeholders
- Grant funding for discrete project work
- Optional paid service fees for organizations that require additional technical support (any such fee would be based on services and not on access to ROR data, as described in more detail below)
ROR has a planned dependency on community fundraising and grants for two years, through the end of 2022. We plan to use this window to shore up plans and technical infrastructure to support an optional service fee model geared towards power users of the registry.
Goal to generate surplus – organisations which define sustainability based merely on recovering costs are brittle and stagnant. It is not enough to merely survive, it has to be able to adapt and change. To weather economic, social and technological volatility, they need financial resources beyond immediate operating costs.
ROR is still in its startup phase and prides itself on running operations that are as lean as possible. During the startup phase, ROR’s governing organizations contribute in-kind support to maintain the resources needed to keep ROR running. Since its inception, the banking and funds management of the ROR project have remained in sequestered financial reporting structures. The ROR budget is managed by the leadership of each governing organization and all accounts are easily auditable.
ROR understands that to maintain community trust, we must innovate and iterate over time. We also understand that innovation requires investment of additional resources. With financial reporting structures in place, ROR is well positioned to build a firm financial footing. ROR has explored and will continue to explore ways of building a surplus to maintain stability and ensure continuous innovation.
Goal to create contingency fund to support operations for 12 months – a high priority should be generating a contingency fund that can support a complete, orderly wind down (12 months in most cases). This fund should be separate from those allocated to covering operating risk and investment in development.
ROR’s aim is to provide a centralized, curated registry of research organizations that is updated regularly and made openly available to the public. Because public data dumps are released on a regular basis, the community will still be able to make use of the data if and when the ROR team is no longer able to maintain its current workflow. Additional infrastructure overlaid on top of the data is handled by ROR’s governing organizations, which are bound by a memorandum of agreement that commits them to contributing in-kind resources to ROR. This agreement also outlines how winding down and/or handing off would be coordinated within a specified timeframe of 120 days. ROR will look into securing this contingency further through additional contractual obligations.
Mission-consistent revenue generation – potential revenue sources should be considered for consistency with the organisational mission and not run counter to the aims of the organisation. For instance…
ROR currently plans to implement an optional paid service tier in 2022 geared towards power users and/or those that require unique technical support. This will be developed with the commitment that the core ROR dataset will remain freely and openly available.
Ensuring the full availability of the ROR dataset over the long term is a core value. Any revenue sources that ROR explores in the future will not interfere with this aim.
Revenue based on services, not data – data related to the running of the research enterprise should be a community property. Appropriate revenue sources might include value-added services, consulting, API Service Level Agreements or membership fees.
The fees associated with any future paid ROR tier will be focused on optional value-added services, such as a separate API Service Level Agreement or custom data dumps.
ROR data will remain open and free in the spirit of making it available as a community property.
Open source – All software required to run the infrastructure should be available under an open source license. This does not include other software that may be involved with running the organisation.
ROR is a community project as well as a software project. All of ROR’s code and software processes are openly managed on GitHub, and the development roadmap and prioritization are discussed in regular communications and community calls. Our goal is to maintain openness throughout the technical product management process. As part of this, all code is published openly on GitHub under a permissive MIT License. Whenever possible we leverage open source components, and we work hard to ensure that our documentation allows other projects to leverage our CC0 data file, open API, and other tools.
Open data (within constraints of privacy laws) – For an infrastructure to be forked it will be necessary to replicate all relevant data. The CC0 waiver is best practice in making data legally available. Privacy and data protection laws will limit the extent to which this is possible.
At the most fundamental level, ROR’s core purpose is to deliver a fully open, CC0 registry of identifiers and metadata for research organizations. We maintain the data file on GitHub and in a public data dump with a citable DOI.
Available data (within constraints of privacy laws) – It is not enough that the data be made “open” if there is not a practical way to actually obtain it. Underlying data should be made easily available via periodic data dumps.
As stated above, ROR code is freely and openly available on GitHub under the terms of an MIT License. The ROR dataset can be accessed on GitHub as well as via a public data dump that ROR releases upon every update to the registry (at present, these updates occur approximately every three to four months). The ROR API is publicly and openly available at https://api.ror.org/organizations.
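To illustrate what "available" can look like in practice, here is a minimal sketch of querying the public API endpoint named above, using only the Python standard library. The function names are hypothetical, and the free-text `query` parameter is an assumption about the API rather than something this post specifies; the sketch deliberately returns the decoded JSON as-is and makes no claims about the response fields. Check the ROR API documentation before relying on it.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint as given in the text above.
ROR_API_URL = "https://api.ror.org/organizations"


def build_search_url(name, base_url=ROR_API_URL):
    """Build a search URL for an organization name.

    Assumption: the API accepts a free-text `query` parameter.
    """
    return base_url + "?" + urlencode({"query": name})


def search_organizations(name, timeout=10):
    """Fetch and decode the JSON response for a name search.

    Requires network access. The decoded JSON is returned unchanged,
    since the response schema is not described in this post.
    """
    with urlopen(build_search_url(name), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

For example, `search_organizations("California Digital Library")` would request the URL produced by `build_search_url` and return whatever JSON the API sends back. For bulk use, the public data dump is likely a better fit than repeated API calls.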
Patent non-assertion – The organisation should commit to a patent non-assertion covenant. The organisation may obtain patents to protect its own operations, but not use them to prevent the community from replicating the infrastructure.
We see ourselves as a fully open and public registry of factual information about research organizations. The metadata and curatorial value that we contribute to the ROR registry are not considered the intellectual property of any ROR-related entity. As facts, the information stored in the registry cannot, by its nature, be patented. In addition, ROR asserts no ownership of the information, making it available to the public under a CC0 1.0 public domain dedication.