Conference Agenda

Overview and details of the sessions of this conference.

Please note that all times are shown in the time zone of the conference (UTC).

Session Overview
Date: Monday, 07/June/2021
8:00am - 9:20am Workshop: Lights and shadows in integrating and supporting ORCID in repository platforms

Alicia Fatima Gomez1, Magdalena Andrae1, Paloma Marin-Arraiza2

1TU Wien, Austria; 2ORCID, Spain

Persistent identifiers are crucial for a metadata-driven approach to repository design. ORCID disambiguates researchers and contributors, and connects them with their research activities. This includes employment affiliations, research outputs, funding, peer review activities, research resources, society membership, distinctions and other scholarly infrastructure. Repository system integrations with ORCID add visibility to repository content and its authors, facilitate collaboration and networking, and help organizations with institutional reporting systems and national assessment programs.

The aim of this interactive workshop is to present not only the benefits, but also the challenges and difficulties of integrating ORCID within institutional repositories.

In order to achieve this, this workshop is organised into 3 parts: It starts with two short presentations about the essentials of ORCID integrations, supplemented/illustrated by some examples. Next the group will split into different breakout groups to allow participants to discuss concrete questions, so that they can brainstorm and contribute with their experiences and insights. Finally, ideas will be shared and the opportunities and limitations of integrating ORCID in repositories will be summarized.

8:00am - 10:50am Workshop: Brokerage Event Towards a FAIR Compliant Commons in the ASREN Region

Yousef Torman1, Abdullahi Behi Hussein2, Ahmed Siyad2, Alwaleed K. Alkhaja3, Anwaar Al Kandari4, Behailu Korma5, Daryl M. Grenz6, Helena Cousijn7, Hicham Boutracheh8, Margareth Gfrerer5, Mohamed Ali Ahmed2, Raed Al-Zoubi1, Rawia F. Radi9, Roberto Barbera10, Saida Messaoudi11, Yassar Hanna Braik12

1ASREN Arab States Research and Education, Jordan, Hashemite Kingdom of; 2Somali Research and Education Network - SomaliREN; 3Qatar National Library - QNL; 4Kuwait Technical College - KTC; 5Higher Education Strategy Center; 6King Abdullah University for Science and Technology - KAUST; 7Datacite; 8Moroccan Institute of Scientific and Technical Information - IMIST; 9Islamic University of Gaza (IUG); 10University of Catania; 11Institut National des Sciences et Technologies de la Mer - INSTM; 12Syrian Virtual University - SVU

The proposed event is framed within the landscape of actions carried out in Africa to promote FAIR principles and Open Science commons, such as digital repositories and persistent identifiers both for researchers and their research outputs. We aim to bring together in the same virtual place solution seekers, solution providers, policy makers and end users to trigger new collaborations through an original and innovative “try before you buy” approach.

The event is proposed to last 3 hours and is divided into three parts: a workshop, a brokerage session, and a hands-on tutorial.

Although the proposed event mainly targets the ASREN region, it already includes contributions from Europe as well as from other parts of Africa and, if the proposal is accepted, the proponents will act to involve as many other countries as possible through their involvement in AfricaConnect3, LIBSENSE and other related initiatives.

Workshop page with the agenda and information for the Hands-on tutorial:

9:30am - 10:50am Workshop: How repositories can increase their FAIR share

Marjan Grootveld1, Ilona von Stein1, Linas Čepinskas1, Patricia Herterich2, Joy Davidson2

1DANS - Data Archiving and Networked Services, The Netherlands; 2Digital Curation Centre

To better support wider sharing and reuse of research data, many organisations and research groups are developing strategies to foster a FAIR data culture, i.e., one where data are findable, accessible, interoperable and reusable. Repositories play a central role in enabling FAIR data practice and open scholarship. During last year’s Open Repositories conference, the European Commission-funded FAIRsFAIR project shared a draft transition programme intended to support repositories on their journeys to become more FAIR-enabling by developing guidance and sharing examples of good practice in relation to supporting the production and use of FAIR data. In 2021’s follow-up workshop we share lessons learned in a closer collaboration with ten repositories and introduce two online tools for FAIR data assessment. This is the starting point for a discussion about the practical support that repositories may need on their FAIR journey. At the end of this 90-minute workshop for repository managers, research data librarians and data stewards, attendees will have an overview of the FAIRsFAIR materials available to support them in preparing for repository certification as well as the tools for FAIR data assessment provided by the project.

1:00pm - 2:20pm Workshop: Introduction to Fedora 6.0

David Wilcox


Fedora 6.0, the next major version of the software, is quickly approaching a production release. This workshop will provide an overview of the software and basic concepts, examples of deployments, and an overview and demonstration of the core features, with a particular focus on new features in version 6.0. We will also discuss the product roadmap and ways to get involved with the Fedora community.

This is a technical workshop pitched at an introductory level, so no prior Fedora experience is required. General knowledge of the role and functionality of repositories would be beneficial. Attendees who wish to participate in the optional hands-on sections will need to access an online sandbox via a URL that will be provided ahead of the workshop.

1:00pm - 2:20pm Workshop: Making repositories part of your digital strategy: experience from the Samvera Community

Chris Awre1, Karen Cariani2, Nabeela Jaffer3, Alicia Morris4

1University of Hull, United Kingdom; 2GBH; 3University of Michigan; 4Tufts University

The aim of this workshop is to provide a space where senior staff involved in strategic planning within their organisations can explore how digital repositories can contribute a key component in the implementation of a digital strategy. Furthermore, the workshop will look at how engagement with, and participation in, an external open-source community can be a part of that contribution. The workshop will use experience from Partners and Adopters within the Samvera Community to provide use cases relevant to these themes. Attendees are encouraged to make use of this workshop to clarify their own needs, discover how Partners and Adopters of the Samvera Community have addressed these needs, and explore how they have aligned their own repository developments with local strategic planning. Alongside delivery of content within the workshop there will be a focus on discussion and Q&A to help identify how to address strategic needs through sharing of experiences.

1:00pm - 3:50pm Workshop: Getting Started with DSpace 7.0: Basic Training

Tim Donohue1, Art Lowel2, Andrea Bollini3

1LYRASIS, United States of America; 2Atmire, Belgium; 34Science, Italy

DSpace 7.0 is a major step in the evolution of the DSpace platform and repositories in general. While retaining its ease-of-use, out-of-the-box goals, DSpace 7.0 features a brand new, client-side, responsive user interface (built on Angular), a full-featured, self-describing REST API, and a powerful new configurable object model (featuring entities with relationships).

This 3-hour workshop will provide basic training on DSpace 7.0, due to be released in Q1 or Q2 of 2021. Attendees will learn about the installation / upgrade process and new configuration options. All new features will be discussed or demonstrated (including the new public, submitter and administrative user interfaces and new REST API). The workshop will conclude with step-by-step examples of branding/theming the new user interface (which could be followed for a “hands on” tutorial).

The workshop will wrap up with a brief overview of new features to look forward to in 7.1, 7.2, etc.

2:30pm - 3:50pm Workshop: Europeana Data Model (EDM) Workflows in Archipelago

Diego A Pino Navarro, Allison K Lund

Metropolitan New York Library Council, United States of America

During this workshop, we will introduce the audience to Archipelago’s general architecture and unique, flexible approach to metadata and media. We will show live demonstrations of Archipelago instances where data and metadata values are being used/reused in a variety of commonly used standards and schemas (including IIIF V3, Dublin Core, and MODS). Using an Archipelago instance prepared with an EDM structured metadata webform and display template, and populated with freely available Europeana-sourced (consented and attributed) example digital objects and collections, we will demonstrate adding new content to the repository using the supplied webform and viewing/reviewing the metadata and media for the newly created digital object(s). We will then demonstrate adding an additional property or set of properties from an EDM core class to the webform and metadata display template. Then we will amend/edit the metadata record for the same digital object with a value(s) for the newly added property(ies), and review the updated digital object’s metadata record and display page. We will then invite audience participants to interact within the same Archipelago instance to complete the same two exercises demonstrated. We will conclude with a brief discussion/question and answer session related to the Archipelago demonstration and EDM data modeling exercises.

2:30pm - 3:50pm Workshop: Machine-Actionable DMPs in the Repository: Motivation and Implementation

Taylor Mudd1, Richard Forrest2

1Haplo, United Kingdom; 2Cayuse

Data Management Plans (DMPs) are increasingly required to be submitted alongside ethical approval and funding applications for research projects, requiring researchers to consider what data will be produced as part of their research and how it will be managed and stored in future. DMPs are typically managed in separate, siloed systems and stored in unstructured formats (e.g. PDFs). Recording a DMP in a structured “machine-actionable” format enables systems to report and act on the planning information held within these documents.

In this session we will examine how DMPs can be structured in a machine-actionable way, and explore some of the benefits of managing them in the repository alongside dataset deposit and management features. This will include discussion of the challenges in adapting institutional and funder forms to map to a machine-actionable data model, and the use of the RDA Common Standard for machine-actionable DMPs. The implementation of machine-actionable DMPs in Haplo Repository will be used as a case study.

We will look at “quick wins” for attendees looking to implement maDMPs in their institutions, as well as consider some of the possible longer term benefits this approach enables.
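As a minimal sketch of what “machine-actionable” can mean in practice, the Python snippet below stores a DMP as structured data and queries it, something that is not possible with a PDF. The field names (`dmp`, `dataset`, `personal_data`) are loosely modelled on the RDA Common Standard mentioned in the abstract; the record itself, its values and the helper `datasets_with_personal_data` are hypothetical illustrations and do not describe the Haplo implementation.

```python
import json

# Hypothetical DMP record. Field names loosely follow the RDA DMP
# Common Standard; all titles and values are invented for illustration.
dmp_record = {
    "dmp": {
        "title": "Example project DMP",
        "dataset": [
            {"title": "Interview transcripts", "personal_data": "yes"},
            {"title": "Survey results", "personal_data": "no"},
        ],
    }
}


def datasets_with_personal_data(record):
    """Return the titles of datasets flagged as containing personal data."""
    return [
        dataset["title"]
        for dataset in record["dmp"]["dataset"]
        if dataset.get("personal_data") == "yes"
    ]


# A repository could run a check like this at deposit time, for example
# to route flagged datasets to a restricted-access workflow.
flagged = datasets_with_personal_data(dmp_record)
print(json.dumps(flagged))
```

Because the plan is data rather than prose, the same record could in principle feed institutional reporting or pre-fill dataset deposit forms.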

Date: Tuesday, 08/June/2021
1:00pm - 1:55pm Welcome & opening keynote: Sir Jeremy Farrar
2:00pm - 2:45pm Ideas challenge introduction
The Ideas Challenge is a long-running Open Repositories tradition, where teams form during the conference to come up with great ideas to solve repository challenges or lead the way toward new innovations. In the past, teams have had just the 3 days of the conference to formulate their idea and pitch it to the audience and judges during the closing session. Last year, we changed the format and kicked off a year-long ideas challenge, with the goal of allowing teams not just to come up with an idea, but to actually take steps to implement it. We'll share some results from 2020 during this session and introduce the 2021 ideas challenge, which will be another year-long event. Those interested in the Ideas Challenge should join the Ideas Challenge Slack channel.
3:00pm - 3:55pm Panel: Speaking up and speaking out - who will shape the narrative for OA repositories?

Kathleen Shearer1, Eloy Rodrigues1,2, Johan Rooryck3,4, Dominique Bibini1

1COAR, International; 2University of Minho; 3Leiden University; 4cOAlition S

In the last several months, there have been a number of misleading statements about open access repositories made by several publisher groups. These statements and blog posts seem to be part of a coordinated strategy to diminish the role of OA repositories in favour of the gold (and, for many publishers, APC) route for policy compliance. The aim is to create a narrative that gold open access is the only “legitimate” route for OA, and to inculcate these misconceptions within the research community. COAR contends that the repository community should follow these events closely and work together with other stakeholder communities (researchers, universities and funders) to define and elevate our own narrative, which associates repositories with the values of inclusion, diversity, trust, and innovation. This interactive panel will bring together panelists from the university, funder, and repository communities to discuss how we can work together to counteract this publisher narrative and be more proactive in defining the role and value of repositories in the future of scholarly communications and open science.

4:00pm - 4:55pm Presentations session 1

The Carolina Digital Repository During COVID-19: Responding to a Pandemic

Rebekah Kati

University of North Carolina at Chapel Hill, United States of America

The COVID-19 pandemic has intensified conversations about widespread open access to scholarly literature. Institutional repositories such as the Carolina Digital Repository (CDR) at the University of North Carolina at Chapel Hill (UNC-CH) can play a key part in the dissemination of scholarly research. Access to coronavirus research was particularly important for the CDR, as UNC-CH is one of the leading institutions in the world for coronavirus research. The pandemic also had local impacts on our work. In March 2020, UNC-CH moved to online learning in response to rising coronavirus cases. Due to this change, many UNC-CH Libraries staff members and student workers needed work that could be done remotely. In this presentation, I will describe several projects which we undertook this year to increase access to and dissemination of research, and to provide UNC-CH Libraries staff with work which could be completed remotely and asynchronously. For each project, I will share the team’s successes, failures, lessons learned and plans for the future. I will also describe enhancements the team made to our Hyrax-based repository to facilitate the success of these projects.

CTDA in Context

Greg Colati1, Michael Kemezis1, Kayla Hinkson-Grant2

1University of Connecticut, United States of America; 2Mt. Holyoke College

CTDA in Context is a program designed to foster diversity, equity, and inclusion in the preserved historical record of Connecticut through the collections of the Connecticut Digital Archive.

There is a growing misconception that if a community or population cannot be found in digital form then collections about that community must not exist. Researchers, especially lifelong learners and K12 students, assume that all relevant information exists online.

Since the CTDA does not own any collections itself and can only offer content contributed by its member institutions, we are constrained by the collecting policies and interests of our membership. CTDA in Context addresses this issue through three interrelated objectives that use the CTDA’s position as a statewide organization to affect the content of the collections we steward and the diversity of the digital historical record in Connecticut:

Diversify the CTDA community to include memory organizations dedicated to documenting the activities of currently underrepresented groups and topics;

Educate current content managers about new descriptive and collection management practices;

Help members identify new collections, previously digitized, born digital, and/or physical, among their local communities to add to the CTDA to enrich the overall content in the repository.

Connected in Science: How arXiv facilitates global interactions during the pandemic and beyond

Eleonora Presani

arXiv, United States of America

In early 2020, the mutual impact of COVID-19 and OA repositories like arXiv on each other was unknown. Soon, we realized that the pace of submissions to arXiv was not just holding steady; it was increasing. This presented multiple challenges. First, a wide range of readers would now be seeking COVID papers on arXiv. Second, acceptance of papers to arXiv relies on a network of 190 moderators, and arXiv did not have moderators with expertise in coronaviruses. Third, the pandemic’s uncertainty prompted arXiv to consider how to maintain services if a high proportion of moderators or staff were unable to work. Fourth, although arXiv moderators had always worked remotely across the globe, the moderation tools revealed serious limitations. In this presentation, arXiv’s executive director, Eleonora Presani, will share the solutions, technical and otherwise, to these challenges, including the development of a new interface for moderator engagement. The ways in which arXiv met these challenges provide a model for moving forward beyond the pandemic.

4:00pm - 4:55pm Presentations session 2

Fedora Software and Community Update: All Aboard for Fedora 6.0

David Wilcox, Arran Griffith


For the past several years, the Fedora community has prioritized alignment with linked data best practices and modern web standards, which guided our previous development efforts. Through extensive community feedback and engagement, however, we have shifted our efforts in new directions. The design and development of Fedora 6.0 has been guided by three principles: improve the digital preservation feature set, support migrations from all previous versions of the software, and improve performance and scale. This presentation will provide an overview of Fedora 6.0 and an update on the current state of development, highlighting areas of interest such as the migration utilities and documentation created to support users in the transition from past versions to 6.0, as well as the robust testing measures implemented to give users direct access to crucial performance metrics. Finally, the presenters will explain the post-production focus on adoption, support and elevated community engagement.

DSpace 7.0: Coming to a DSpace near you

Tim Donohue

LYRASIS, United States of America

It’s been a long time coming and we’re glad to say 7.0 is finally here! This talk helps your organization prepare for the next evolution of the DSpace platform: DSpace 7.0. What does it look like? How would I upgrade (or install) it? How easy is it to brand/theme? Why did this take so long (and how did we adapt, despite a pandemic)?

While retaining its ease-of-use, out-of-the-box goals, DSpace 7 introduces many major new features to existing users and the repository community in general. This talk will provide an overview of each.

Major new features include:

* A brand new, client-side, responsive user interface (built on Angular), with drag-and-drop submission process and improved usability. Theme it with just knowledge of CSS (Bootstrap) and HTML.

* A brand new, full-featured, self-describing REST API, which opens your DSpace data to advanced third-party integrations / tools.

* A powerful new Configurable Entities object model (i.e. typed Items with relationships between them), allowing for advanced integrations with external identifier systems (e.g. ORCID), current research information systems (CRIS), journal publishing systems, etc.

* Enhanced GDPR support, OpenAIREv4 support, and (coming in 7.2) alignment with COAR Next Generation Repositories recommendations (ResourceSync and Signposting).

Islandora Community Update: Islandora at Home

Danny Lamb1, Mark Jordan2

1Islandora Foundation, Canada; 2Simon Fraser University

Since the Islandora Foundation was first announced at Open Repositories in 2013, it has been our privilege to share a yearly update about our project and our community. 2020 was a unique year for this kind of distributed open-source project, and the community-driven approach to development and maintenance of Islandora that has sustained the project for more than a decade had to change with the times as the entire Islandora community went completely virtual. We would like to share how our community adjusted workflows, communications, and some of the fundamental ways that we work together, to keep driving an open-source repository project through a global pandemic.

5:00pm - 6:00pm Networking session
Date: Wednesday, 09/June/2021
8:00am - 8:55am 24x7 session 1

Context is as important as content — the Bridge of Knowledge platform as a comprehensive ecosystem of research information

Piotr Krajewski, Aleksander Mroziński

Gdańsk University of Technology, Poland

In 2010, Tim Berners-Lee, the inventor of the World Wide Web and creator of the Semantic Web, proposed the Linked Open Data (LOD) concept. The 5-star LOD scheme is a set of principles that make data available to everyone and ready for reuse and distribution. This presentation aims to show how the platform Bridge of Knowledge, which was developed by Gdansk University of Technology (GUT), fulfils LOD requirements. The platform consists of several services, including an institutional repository of GUT publications, an open data repository (Bridge of Data), an inventions module and a projects module. In addition, researchers can create profiles that include information about their scientific output, achievements and research activities. Processed data are retrieved mainly from internal GUT services and organized in ways that support contextual navigation in the system. Each object is described using JSON-LD, and semantic relationships among objects foster easy navigation from one object to another, allowing Bridge of Knowledge users to discover interlinked information. This web of connections among modules makes searching more effective, not only for humans but also for machines.
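The idea of objects described in JSON-LD and linked to one another can be sketched in a few lines of Python. Everything in this example is an assumption made for illustration: the `example.org` identifiers, the choice of schema.org terms, and the helper `linked_ids` are hypothetical and are not taken from the Bridge of Knowledge platform.

```python
# Hypothetical JSON-LD descriptions of three linked repository objects.
# The identifiers and the schema.org vocabulary are illustrative only.
PUBLICATION = "https://example.org/publication/1"
PERSON = "https://example.org/person/1"
DATASET = "https://example.org/dataset/1"

objects = {
    PUBLICATION: {
        "@context": "https://schema.org",
        "@id": PUBLICATION,
        "@type": "ScholarlyArticle",
        "name": "An example article",
        "author": {"@id": PERSON},
        "isBasedOn": {"@id": DATASET},
    },
    PERSON: {"@context": "https://schema.org", "@id": PERSON, "@type": "Person"},
    DATASET: {"@context": "https://schema.org", "@id": DATASET, "@type": "Dataset"},
}


def linked_ids(obj):
    """Return the @id of every object that the given object points to."""
    links = []
    for key, value in obj.items():
        if key.startswith("@"):
            continue  # skip JSON-LD keywords such as @id and @type
        values = value if isinstance(value, list) else [value]
        for item in values:
            if isinstance(item, dict) and "@id" in item:
                links.append(item["@id"])
    return links


# Following the links from the publication leads to its author and dataset.
neighbours = linked_ids(objects[PUBLICATION])
print(neighbours)
```

A crawler or user interface can follow such references from one object to the next, which is the kind of contextual navigation the abstract describes.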

Helda Open Books - A repository-based service bringing sold-out monographs and textbooks back to life

Markku Roinila, Jussi Piipponen

Helsinki University Library, Finland

We present a 2020 project to publish sold-out textbooks and other monographs in the Helsinki university open repository Helda (DSpace). The project is related to a larger open monograph initiative, Helda Open Books, which is a collection of open monographs in the university repository where books are given persistent identifiers, their visibility is promoted and the book pages in the repository are provided with social media sharing links and altmetric tools.

The background of the project lies in the new acquisition policy of Helsinki University Library, which emphasizes open availability of teaching materials. The problem has been that there are not enough library copies for all course students and ebooks are not necessarily available. The project therefore strove both to offer new original open monographs for teaching and to open up printed but sold-out textbooks. We will concentrate on the latter objective and discuss challenges concerning rights acquisition, scanning, publishing and metadata. Finally, we reflect on the success of the project, consider the resources required for it and report some future challenges.

8 Terabytes of Music to Explore with DSpace-GLAM and IIIF: the Digital Library of the Milan Conservatory

Marta Crippa2, Claudio Cortese1, Emilia Groppo1, Andrea Bollini1, Riccardo Fazio1, Francesco Pio Scognamiglio1

14Science, Italy; 2Conservatorio "G. Verdi" di Milano, Italy

A flexible and extensible data model to enhance the relationships between digital objects, and a framework to share and compare images, including images from different repositories: these are the tools provided by the DSpace-GLAM open source platform that the Milan Conservatory is using to make its Digital Library a fundamental tool for music and musicological studies. The presentation will highlight how the Conservatory’s scholars took advantage of DSpace-GLAM features to build a “cultural galaxy”, comprising a vast set of different types of musical bibliographic resources that are linked and made explorable by means of their relationships, and how they used the International Image Interoperability Framework (IIIF) to share and thoroughly analyse all the components of this galaxy.

OpenDOAR’s repository assessment service (RAS)

Jennifer Sanchez-Davies


At Jisc we have been working on ways OpenDOAR can further extend its support to the repository community. It is well known that repositories are key facilitators of research impact and knowledge dissemination for all. Key to this are the repository infrastructure and standards, which are highly important in ensuring that repository content is discoverable and interoperable. With this in mind, providing tools to offer support and encourage good practice is a priority for OpenDOAR. As such, we have developed a new scalable and flexible infrastructure that can help repositories assess and seek guidance on international incentives and standards. This 24x7 presentation demonstrates the first iteration of our latest offering, the Repository Assessment Service (RAS).

The RAS enables repositories to self-validate their level of compliance with funder requirements. Repository managers can log in to the RAS to carry out their repository self-assessment by answering a series of questions. Users are presented with scores to check their current status and receive guidance on how to improve their score. In this presentation, we will take a tour through the RAS’s infrastructure and key features, and show how we envisage it being embedded into repositories’ operational workflows.

Sherpa Services – global collaboration for an enhanced and improved service

Karen Jackson

Jisc, United Kingdom

Jisc’s Sherpa Services team are working to enhance, increase and improve the coverage and accuracy of our data at an international level – and we are finding that the best way to ensure full and accurate records for any given place is to be working with the local experts. We are currently running a number of collaborations with groups around the world to input, edit and curate data relating to journals and publishers in their country – in this session we aim to give an overview of some of this work and show how it is helping to add to our services.

Publications Router: populating repositories automatically

Steve Byford1, Adam Rehin1, Andrea Bollini2, L. Andrea Pascarelli2, Susanna Mornati2

1Jisc, UK; 24Science, Italy

Institutions around the world have wrestled with how to be as efficient as possible in making their researchers’ articles openly available on their repositories. They use a variety of systems and workflows to try to achieve this.

For UK institutions, Jisc’s Publications Router service works with publishers to capture articles, match them to their authors’ institutions and deliver them directly into the relevant repositories. It is now interoperable with a wider range of systems that institutions use, including research information management systems and CRISs, as well as repositories.

Collaboration with 4Science, who have developed the relevant patches for the latest DSpace versions, means that Router can now deliver the full richness of RIOXX metadata fields (as well as full-text articles) into recent versions of DSpace. Parallel work includes compliance with the Research Excellence Framework (REF) in the UK, demonstrating the value of institutional repositories for research evaluation as well.

Although currently aimed at UK institutions, Jisc would like to enable Router to serve institutions in additional territories. We’ll look at some of the challenges and opportunities this will entail.

Compliance without complaints

Taylor Mudd

Haplo, United Kingdom

Historically, repository managers have found it a challenge to support researchers in complying with a myriad of conflicting funder policies. In this presentation, Haplo will describe how they redesigned their repository system to make it easier for researchers to comply and easier for repository managers to ensure compliance. Working with repository managers and researchers in several institutions in the UK, they redesigned the submission process and the interface researchers use to deposit. The overall effect was a process which made compliance easy, appealing and, crucially, achievable without any additional effort on the part of researchers.

8:00am - 8:55am Presentations session 3

Development of National-level Institutional Repository Cloud Service for Open Science

Masaharu Hayashi, Yutaka Hayashi, Makoto Asaoka, Masashi Kawai, Yasuyuki Minamiyama, Kazutsuna Yamaji

National Institute of Informatics, Japan

The purpose of this study was to build a national-level institutional repository cloud service with extensible repository functions and flexible system operation, in order to realize open access to research data as a starting point for open science. WEKO3, a new repository software built on the Invenio framework, was developed to achieve this purpose. In particular, a flexible metadata management function and customizable workflow functions were developed to make the repository functions extensible. In developing a cloud service based on WEKO3, we applied a microservice architecture and a container architecture to achieve flexible system operation. As a result, WEKO3 and its cloud service provide a national-level institutional repository cloud service for open science.

A shared governance and a sustainable funding model for HAL, the French National Open Archive

Bénédicte Kuntziger-Planche, Agnès Magron, Nathalie Fargier

Centre pour la Communication Scientifique Directe (CCSD) - CNRS, France

HAL is the French national open archive, supported by the French Ministry of Higher Education, Research and Innovation and three research organisations: CNRS, INRIA and INRAÉ. HAL provides portals for institutions, allowing them to manage their scientific output. Today, HAL hosts a network of 124 institutional archives for universities, research institutions and higher education schools, sharing their common publications.

The growth of HAL led us to consider a new economic model and a new governance model, in order to take into account the diversity of needs and to ensure the sustainability of the archive. All partners of the archive were involved in the process, leading to a new price list for institutional portals, based on the number of institutional beneficiaries of the portals, and a shared governance model for CCSD, the service unit that manages HAL. These new models will be implemented in the coming months.

Leveraging Open Services to Enhance Institutional Research Tracking Workflows

Yasmeen Alsaedi, Daryl Grenz, Mohamed Baessa

King Abdullah University of Science and Technology (KAUST), Saudi Arabia

In order to support the implementation of an institutional open access policy, adopted in 2014, the KAUST University Library developed publications tracking processes with the additional aim of making the institutional repository a reliable source for bibliographic information about the university’s research outputs. The code underlying these processes has now been updated and combined into an open source software application, the Institutional Research Tracking Service (IRTS), with a public version planned for release prior to the conference. This presentation will briefly introduce the structure and functionality of the IRTS application and walk through the different workflows it supports, before focusing on recent enhancements to the service, especially our use of the beta version of the new Sherpa Romeo API and of the Unpaywall API. We will also present an assessment of the initial effect of these improvements on the workflow, and the impact they have had on deposit rates of full text materials to the repository. Finally, we will give examples of how this service reinforces the interest of other university stakeholders in reusing the research information from the repository for purposes such as annual reporting, research evaluation, and maintenance of up-to-date publication lists on websites.

9:00am - 9:55amPresentations session 4

Open for All? Addressing the Need for Regulating Access in the Disciplinary Repository PsychArchives

Lea Gerhards, Peter Weiland, Roland Ramthun, Christiane Baier

Leibniz Institute for Psychology (ZPID), Germany

Accommodating a variety of digital research objects, including articles, preprints, research data, code, supplements, and multimedia objects, PsychArchives, the disciplinary repository for psychological science, requires a differentiated sharing level concept regulating access to and usage rights of repository content. Data sharing according to the FAIR principles is called for by more and more funding institutions, such as the European Research Council (ERC), and is increasingly a prerequisite for the publication of scientific articles in professional journals. At the same time, researchers are expected to follow recommendations on data sharing, e.g. by the German Psychological Society (DGPs), which – taking into account European and national data protection laws – postulate that a restriction of access to research data is sometimes warranted. Providing a solution for this complex situation, PsychArchives offers a number of different sharing levels, ranging from open and immediate access to more restrictive access categories. Each sharing level is equipped with specific licensing options. Allowing for the allocation of sharing levels on the level of individual files, PsychArchives presents a flexible and granular solution for archiving and making accessible psychological research output in line with best practice standards, recommendations by scientific societies as well as ethical and data protection requirements.

4TU.ResearchData: Building the community

Connie Clare

4TU.ResearchData, Netherlands

4TU.ResearchData is an international data repository for science, engineering and design. Established as an initiative of three Dutch technical universities (TU Delft, TU/Eindhoven and the University of Twente), 4TU.ResearchData is a cross-institutional collaboration that supports researchers in making their data findable, accessible, interoperable and reusable (FAIR).

The recent migration of content to Figshare provided researchers with a repository that boasts new features, including integration with GitHub and publication of confidential data under embargo. Despite these technical advancements, 4TU.ResearchData appreciates that providing ‘good’ technical infrastructure is not sufficient to make data FAIR and drive a culture change toward Open Science.

Community building is essential to provide researchers with discipline-specific support and guidance. The goal of the 4TU.ResearchData community, which includes data stewards, data supporters and researchers from the partner institutions, is to provide an inclusive space for exchanging knowledge and best data sharing practices. Programming includes working groups, community calls, an online platform and blog to publicise achievements of community members. News and events are also shared in a monthly newsletter. Future programming includes launching calls for applications for the ‘FAIR Data Fund’, and the development of a global Fellowship Programme to advance 4TU.ResearchData’s mission and vision of FAIR data.

Shared repositories: building (multi-tenancy) repository services at the British Library

Sara Gould, Jenny Basford, Torsten Reimer

The British Library, United Kingdom

In 2018, a report by the Universities UK Open Access Repositories Working Group raised concerns about repository sustainability and proposed a range of actions that could support open repositories. It also posed the question of whether the integration of separate institutional repositories and increased use of shared services might help to pool resources, reduce costs and increase visibility, citation and impact.

While shared repositories are still unusual, in November 2019, the British Library launched its shared open access repository. It brings together the research outputs of six cultural heritage organisations – available as separate repositories as well as through a shared search interface. To our knowledge, it is the first example of a shared repository that takes advantage of the multi-tenancy features of the open source Samvera Hyku repository system. This paper discusses the nature, challenges and benefits of developing a multi-tenant repository system, and the BL’s progress towards a shared repository service for a wider range of partner organisations. Based on the lessons learned so far, we will discuss the potential of open source-based, shared repositories to contribute to an open knowledge environment, and how this project has already opened up previously closed content.

9:00am - 10:10amDeveloper track session 1

CAPE: A javascript clientside repository

Christopher James Gutteridge

University of Southampton, United Kingdom

At Southampton we used to use EPrints to create “microrepositories” which are basically using the repository as a way to interact with a research dataset. This is expensive and not very sustainable.

Sustainability of small research-output websites is an ongoing issue as many have complex and bespoke back end needs.

To address both of these we’ve developed a pure javascript repository system called CAPE which uses a JSON file for the metadata and field configuration. It’s a work in progress but other people may find either the tool or the ideas useful.
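As a rough illustration of the idea of one JSON file carrying both field configuration and metadata, the sketch below loads such a file and filters its records. It is written in Python for brevity (CAPE itself is JavaScript), and the layout, the field names and the helper functions `load_dataset` and `filter_records` are all assumptions made for this example, not CAPE's actual schema or API.

```python
import json

# Hypothetical dataset file: this layout and these field names are assumptions
# for illustration only, not CAPE's real format.
DATASET_JSON = """
{
  "fields": [
    {"id": "title", "label": "Title", "type": "text"},
    {"id": "year", "label": "Year", "type": "integer"},
    {"id": "tags", "label": "Tags", "type": "list"}
  ],
  "records": [
    {"title": "Tide gauge readings", "year": 2019, "tags": ["ocean", "sensors"]},
    {"title": "Campus bird survey", "year": 2020, "tags": ["ecology"]},
    {"title": "Harbour sensor logs", "year": 2020, "tags": ["ocean"]}
  ]
}
"""


def load_dataset(text):
    """Parse the JSON and check that records only use configured fields."""
    data = json.loads(text)
    known = {field["id"] for field in data["fields"]}
    for record in data["records"]:
        unknown = set(record) - known
        if unknown:
            raise ValueError(f"unconfigured fields: {sorted(unknown)}")
    return data


def filter_records(data, field_id, value):
    """Return records whose field equals value (or contains it, for list fields)."""
    field_types = {field["id"]: field["type"] for field in data["fields"]}
    matches = []
    for record in data["records"]:
        current = record.get(field_id)
        if field_types[field_id] == "list":
            if current is not None and value in current:
                matches.append(record)
        elif current == value:
            matches.append(record)
    return matches


dataset = load_dataset(DATASET_JSON)
ocean_titles = [r["title"] for r in filter_records(dataset, "tags", "ocean")]
recent_titles = [r["title"] for r in filter_records(dataset, "year", 2020)]
```

Because everything lives in one static file, a site built this way needs no server-side database, which is the sustainability argument made above.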

I would give a live tour of the tool and its features. I can prerecord. It would be based on this blog post.

The IOI App: How and why to establish an institutional ORCID integration outside of your repository platform

Yasmeen Alsaedi, Daryl Grenz, Mohamed Baessa

King Abdullah University of Science and Technology (KAUST), Saudi Arabia

For many institutions, their ability to integrate with ORCID is hindered by the absence of robust integration support within their repository software. While progress has been made in some platforms, overall the enthusiasm for ORCID in the repository community has not translated into fully integrated solutions easily adopted by large numbers of universities, often leaving them frustrated in their efforts to pursue system integrations with ORCID.

Our own initial integration between our DSpace repository and ORCID provided minimum functionality and relied on a manual process for transfer of information into the repository. Recently, we have revamped our tool into a more fully featured institutional ORCID integration directly connected to DSpace via REST API and released it as an open source software application.

This presentation will introduce the structure and functionality of the IOI application as well as demonstrate how it can be configured for reuse by other institutions with a variety of needs. We will also explain why we foresee continuing to maintain an institutional ORCID integration as a service closely connected to, but separate from, the main institutional repository platform.

Developing COrDa: The COmmunity Orcid Dashboard

Adam Vials Moore1, Monica Duke1, Kirsty Wallis2, Owen Stephens3

1Jisc, United Kingdom; 2University College London; 3Owen Stephens Consulting Ltd

The Community ORCID Dashboard is a project to bring together multiple sources of institutional and open data around ORCID iD and rationalise them into one central reporting and visualisation offering.

Two features, 10 lines of code

Taylor Mudd

Haplo, United Kingdom

A well-designed plugin API speeds up development of new features for open source repositories and enables developers to deliver new functionality in a seamless user experience. During this demonstration the presenter will add two fully integrated features to a Haplo Repository application using only 10 lines of code.

Logical Item Filtering in DSpace

Kim Michael Shepherd, Pascal-Nicolas Becker

The Library Code GmbH

The ability to logically filter items in DSpace using Spring configuration introduces powerful new ways to apply existing tools conditionally. This presentation discusses a use case where items should only have a new DOI registered under certain circumstances, e.g. metadata matches a regular expression, or some number of bitstreams are present.

To support this functionality, The Library Code developed a new Spring-based item logic filtering service inspired by the filtering configuration used in XOAI. The DSpace logical item filter supports complete boolean algebra. This service allows boolean logical statements of varying complexity to define new filters which can then be used by other services to test items, and act accordingly on the result. The result of an item test is ultimately determined by a set of simple Java classes that inspect an item and return a boolean value. These conditions are easy to add and extend, making any kind of item-level test possible.

In this presentation we review the item filter code, show the result of wiring a filter service into the DSpace DOI provider, and provide demonstrations of filtering under various conditions. Other potential applications of logical item filtering will also be discussed.
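The idea of composing simple boolean conditions into a filter can be sketched as follows. This is a minimal Python illustration of the pattern only: the real service is Java wired through Spring configuration, and the class names here (`Item`, `MetadataMatches`, `MinBitstreams`, `And`, `Or`, `Not`) and the metadata values are hypothetical stand-ins, not DSpace classes.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Item:
    """Stand-in for a repository item; not the DSpace Item class."""
    metadata: dict = field(default_factory=dict)
    bitstreams: list = field(default_factory=list)


class MetadataMatches:
    """Condition: any value of a metadata field matches a regular expression."""
    def __init__(self, field_name, pattern):
        self.field_name = field_name
        self.pattern = re.compile(pattern)

    def test(self, item):
        values = item.metadata.get(self.field_name, [])
        return any(self.pattern.search(value) for value in values)


class MinBitstreams:
    """Condition: the item has at least `minimum` bitstreams."""
    def __init__(self, minimum):
        self.minimum = minimum

    def test(self, item):
        return len(item.bitstreams) >= self.minimum


class And:
    def __init__(self, *operands):
        self.operands = operands

    def test(self, item):
        return all(operand.test(item) for operand in self.operands)


class Or:
    def __init__(self, *operands):
        self.operands = operands

    def test(self, item):
        return any(operand.test(item) for operand in self.operands)


class Not:
    def __init__(self, operand):
        self.operand = operand

    def test(self, item):
        return not self.operand.test(item)


# Hypothetical rule: register a DOI only for theses or datasets that have at
# least one file and do not already carry a DOI.
doi_filter = And(
    MetadataMatches("dc.type", r"^(Thesis|Dataset)$"),
    MinBitstreams(1),
    Not(MetadataMatches("dc.identifier.doi", r".+")),
)

thesis = Item({"dc.type": ["Thesis"]}, ["thesis.pdf"])
article = Item({"dc.type": ["Article"]}, ["article.pdf"])
should_register = [doi_filter.test(thesis), doi_filter.test(article)]
```

Every condition and operator exposes the same `test` method, so a service such as a DOI provider only needs to hold one filter object and call it, however complex the expression behind it.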


12:00pm - 12:55pmMinute Madness

Status of students’ graduation (masters) theses in repositories of six European universities

Danijel Gudelj1, Ljiljana Poljak2, Vicko Tomić1, Mirta Matošić2, Matko Marušić1, Ana Marušić3

1ST-OPEN, University of Split, Split, Croatia; 2Split University Library, Split, Croatia; 3Department of Research in Biomedicine and Health, University of Split School of Medicine, Split, Croatia

We studied the status of students’ graduation (masters) theses in repositories of six European universities, which are all partners in The European University of the Seas (SEA-EU) Association. Our interest in that subject stems from our work with the University of Split overlay+ journal ST-OPEN, which transforms students’ graduation theses into research reports and publishes them as scholarly articles in the journal. ST-OPEN thus contributes to the personal advancement of young graduates, increases the quality of teaching at the graduate levels, and increases production and visibility of the University. We searched the official repositories of the universities of Brest (France), Cadiz (Spain), Gdansk (Poland), Kiel (Germany), Malta (Malta), and Split (Croatia). The first findings are not encouraging. Besides their organizational differences, the six universities differ immensely in their policies of posting graduation theses in their repositories. They have very different internal organizations, including the type and number of schools, institutes and educational programs. Also, they differ overall and within themselves in the openness of access to graduation theses.

Repositories and publishers: AgEcon Search forging new relationships

Linda Eells, Julie Kelly

University of Minnesota, United States of America

Many repositories do not work with publishers on a regular basis but that is not true for AgEcon Search, the subject repository in agricultural and applied economics. With journal articles as the most common document type and over 40 professional societies among the 340 groups that contribute material, we have formed a number of different relationships with publishers. The most common is that AgEcon Search hosts material initially created and made available by a small (often professional society) publisher. In one uniquely beneficial case, we receive and upload pdfs of each individual article from a large commercial publisher after an embargo period. In an opposite, negative situation, we were asked by a publisher to remove years of articles after a society moved to a large commercial publisher and all older material was placed behind the publisher’s paywall. In recent, new collaborations, we assist publishers with creating DOIs, and are negotiating to serve as the pre-print server for one society journal that contracts with a commercial publisher. We encourage other repositories to consider how they might work together with publishers to make research materials more widely available, including institutional repositories whose faculty serve on national or international society journal editorial boards.

A Framework for an Arabic Terminology Management System (TMS) using Artificial Intelligence

Sherine Mahmoud Eid

Bibliotheca Alexandrina, Egypt

Many translation projects in the Arab world were not implemented, or were delayed, due to manual translation processes and/or budgetary reasons. The poster proposes the use of artificial intelligence (AI) in the form of Arabic machine translation (MT) to develop an Arabic Terminology Management System (TMS), followed by human post-editing of machine translation output. The system shall ease the translation process and generate output products, such as glossaries and taxonomies.

Advancing Hyku Project Update

Ellen Catz Ramsey1, Brian Hole2, Ilkay Holt3

1University of Virginia, United States of America; 2Ubiquity Press; 3The British Library, United Kingdom

Join a Year Two project update for the Advancing Hyku collaborative project, which aims to support the growth of green open access through institutional repositories. The deliverables of the project are to introduce significant structural improvements and new features to the Samvera Community's Hyku platform. The project partners are University of Virginia Library, Ubiquity Press and the British Library, with funding from Arcadia, a charitable fund of philanthropists Lisbet Rausing and Peter Baldwin. The project began October 2019 and is scheduled to conclude with a rollout of the Advanced Hyku platform community-wide after February 2022.

SANDIMS: South African National Geophysical Data and Instrumentation Management System

Kate Niemantinga, Pierre J Cilliers

South African National Space Agency, South Africa

The archiving and dissemination of geophysical research data collected over Southern Africa, at the Antarctic research base (SANAE IV) and at the South African high latitude observatories on Marion Island and Gough Island, has until recently been fragmented and inaccessible to international researchers.

At the Space Science Directorate of the South African National Space Agency (SANSA) in Hermanus we have implemented a scientific data portal called the South African National Geophysical Data and Instrumentation Management System (SANDIMS) which for the first time makes the geophysical data collected and used by SANSA available through a single data portal.

Our aim is that the system will meet national and international obligations and expectations, as well as raise the standard of South African research. The system’s unique database will contain high-quality data from areas in space that, potentially, could supply information for unanswered scientific questions and enhance scientific development.

The paper will share insights on various topics, including Data Policies, Licensing and Data Discovery, and showcase tools for researchers and practitioners.

Geodisy -- New geospatial data discovery for Canadian research data

Eugene Barsky

University of British Columbia, Canada

With the rapid proliferation of research data, it is vital to create innovative tools for data discovery and access. This is the goal of Portage’s Geodisy project, an open-source spatial discovery tool for Canadian interdisciplinary open research data. Geodisy provides a map-based search available alongside the Federated Research Data Repository (FRDR), a national discovery layer indexing over 70 Canadian open repositories. Geodisy is intended for users with diverse experience levels and subject interests and is designed to be accessible for those without GIS knowledge. Data and metadata are extracted from the native repositories and are discoverable based on their location, and individual geospatial files are previewed as visual overlays. For any research that relates to geospatial location, this tool provides a new and useful form of visual discovery.

Understanding IR Impact: What Do Users Do with Our Stuff?

Wendy Walker

University of Montana, Missoula, United States of America

Like many institutions, the University of Montana’s institutional strategy includes a focus on the impact of faculty and student work(1). It is relatively easy to use quantitative data such as download counts(2), location information(3), and citations(4,5) to help demonstrate the IR’s reach and impact; however, there are limits to using these numbers(5), especially when attempting to understand impact outside an academic context. Could knowing more about how or why users use IR content address our own curiosity, help us understand the IR’s impact more completely, and help us contextualize and describe its impact more effectively than we can by reporting quantitative data alone? In April 2018 we added a link to the cover pages that are included with most of the downloadable items in our IR. The link directs users to a form where they can tell us how access to the IR item benefitted them. To date, we have received just over 200 responses. While we now know more about the wide range of uses of our IR content, we are still determining if/how to utilize this information to help demonstrate impact. Our initial evaluation has raised a host of new, useful questions that will inform next steps.

Atrium Repository: diffusion of cardiology knowledge

Francijane Oliveira da Conceição1,2, Cyntia Mendes Aguiar1,2, Renato Cerceau1,3, Jorge Zavaleta4, Cristiane da Cruz Lamas1,2,5

1Instituto Nacional de Cardiologia - INC, Brazil; 2Fundação Oswaldo Cruz - FIOCRUZ, Brazil; 3Universidade Estadual do Rio de Janeiro , Brazil; 4Universidade Federal do Rio de Janeiro - UFRJ, Brazil; 5Universidade UNIGRANRIO, Brazil

The Instituto Nacional de Cardiologia (INC), a public tertiary referral cardiology centre located in Rio de Janeiro, Brazil, aims to structure and organize all available institutional data into one single space that is easily accessible to its workers and also to the wider scientific community and general public. At present, these data are dispersed in different sources, with little accessibility and low safety.

International experience suggests the most adequate way to preserve the institutional memory and to diffuse knowledge is through a structure called a repository. Technological tools make the creation of a repository easier nowadays. Our project aims to structure the ATRIUM, INC’s institutional repository.

The ATRIUM will store all relevant institutional data and allow members of the public safe and timely access to it.

RCIN - Digital Repository of Scientific Institutes in Poland

Błażej Betański, Tomasz Parkoła, Natalia Jeszke

Poznan Supercomputing and Networking Center, Poland

This poster presents RCIN, a digital repository developed in the OZwRCIN project, which is an example of cooperation between thematically diverse scientific institutes across Poland. RCIN is developed to give access to and promote digital resources coming from various Polish scientific institutes. The poster covers information on technical solutions applied to increase accessibility of data, e.g. various metadata schemes like Dublin Core and Darwin Core, accessibility in terms of user interface as well as large-scale digitisation and underlying infrastructure (software and hardware). RCIN is built as a large-scale repository that holds digital assets from all participating institutions and provides multiple ways to access the data, including a web portal for the general public, multiple smaller institutional repositories that give access to resources provided by specific institutions and various APIs.

Changes and impacts of the COVID-19 pandemic response on Arca

Tiago Martins da Costa Ferreira, Claudete Fernandes de Queiroz, Luciana Danielli de Araujo, Raphael Belchior Rodrigues, Éder de Almeida Freyre, Andréa Gonçalves do Nascimento, Angelo José Moreira Silva, Catarina Barreto Malheiro Pereira, Rita de Cassia da Silva, Leonardo Simonini Ferreira

Fundação Oswaldo Cruz. Instituto de Comunicação e Informação Científica e Tecnológica em Saúde

This work presents the changes to, and impacts on, Arca, the Oswaldo Cruz Foundation’s (Fiocruz) Institutional Repository, arising from the response to the COVID-19 pandemic, which has been terrorizing the world. Due to the recognition from the World Health Organization, the importance of the scientific research of the Institution and the technological progress related to vaccine development, supplies and innovative treatments have become crucial. In this epidemiological scenario, including scientific documents and research material of this Institution related to COVID-19 in Arca was deemed critical and of great relevance. It is important to note that the changes, adaptations and focused efforts for the pandemic response, especially with the reduced issuing time, are intended to increase visibility, promote health with public goods and provide access to the knowledge produced by Fiocruz to society as a whole.

Repository Re(volution): AgEcon Search Goes Global

Linda Eells, Julie Kelly

University of Minnesota-Waite Library, United States of America

The AgEcon Search subject repository has developed over 25 years from a tiny local repository with 50 papers, into an international body of literature with 155,000 papers in agricultural and applied economics. Many of its most active contributing members are in parts of Africa and Asia that have long had difficulty getting their work out into the world (Kelly & Eells 2015; Ihli 2019). AgEcon Search contains free, full-text papers of many types covering agriculture, development, energy, natural resources, food security, and related areas. It receives over 15,000 visits every day from nearly every country in the world and is the premier resource in agricultural economics. We will discuss the evolution of the repository, including the decision to migrate from DSpace to TIND IR in 2017, and financial sustainability challenges. We will highlight how AgEcon Search enables researchers and practitioners to distribute their work to the world while serving as a free resource to all. Repositories have a unique role in helping those who have difficulty accessing – either as producers or as users – the more formal (and expensive) literature, so it is critical to maintain relevance and financial stability as we move into the future.

PeruCRIS: A National Research Information Infrastructure based on DSpace-CRIS

Cesar Olivares1, Francisco Talavera1, Abel del Carpio1, Susanna Mornati2, Andrea Bollini2, Claudio Cortese2, Corrado Lombardi2, Alfonso Maza3, Ana Puente de la Puebla3

1Concytec, Peru; 24Science, Italy; 3Semicrol, Spain

PeruCRIS is the Peruvian project for setting up and operating a National Information Network on Science, Technology and Technological Innovation. It is based on open source software and open standards, specifically as an extension of the already existing network of Peruvian open access repositories. The network is designed to be interoperable with CRIS systems and Open Access Repositories, and also to allow for direct submission into the National central hub (PeruCRIS Platform) from institutions and researchers. PeruCRIS Platform goes beyond mere aggregation, supporting data normalization, enrichment and curation for collected data, and direct editing in the Directorios. These roles are reserved for Concytec staff, and serve as a means for ensuring data completeness, accuracy and general quality. A particular CERIF profile and a set of controlled vocabularies are being developed in order to accommodate particular information needs of national scope. An initial partial release of main directories is scheduled for Q2 2021.

Availability of open government data in the Maghreb countries

Elsayed Elsawy1, Ahmed Maher Khafaga Shehata2

1Sultan Qaboos University, Oman, Tanta University (Egypt); 2Sultan Qaboos University, Oman, Minia University (Egypt)

Recently, many Arab governments have adopted open government data policies, whereby web technologies are harnessed to provide access to government data. The study's problem lies in the fact that open data practices in the Arab world are relatively recent and are still in their nascent stages, making those practices the subject of constant criticism and evaluation to develop and improve them to be consistent with international practices. Moreover, no research has been carried out to identify the extent of the Maghreb Arab states' progress in this field. The government open data portals in the Maghreb countries (Algeria, Libya, Mauritania, Morocco, and Tunisia) are considered the gateway to studying the reality of open data services in those countries. This study explores the current practices used in preserving and sharing data in open data portals in the Maghreb countries and assesses whether the structure and organization of data on these portals are consistent with the goals for which these portals were established. The study sample included three open data portals on the Internet: those of Tunisia, Algeria, and Morocco. In order to analyse the portals selected in the study sample, a list of 47 criteria was developed.

Sherpa Romeo – our roadmap

Karen Jackson, Jane Anders

Jisc, United Kingdom

This poster will present the Jisc Open Research team’s roadmap for the development and enhancement of Sherpa Romeo over the coming years, and invite feedback, comments and questions from the user community. We will be looking at the ways we plan to enhance the service, infrastructure and usability of Romeo, to ensure we are continuing to meet the needs of all our users throughout the changing OA policy landscape.

Discovery after migration

Ben Summers, Taylor Mudd

Haplo, United Kingdom

Institutions are often apprehensive when moving their repository to a new repository system, as this will affect hundreds of thousands of records collected over decades and the discoverability of their institution's research.

This poster will present statistics on item usage and discovery after repository migration, using data from an external independent repository monitoring service to show an increase in usage of up to 250% and explain how this can be achieved.

Introduce Impact-Pathways in a CRIS – support societal impact orientation in research projects and funding processes

Birge Michaela Wolf1, Doris Lange1, Thorsten Michaelis1, Andrea Moser1, Stefanie John1, Andreas Abecker2, Stefan Lossow2, Lucia Hahne2, Andrea Bollini3, Giuseppe Digilio3, Susanna Mornati3

1University of Kassel, Germany; 2Disy; 34Science

Societal challenges require research contributions to solve them. Accordingly, societal impact assessment is an object of increasing interest in publicly funded research. Some countries have built elaborate national systems, applied on the level of research institutions. The approach of the SynSICRIS project (Synergies for Societal Impact in Current Research Information Systems) focuses on societal impact creation and assessment in research projects. Therefore, a repository/CRIS system is being built with additional entities related to societal impact and functionalities to record the information during funding processes.

Our system is built upon the open source software DSpace-CRIS. The additional entities include process-oriented indicators that represent an increase in the likelihood of societal impact. The additional functionalities allow planning, documenting and structuring contributions of a project to societal impact via interfaces to build impact pathways and working plans. The development built on a synthesis of existing approaches, participatory requirements analysis and agile software development.

Using such a system at the funding body makes it possible to assess information related to societal impact without additional documentation burden for researchers, allows sensitive project information to be managed, and supports the dissemination, reuse and sharing of outputs and information tailored to actors in practice and society.

A Recommendation System for an Open Archive

Gulce Bal Bozkurt, Gozde Boztepe Karatas

Middle East Technical University, Turkey

In this study, we developed a recommender system for OpenMETU, which is the open archive of Middle East Technical University. Our system recommends items by using a content-based approach. In the content-based approach, the properties of items are vital because the recommendations are based on them. Our recommendation system is based on the author, abstract, title, and subject similarity. To calculate these similarities, first of all, we extract features of each item by using natural language processing algorithms such as TF-IDF and Universal Sentence Encoder. We use cosine distance to measure the similarity between two items. Later, we give different weights to each of the similarities to calculate the overall score. Our system recommends the 5 most similar items to the visitor who visits an item in our archive.
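The scoring described above can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions: it covers the TF-IDF features but not the Universal Sentence Encoder, it uses cosine similarity (one minus the cosine distance), and the field weights in `WEIGHTS`, the smoothed IDF formula and the sample records are invented for the example rather than taken from the OpenMETU system.

```python
import math
from collections import Counter

# Hypothetical field weights; the actual weights are not given in the abstract.
WEIGHTS = {"title": 0.3, "abstract": 0.4, "subject": 0.2, "author": 0.1}


def tfidf_vectors(texts):
    """Build one sparse TF-IDF vector (term -> weight) per text."""
    docs = [Counter(text.lower().split()) for text in texts]
    doc_freq = Counter(term for doc in docs for term in doc)
    n = len(docs)
    vectors = []
    for doc in docs:
        total = sum(doc.values())
        vectors.append({
            term: (count / total) * (math.log((1 + n) / (1 + doc_freq[term])) + 1.0)
            for term, count in doc.items()
        })
    return vectors


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(items, target, k=5):
    """Return the indices of the k items most similar to items[target]."""
    scores = [0.0] * len(items)
    for field_name, weight in WEIGHTS.items():
        vectors = tfidf_vectors([item[field_name] for item in items])
        for i, vector in enumerate(vectors):
            scores[i] += weight * cosine_similarity(vectors[target], vector)
    candidates = (i for i in range(len(items)) if i != target)
    return sorted(candidates, key=lambda i: scores[i], reverse=True)[:k]


# Invented sample records, for illustration only.
items = [
    {"title": "open access repository usage statistics",
     "abstract": "we analyse download statistics of an open access repository",
     "subject": "open access", "author": "a. yilmaz"},
    {"title": "repository download statistics for open access theses",
     "abstract": "download statistics of theses in an open access repository",
     "subject": "open access", "author": "a. yilmaz"},
    {"title": "turbulence modelling in pipe flow",
     "abstract": "numerical simulation of turbulent pipe flow",
     "subject": "fluid mechanics", "author": "b. demir"},
    {"title": "seismic risk assessment for bridges",
     "abstract": "probabilistic seismic risk assessment for highway bridges",
     "subject": "civil engineering", "author": "c. kaya"},
]
top_two = recommend(items, target=0, k=2)
```

Computing one similarity per field and then combining them with weights keeps each signal interpretable, so the weights can be tuned without recomputing the features.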

The Scholarship of The Ohio State University: Open for All

Maureen Walsh

The Ohio State University, United States of America

The Ohio State University Libraries promotes innovative research and creative expression and curates and preserves information essential for scholarship and learning. Making the research and scholarship of Ohio State’s faculty, staff, and students openly available allows us to live our land grant mission – sharing knowledge and culture with the people of Ohio, the nation, and the world.

This 24x7 will discuss current developments with The Ohio State University Libraries’ "Transforming the Scholarly Publishing Economy" strategic initiative. It will highlight Ohio State's Open Access agreements, the ongoing development of partnerships across campus and with our consortia and peers, early outcomes of our faculty, scholarly society, and publisher engagements, and thoughts towards future opportunities and challenges.

Interoperability Standards for Distributed Collaboration in InvenioRDM

Sara Gonzales1, Matthew B. Carson1, Guillaume Viger1, Lars Holm Nielsen2, Kristi L. Holmes1

1Northwestern University, United States of America; 2European Organization for Nuclear Research (CERN), Geneve, Switzerland

CERN (The European Organization for Nuclear Research) has collaborated for the past 2 years with a widely distributed international community on the development of InvenioRDM, an extensible turn-key repository solution. Differing metadata standards utilized for description and access across the globe have led to fertile discussions among the team regarding the base data model to employ and field-specific controlled vocabularies. As discussions evinced the need to consider increasing stakeholder needs and requirements, it became clear that a dedicated group was needed to support systematic recording and discussion of partner requirements in order to reach decisions on the data model that could have a positive impact on all adopters. A metadata interest group was formed, with membership open to all project stakeholders, and regular meetings set to discuss user needs, field-by-field. Discussions in this group have helped to reduce the boundaries between distributed stakeholders and bolster the repository community as a whole through a joint, democratic effort open to all, including technical and non-technical participants.

Zenodo spam detection using neural networks

Pablo Panero

CERN, Switzerland

Nobody wants to get something unwanted, like spam. The increase of spam content has become a problem in our digital era, and therefore it also affects digital repositories. Hosting spam can have an impact on a service, i.e. the actual hardware costs of storing it, getting skewed usage statistics, including distribution of material that violates copyright, and, most importantly, serving undesired content to users.

Zenodo is a generalist research repository fostering open science practices. As the barrier for submissions is low, it is an easy target for spam. The repository’s staff has spent many hours manually detecting spam content, a process now assisted by an automated spam classification system, which still does not produce satisfactory results.

Improvements of this classifier were based on an in-depth study of Zenodo’s data, a descriptive analysis, and feature extraction to corroborate expert knowledge gathered over years by Zenodo’s staff, as well as on a literature review of related topics such as spam classification in emails.

Several types of neural network models were tested, displaying promising results for future integration. However, as the false positive rate is still unacceptable, Random Forest classifiers still prevail over neural network models.
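
The false positive rate mentioned above is the share of legitimate records that a classifier wrongly flags as spam. A minimal sketch of that calculation follows; the function name and all counts are illustrative assumptions, not figures from Zenodo.

```python
def false_positive_rate(false_positives: int, true_negatives: int) -> float:
    """Return FP / (FP + TN): the fraction of legitimate records flagged as spam."""
    legitimate_total = false_positives + true_negatives
    if legitimate_total == 0:
        raise ValueError("no legitimate records to evaluate")
    return false_positives / legitimate_total


# Hypothetical counts for two classifiers scored on the same 1,000 legitimate records.
candidates = {
    "random_forest": false_positive_rate(false_positives=5, true_negatives=995),
    "neural_network": false_positive_rate(false_positives=20, true_negatives=980),
}

# Prefer the model that wrongly flags the fewest legitimate records.
best_model = min(candidates, key=candidates.get)
```

In a repository setting a false positive means a genuine deposit is treated as spam, which is why this rate, rather than overall accuracy, can decide which model is acceptable.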

DataCORE - Grappling With Big Files and Big Problems

James Halliday, Brian Keese

Indiana University, United States of America

IU DataCORE is a brand new Samvera-based repository at Indiana University focused exclusively on research data. This poster will show how the system works, including detailing how data flows in and out and some of the challenges we overcame in implementing it. The changing landscape of handling large data and how to move it around has necessitated some updates to our workflows that we will detail.

Is it interoperability or is it integration?

Paul L.S. Stokes, Tamsin Burland, John Kaye, Howard Williams

Jisc, United Kingdom

For those not of a technical bent, there is a great deal of confusion surrounding the terms 'interoperability' and 'integration', especially when it comes to the exchange of information in 'black-box' systems. The average depositor of data doesn't want (or need) to know how data and metadata move around a system as long as what they put in eventually gets to where it should be, in the form it should be and can be seen by the appropriate people. However, a basic understanding of the similarities and differences between 'interoperable' systems and 'integrated' systems and the pros and cons of each approach when it comes to depositing, preserving and discovering data will help users and administrators make informed decisions when it comes to the specification of data management systems, and will help inform their day-to-day data management practices.

This poster is intended to highlight those similarities and differences, pros and cons.

Interoperability Intersection: Partnerships in Open Science Infrastructure

Eric Olson

Center for Open Science, United States of America

Enabling interoperability throughout the research lifecycle often aligns with or is a key component of the missions of open source tool providers. Even before the introduction of the FAIR framework, open source tools and infrastructure have emphasized opportunities to connect the systems and workflows that researchers rely on so that research communication can be faster, more efficient, and more secure.

OSF, like many of our friends in the open science infrastructure space, is strengthened both as a technical tool and as a ‘community of communities’ by integrations with other platforms and services. In this poster we will describe our product philosophy that emphasizes interoperability and meeting researchers where they are, while also discussing our experiences with partnership building across outstanding organizations and providers in research. We will also visualize current and upcoming integrations that strengthen both OSF and partner tools while also making it much easier for researchers to manage and collaborate across the research data, planning, and outcomes.

“Precedented”: Public Health, Open Access Infrastructure, and Interrogating Power in Repository Debates

Michael Scott1, Kate Dohe2

1Hispanic American Periodicals Index, UCLA; 2University of Maryland Libraries

In January 2020, researchers released the initial DNA sequence for the COVID-19 virus on Twitter, bypassing both traditional publication models and institutional open infrastructure in the interest of expediency. The global pandemic has drawn open scientific publishing into the spotlight in the mainstream American press over the past year, and citizens outside the Western research community are gaining new exposure to longstanding scholarly communications debates about public health and open science. Unlike many aspects of the COVID-19 pandemic, however, this is not “unprecedented”--freely accessible health information was also the impetus for the open access movement in Latin America years ahead of comparable efforts in the United States, and it continues with great success today. However, discussions of these platforms in US-based publications have centered on questions of prestige and functionality, rather than reach and stability--concepts that are rooted in imperialist, English-first thinking. This poster will highlight the early history of open access repositories in Latin America, their focus on regionally-produced scientific research rather than institutional efforts, the current state of these projects, and apply a critical lens to Western discourse about these projects and their impact.

Bepress to DSpace in PJs: Migrating Two Open Repositories from Home

Julia Corrice, Chloe McLaren, Jim Del Rosso, Gail Steinhart

Cornell University Library, United States of America

In 2019, Cornell University Library (CUL) made the decision to migrate multiple institutional repositories from bepress into eCommons, its locally supported DSpace instance.

When migration project planning began in February of 2020, we could not have anticipated the impact the COVID-19 pandemic would soon have on our work. We'll share some background--how Cornell ended up with multiple institutional repositories and our rationale for consolidation, and then describe the work itself, with particular focus on the challenges and approaches we used to successfully complete this project during a pandemic, with all staff making an abrupt and unplanned transition to working remotely.

1:00pm - 2:25pmPanel: Repository Rodeo

Maureen Walsh1, Danny Brooke2, Rory McNicholl3, Tim Shearer4, Taylor Mudd5, Sara Gonzales6, Arran Griffith7, Heather Greer Klein8

1The Ohio State University, United States of America; 2Harvard University, United States of America; 3University of London, United Kingdom; 4University of North Carolina at Chapel Hill, United States of America; 5Haplo, United Kingdom; 6Northwestern University, United States of America; 7Islandora Foundation, Canada; 8Samvera, United States of America

The Repository Rodeo returns for another round of questions and answers! This popular panel, featured since Open Repositories 2016 in Dublin, offers a broad overview of the main repository platforms at Open Repositories and provides an opportunity for spirited discussion amongst panelists and virtual attendees. Join community representatives from Dataverse, DSpace, EPrints, Fedora, Haplo, Invenio, Islandora, and Samvera as we briefly explain what each of our repositories actually does. We'll also talk about the directions of our respective technical and community developments. In keeping with the conference theme of “Open for all”, we’ll discuss the role of our repositories in supporting knowledge in the service of society, enabling local knowledge sharing, and supporting those beyond the conventional academic sphere.

This panel will be a great opportunity for newcomers to Open Repositories to get a crash course on the major repository options and meet representatives from each of their communities. After a brief presentation from each representative, we'll open the session up for questions from the virtual audience.

2:30pm - 2:55pmPanel: IRUS In Bloom: Expanded Openness for Institutional Repository Usage Statistics

Hannah Rosen1, Laura Wong2, Jim Ottaviani3

1LYRASIS, United States of America; 2Jisc, United Kingdom; 3University of Michigan, United States of America

The Jisc IRUS service (Institutional Repository Usage Statistics) has existed for several years, consolidating COUNTER-conformant statistics from institutional repositories and helping to demonstrate their value and impact, both internally and in comparison with other institutions. However, over the last year, IRUS has been improved in various ways that expand the service’s ability to support open knowledge. The presentation will go over three areas of improvement.

First, Hannah Rosen will discuss recent IRUS updates, which, among other things, have introduced COUNTER Release 5 conformance, removed barriers to viewing IR usage statistics, and now facilitate international usage, sharing, and awareness. Then Laura Wong will talk specifically about use cases surrounding knowledge sharing related to the COVID-19 pandemic within IRUS. Finally, Jim Ottaviani will present on the challenges and benefits of using the new IRUS interface across multiple repositories in a single institution.

3:00pm - 3:55pm24x7 session 2

Elevating Open Data: Building an Accessible Environment for Data Stewardship in Research Libraries with CADRE

Jaci Wilkinson1, Jamie V. Wittenberg2, Patricia L. Mabry3, Valentin Pentchev4, Robert Van Rennes5, Ethan Fridmanski1

1Indiana University Libraries; 2University Libraries, University of Colorado Boulder; 3HealthPartners; 4Indiana University Network Science Institute; 5Big Ten Academic Alliance

Libraries that can afford to purchase big datasets often cannot provide infrastructure to store, secure, and maintain data--or provide a viable data-mining interface for users who are not proficient coders.

The Collaborative Archive & Data Research Environment (CADRE) aims to address these issues with a cloud-based infrastructure that integrates a shared repository of big bibliometric data into a science gateway with standardized text- and data-mining capabilities. CADRE facilitates collaboration and elevates open practices by developing an open source science gateway that provides access to a repository of open and non-consumptive datasets, GUI querying capabilities, shared data-analysis tools, and reproducible research. This community-built cyberinfrastructure is supported by 11 university libraries across two academic consortia.

Now in its beta phase, the CADRE project is in the final stages of a sustainability plan, ensuring a complete tiered-pricing plan and sustainability model will be available for interested researchers at the Open Repositories 2021 conference.

Using Research Profiles as a Service (RPaS) to Populate an Institutional Repository

Agnes Gambill

Appalachian State University, United States of America

Populating an institutional repository at any academic university can be a challenge. Doing so in the middle of a global health crisis adds further complexities. So, how can librarians and repository managers create momentum and interest in IRs during COVID-19?

One solution is to offer Research Profiles as a Service (RPaS). RPaS was developed at Appalachian State University to help faculty members build their online presence and increase their awareness of and engagement with the university’s institutional repository. Response rates to RPaS service inquiries were dramatically higher than response rates to article ingest submissions. Furthermore, RPaS requests led to more article ingest submissions than traditional methods of IR outreach. This presentation will discuss the reason for creating RPaS and illustrate how and why it works.

Swallow Beyond the Repository: Development of a Custom Metadata Management Software

Tomasz Neugebauer, Francisco Berrizbeitia

Concordia University, Canada

Swallow is a custom-built open source software solution that facilitates a common need in digital collections projects: the aggregation of metadata describing digital objects of cultural or scientific significance held at many sources. The software was developed out of the goals of the SpokenWeb SSHRC Partnership Grant research network to digitize, process, describe, and aggregate the metadata of a diverse range of sound collections documenting literary and cultural activity in Canada since the 1950s. In this presentation, we introduce Swallow, the principles that drove its development, and briefly outline the context of documentary literary sound recordings and some recent developments with literary events moving online due to restrictions related to the COVID-19 pandemic.

Repository Migration: How it started, How it’s going

Nicholas Homenda, Juliet L. Hardesty

Indiana University Libraries, United States of America

Indiana University is in the process of migrating digital collections from Fedora 3 to Fedora 4. Since our 2018 presentation “The Ecosystem of Repository Migration,” we have learned that services and applications are even more central to repository migration than we originally thought. For us, this has meant launching new Digital Collections and Archives Online services and contracting with a software development company, Notch8. We will share the lesson we learned: starting simple does not just mean beginning with the least complicated digital objects; it also means keeping service and application upgrades to the minimum needed for a viable product.

Increasing the A in OA: How accessibility work in repositories should influence publisher agreements

Sadie Roosa

MIT Libraries, United States of America

The majority of publishers, even those making articles open through gold or green OA routes, are only making those articles open to a portion of the public: those who can read PDFs as they are displayed on a screen or printed on paper. As more and more academic libraries negotiate with publishers on agreements to transform scholarly publishing, we have an opportunity and responsibility to push for increased accessibility of open access content that publishers provide.

To highlight the impact agreements with publishers could have on the accessibility of open content in our institutional repositories, I’ll share results of an analysis of DSpace@MIT’s Open Access Articles collection. This analysis will show the accessibility status of content deposited both directly and indirectly from publishers between 2009 and 2020. I will then focus in on multiple changes that would benefit green OA, tying them in with other trends in publisher agreements, like auto-deposit and open licensing. I’ll lay out multiple hypothetical agreements and how each one would improve the numbers and percentages of accessible articles in our repository.

We’ve got a Digital Repository: now what do we do with it?

Kent Douglas Reynolds, Bianca Parisi, Cindy Rigg

Niagara College Canada, Canada

Until this pandemic, Niagara College’s digital repository was primarily an institutional archive. We thought of it as a place to put “things that needed a home.” Niagara College owns one instance of CORe, a multi-site Islandora repository, shared by seven Canadian colleges. Initially, Niagara College library began populating its repository with historical artifacts and other items of significance. The librarians reached out to various departments, yet despite great interest, little happened.

Then, the pandemic changed everything. The last day of in-person service at our library was March 13, 2020. Services like course reserves that required the physical presence of workers to digitize and manage resources for students and faculty were gone, and instructors were scrambling for online teaching and learning tools. We reached out to our faculty again offering the repository, this time, as a solution for teaching and learning during the pandemic. The result was rapid ingesting of learning resources. We all started thinking about our repository as a resource to fill the gaps in physical services. The pandemic provided us all with an opportunity to re-think our digital repository, re-purposing it to meet the sudden and urgent demand to support academic teaching and learning online.

3:00pm - 3:55pmPresentations session 5

Cataloging of Biological Datasets in AgDados - the Embrapa Data Repository

Marcia Izabel Fugisawa Souza, Antonio Nhani Junior, Poliana Fernanda Giachetto, Paula Regina Kuser-Falcao, Leandro Carrijo Cintra, Luiz Antonio Falaguasta Barbosa, Luiz Manoel Silva Cunha, Marcos Cezar Visoli, Tercia Zavaglia Torres

Embrapa Agricultural Informatics, Brazil

Within the scope of a pilot project to implement omic data management actions, conducted by the Multiuser Laboratory of Bioinformatics of the Brazilian Agricultural Research Corporation (Embrapa), and the Information Engineering Research Group (GPEI), from Embrapa Informática Agropecuária, the task of cataloging datasets in the Embrapa Data Repository (AgDados) was defined as a priority. The purpose of the reported study was to define and establish minimum rules for cataloging biological datasets in AgDados, guided by international standards for descriptive cataloging and compatible with the FAIR principles. In this process, four aspects were determined: minimum rules for describing datasets; metadata elements compatible with the Dataverse software; a set of common attributes to guide the description of metadata elements; and rules to guide the completion of each element and field of the Citation Metadata and Life Science Metadata. These results compose a set of basic rules for cataloging datasets and are intended to support the cataloger in describing data within AgDados. The study contributes to the organization and systematization of the main activities of cataloging biological datasets in AgDados, and to the generation of qualified metadata in accordance with international standards of descriptive representation and compatible with the FAIR principles.

Preserving Podcasts: An Institutional Repository Case Study

Erik A. Moore, Valerie M. Collins

University of Minnesota, United States of America

In response to the 2020 global pandemic, the University of Minnesota Archives sought to gather digital content documenting the public health crisis and institutional response to COVID-19. Staff identified university produced podcasts from several departments as information-rich contemporaneous content that was also at high risk of loss. Staff conducted a campus-wide survey to identify content and types of platforms to determine the steps to ingest podcasts into the institutional repository. The outcomes demonstrate that the use of an institutional repository to preserve podcast content provides many of the same benefits as for other types of traditional repository content. The inclusion of podcast media in IRs also demonstrates how moving beyond traditional academic content and formats helps to reach a broader, more community-focused audience and user base, increasing the repository impact with those outside of the academic context.

Hodgepodge or Showcase?

Gail McMillan

Virginia Tech, United States of America

Since the IR may serve the institution as a digital library, it should accurately reflect the institution’s scholarship and activities. But how can we determine whether this is the case? This study proposes selecting several microcosms, creating a controlled vocabulary of terms and phrases, searching the vocabulary at the institution-level and in the IR, and comparing the percentage of hits in each.

Assessing IRs from the perspective of their content is not a frame of reference generally used. This study compared three microcosms in the IR with the same microcosms in the institution at large--LGBTQ, Indigenous People, and Latinx. What percentage of similarity should be considered a close enough correlation? While this is a review of one IR at a doctoral-granting land grant institution, a discussion of the methodology and results could lead to cross-IR analyses.

This presentation will describe developing the vocabularies, search strategies using the advantages and short-comings of Google and Solr, and what the data revealed about VTechWorks, the IR at Virginia Tech. Studying these microcosms within the IR also revealed the use or non-use of terms and phrases by certain author communities, e.g., graduate students through their ETDs and faculty through their published articles.
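
The comparison described in this abstract (search a controlled vocabulary at the institution level and in the IR, then compare the percentage of hits) can be sketched roughly as follows. This is an assumption-laden illustration only: the function name, the vocabulary, and the sample texts are hypothetical, and plain substring matching stands in for the Google and Solr searches the study actually used.

```python
from typing import Sequence


def hit_percentage(documents: Sequence[str], vocabulary: Sequence[str]) -> float:
    """Percentage of documents containing at least one vocabulary term (case-insensitive)."""
    if not documents:
        return 0.0
    terms = [term.lower() for term in vocabulary]
    hits = sum(1 for doc in documents if any(term in doc.lower() for term in terms))
    return 100.0 * hits / len(documents)


# Hypothetical controlled vocabulary for one microcosm.
vocabulary = ["latinx", "latino studies"]

# Hypothetical text samples standing in for institution pages and IR records.
institution_pages = [
    "Latinx student association spring events",
    "Department of Physics seminar schedule",
    "New minor in Latino Studies announced",
    "Campus parking update",
]
repository_records = [
    "Thesis: Latinx voter turnout in rural counties",
    "Article: soil moisture sensing",
    "Dissertation: bridge fatigue modeling",
    "Report: library usage statistics",
]

institution_share = hit_percentage(institution_pages, vocabulary)
repository_share = hit_percentage(repository_records, vocabulary)

# A positive gap suggests the microcosm is less visible in the IR than in the institution.
gap = institution_share - repository_share
```

How small the gap must be to count as a close enough match is the open question the abstract itself raises; the sketch only makes the two percentages comparable.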

4:00pm - 4:55pm24x7 session 3

Navigating OA eBook usage data stakeholder interests to facilitate cross-platform data exchange

Christina Drummond

Educopia Institute, United States of America

Originating in 2015, the OA eBook Usage (OAeBU) Data Trust effort has brought over 100 individuals across five continents together to surface and address the issues that complicate the analysis and use of book usage metrics for decision making and open access advocacy worldwide. In its current pilot phase supported by The Andrew W. Mellon Foundation, the project has documented the complex OAeBU data supply chain, piloted open-source infrastructure for a data trust, and facilitated stakeholder led design thinking workshops to create detailed OAeBU personas and use cases. As it prepares for its operational launch, the project is developing governance and sustainability models to support the community infrastructure while meeting diverse public/private stakeholder needs and managing nuanced international data sharing regulations involving privacy, security and ethical third-party data use.

After a brief project introduction, Christina Drummond, the OAeBU Data Trust Program Officer, will note tensions that may complicate the ability for neutral, core community infrastructure to facilitate cross-platform public/private data exchange while providing data visualizations to target audiences. She will also note how the effort is navigating standards and process related opportunities that surfaced during the pilot. In closing, repository managers will learn how to join the effort.

DataCite Service Providers program: Bringing the power of DOIs to a system near you

Liz Krznarich

DataCite, United States of America

In order for repositories to leverage the power of DOIs, it’s essential that DOI registration is well-integrated into popular repository software platforms. After a series of community focus groups, DataCite launched its registered Service Providers program in July 2020 with the goals of ensuring best practice adoption, enabling 2-way communication between DataCite and system developers, and providing repositories with resources about DataCite integrations. This presentation will provide an overview of the DataCite Service Providers program, including information about how to become a registered service provider and how to find resources related to DataCite integrations in repository systems.

Reopening the Repository: Redeveloping ScholarSphere to support Open Access

Daniel Coughlin, Seth Erickson, Adam Wead

Penn State University, United States of America

Penn State released its institutional repository in 2012. After many years in production, we decided the only way to meet our new challenges was to completely rewrite the software. With the new version of ScholarSphere, we aimed to implement new features, use more sustainable frameworks, improve application architecture, modernize our infrastructure, and create opportunities to support open access. We spent just over one year in development and this presentation will focus on what we did, why we made the decisions we made, what went well and not so well, and where we are heading in 2021 and beyond.

Sustaining Open Infrastructure Communities: Evaluating Fiscal and Administrative Service Options

Heather Greer Klein1, Rosalyn Metz2

1Samvera, United States of America; 2Emory University, United States of America

Over the past two years, the Samvera Community has implemented structures meant to ensure sustainability of the Community. These include an elected governing body, a contribution model, and the hiring of a Community Manager. The Community’s existing fiscal sponsorship agreement was set to expire in mid-2021, and could not be renewed without a change to both the terms of the agreement and the rate charged for services. The Samvera Community recognized an opportunity to critically evaluate the Community’s fiscal and organizational needs and to explore the options available in both the library open infrastructure community and the wider open source software landscape.

This presentation will review the critical role of fiscal sustainability in open infrastructure communities; common models for this relationship; and the process Samvera used to evaluate fiscal and administrative needs against the options available in the open market.

We will also present how the newly selected fiscal sponsor for Samvera represents an innovative model that is becoming more common in the wider free and open source software ecosystem. This model could help other open infrastructure projects seeking to ensure the best fit for financial, legal, and community leadership sustainability.

An Engaged Campus Repository in Practice: The University of Minnesota’s Institutional Repository and the Pandemic Response

Erik A. Moore, Valerie M. Collins, Lisa R. Johnston

University of Minnesota, United States of America

Institutional repositories (IRs) provide more than open access to scholarly materials; they also provide timely and persistent access to community-focused resources serving as a “common good” for a publicly-engaged university promoting scholarship and research on their campus. This presentation will explore the fundamentals of what it means to provide repository services for public engagement and highlight strategies to incorporate the repository as a key component in community-engaged work. It will then provide an example based on the public engagement work at the University of Minnesota demonstrating how the institutional repository became a central tool for publicly-engaged offices to reach their communities in the COVID-19 pandemic response.

Taking (small) steps towards accessible repository content

Mariya Maistrovskaya

University of Toronto Libraries, Canada

In this brief overview we will share the steps and strategies the University of Toronto Libraries is taking towards improving content accessibility in its IR. The University of Toronto is the largest institution in Canada with a well established IR of 90,000+ items, primarily in PDF format. Our strategies include the addition of an accessibility provision to the submission policy, accessibility checks and remediation for mediated deposit submissions, and training for content creators. We hope that the resources, policies and procedures we have developed will help other repository managers along this path.

4:00pm - 4:55pmPresentations session 6

Notify - The Repository and Services Interoperability Project

Kathleen Shearer1, Martin Klein2, Paul Walk1

1COAR, International; 2Los Alamos National Laboratory, USA

COAR has been promoting a new and exciting future for scholarly communications. This vision was first outlined in the COAR Next Generation Repositories Initiative and further articulated in the Pubfair White Paper, which describes a distributed framework for open publishing services. In 2020, COAR published a generic technical model to enable the linking of preprints and other repository resources with external services, with an initial focus on peer review services. The technical model – which was developed based on a number of use cases provided by preprint servers, repositories, peer review services and overlay journals – applies a distributed, message-oriented approach based on W3C Linked Data Notifications (LDN). As a next step, in January 2021, COAR launched the Notify: The Repositories and Services Interoperability Project to assist early adopters in implementing a common and interoperable model that will support reviews and endorsements on distributed resources in repositories, preprints and archives. This presentation will present an overview of the technologies underpinning this model, and provide an update of the work-to-date and outcomes of the project.
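
As a rough illustration of the message-oriented approach described above, the sketch below builds a notification and wraps it in an HTTP POST addressed to a receiver's inbox, which is how W3C Linked Data Notifications are delivered. The payload uses generic Activity Streams 2.0 terms; the choice of fields, the function names, and every URL are assumptions made for illustration, not the payload patterns defined by the Notify project. No request is actually sent.

```python
import json
import urllib.request


def build_notification(preprint_url: str, review_service_url: str, repository_url: str) -> dict:
    """Build an illustrative Activity Streams 2.0 'Offer' asking a service to review a preprint."""
    return {
        "@context": "https://www.w3.org/ns/activitystreams",
        "type": "Offer",
        "actor": {"id": repository_url, "type": "Service"},
        "object": {"id": preprint_url, "type": "Document"},
        "target": {"id": review_service_url, "type": "Service"},
    }


def build_ldn_request(inbox_url: str, notification: dict) -> urllib.request.Request:
    """Wrap the notification in a POST with a JSON-LD body, as LDN senders do."""
    body = json.dumps(notification).encode("utf-8")
    return urllib.request.Request(
        inbox_url,
        data=body,
        headers={"Content-Type": "application/ld+json"},
        method="POST",
    )


# Hypothetical endpoints; a real sender would first discover the inbox URL from the target.
notification = build_notification(
    preprint_url="https://repository.example.org/preprint/123",
    review_service_url="https://review.example.org/",
    repository_url="https://repository.example.org/",
)
request = build_ldn_request("https://review.example.org/inbox", notification)
```

Because each party only needs an inbox and a shared vocabulary, repositories and review services can exchange such messages without a central hub, which is the distributed property the abstract emphasizes.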

On building a tool for finding datasets based on a list of researchers or publications

Washington L. R. Carvalho-Segundo1, Thiago M. R. Dias2

1Brazilian Institute of Information in Science and Technology (IBICT), Brazil; 2Federal Center for Technological Education of Minas Gerais (CEFET-MG), Brazil

This proposal presents a tool, developed in Python, for finding datasets related to a list of researchers or publications. This tool was applied to a list of articles that a specific group of researchers had declared in their CVs. The target group was chosen based on the highest level that these researchers had obtained in a research productivity grant (1A). As a result, from a list of 1,227 researchers and more than 225 thousand deduplicated publications, it was possible to find 12,030 related datasets, where the most frequent access type is OPEN and the five most frequent related areas of research are Zoology; Chemistry; Genetics; Physics; and Agronomy. The proposed tool will be applied to facilitate populating the research data repository of the national funding agency in Brazil, but it can also be used in other more general contexts, extracting information from open databases, such as ORCID and Wikidata.

From Hard Drives to Globus: Supporting new workflows for large data transfer in libraries

Kara Handren, Amber Leahey, Kaitlin Newson

Scholars Portal, Ontario Council of University Libraries, Canada

As data continues to grow in size and volume, there is an equally growing need to provide new technical solutions to support large data transfer within academic library data services. This search for digital solutions became even more urgent with the outbreak of the COVID-19 pandemic, as restrictions on contact meant that existing workflows were no longer possible in a remote environment. This presentation summarizes recent work by Scholars Portal, a consortial library technology service, to develop infrastructure to support the transfer of big data in delivering library data services to students and researchers. It will focus on the use of Globus, a large data transfer tool, and workflows for integration into two different data repository systems - Dataverse & Scholars GeoPortal. We will discuss current workflows for the transfer of large files, and some of the use cases in academic libraries both now and into the future.

5:00pm - 6:00pmNetworking session
10:00pm - 10:55pmKeynote: Dr Tahu Kukutai & Dr Stephanie Russo Carroll
11:00pm - 11:55pmPanel: "Broken for All": the case for persistent identifiers for digital cultural heritage resources

Adrian Turner1, Deborah Holmes-Wong2, Zahid Rafique2, John Kunze1, Megan Lohnash3, Matthew McKinley4, Allison Lund5

1California Digital Library; 2University of Southern California Libraries; 3California Revealed; 4Omeka; 5Metropolitan New York Library Council

Cultural heritage institutions within the United States have made significant investments in building unique digital collections, broadly disseminating them on the web and through networks such as the Digital Public Library of America (DPLA). The items within these collections are referenced and cited in a range of sources -- from Wikipedia and scholarly articles, to archival finding aids and classroom lesson plans. However, the application of persistent identifier schemes (e.g. DOI, ARK) for unique digital resources is relatively uncommon within the gallery, library, archive, and museum communities. Relatedly, many repositories and digital asset management systems (DAMS) widely used in these communities simply do not support persistent identifier schemes "out of the box." Hence, the URLs for the items are extremely fragile. If the collections are migrated to a different repository, the URLs can easily break -- impacting downstream networks, confounding researchers, and compounding the problem of link rot. This panel will discuss the critical role of persistent identifiers in digital collections management and dissemination and some of the challenges in applying them. It will feature case studies demonstrating how commonly-available licensed and open-source repositories can be adapted to support them.

11:00pm - 11:55pmPresentations session 7

Arkisto: a repository based platform for managing all kinds of research data

Peter Sefton1, Marco La Rosa2, Michael Lynch1

1University of Technology Sydney; 2University of Melbourne

Arkisto is a comprehensive research data management platform which is based on the use of standards for storing and describing data with a view to making data Findable, Accessible, Interoperable and Reusable (FAIR). Central to Arkisto is the notion of a repository of data, using the OCFL (Oxford Common Filesystem Layout) standard to keep data organized and accessible over the long term, with separate services to acquire data, build discovery indexes, run preservation activities and provide access to analytical software. An Arkisto repository consists of a collection of datasets, stored as OCFL Objects where each Object is described using the Research Object Crate (RO-Crate) metadata standard - at least to the level of dataset, but potentially to the level of individual file resources, or the variables or other semantics within files. There are a number of tools which have been developed under the Arkisto banner, though any tool which works with the standards can be considered a part of the platform.

This presentation will explain the motivation for the platform, and give several examples of its use, as an Institutional Research Data repository, for special collections of various kinds, as well as lab-level research data repositories for instrument and field-sensor data.

Research Object Crate (RO-Crate) Update

Peter Sefton1, Stian Soiland-Reyes2

1University of Technology Sydney; 2The University of Manchester

RO-Crate is a means of describing and aggregating research data. It was introduced at Open Repositories 2019 as a marriage of the Research Object standard for describing reusable data objects and Datacrate, a data packing specification.

Research data is increasingly important to the OR community and some aspects of the standard are particularly relevant to a community where metadata is a key concern: the use of linked data; as a code vocabulary; the Portland Common Data Model for representing and interchanging repository content; and having a built-in HTML view of object metadata for offline and online objects.

RO-Crate is seeing increased adoption and interest in the research data management world and is relevant to research data repositories and data discovery. This presentation will provide an update on the standard with examples on its use.
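As an illustration of the linked-data description mentioned above, the following is a minimal sketch of building an `ro-crate-metadata.json` document in Python. It assumes the RO-Crate 1.1 layout (a metadata descriptor plus a root dataset in a flat `@graph`); the function name, dataset name and file name are hypothetical, and properties the specification also expects on the root dataset, such as a license, are omitted for brevity.

```python
import json


def build_ro_crate(name, description, date_published, files):
    """Return a minimal RO-Crate 1.1 style metadata document as a dict.

    Hypothetical helper for illustration only; not part of any RO-Crate library.
    """
    # One File entity per payload file, referenced from the root dataset.
    file_entities = [{"@id": path, "@type": "File", "name": path} for path in files]
    return {
        "@context": "https://w3id.org/ro/crate/1.1/context",
        "@graph": [
            {
                # Metadata descriptor: points at the root dataset of the crate.
                "@id": "ro-crate-metadata.json",
                "@type": "CreativeWork",
                "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
                "about": {"@id": "./"},
            },
            {
                # Root dataset: describes the crate as a whole.
                "@id": "./",
                "@type": "Dataset",
                "name": name,
                "description": description,
                "datePublished": date_published,
                "hasPart": [{"@id": path} for path in files],
            },
            *file_entities,
        ],
    }


crate = build_ro_crate(
    "Example sensor readings",           # hypothetical dataset name
    "Hypothetical field-sensor dataset",  # hypothetical description
    "2021-06-09",
    ["readings.csv"],
)
print(json.dumps(crate, indent=2))
```

Because the document is plain JSON-LD, it can be stored next to the data files and rendered to an HTML view by separate tooling.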

Next Generation Repositories: Best laid plans

Tanya Holm

University of New South Wales (UNSW), Sydney, Australia

UNSW Library is migrating its institutional repository to a new platform, DSpace7. UNSW’s vision is to provide next generation infrastructure that enables UNSW research to be discoverable locally and globally. Since DSpace7 is still in development, a prototype was developed for UNSW in 2020 and evaluated against COAR’s guiding principles and design assumptions of next generation repositories. The presentation will reflect on the process of evaluating repository technology according to the COAR criteria while balancing the specific needs of the institution and users. It will also examine some of the challenges of implementation, which has brought to light the kinds of practical considerations, technical requirements and user expectations that often waylay best intentions.

A Causal Analysis of the Progress of Green Open Access

Masashi Kawai1, Koichi Ojiro1, Jun Maeda2, Masaki Nishizawa1, Kazutsuna Yamaji1

1National Institute of Informatics, Japan; 2Hokkaido University

More than 800 institutional repositories exist in Japan, but only a few institutions are active in registering journal articles. In this study, we analyzed the causal relation between the number of journal articles and librarians’ open access promotion initiatives to provide good practice guidelines to the institutional repository community. Quantitative analysis of data from 87 domestic institutions showed that “sending a request for fulltext”, a direct approach to researchers, was estimated to be particularly influential in increasing the number of journal articles. On the other hand, initiatives such as developing an “open access policy” or implementing “self-archiving” were found to be less influential. To understand the details of the causal relation, we collected additional data on “sending a request for fulltext” from 4 institutions, which revealed an average annual success rate of 36.32%. Furthermore, the institution implementing it most effectively had an average annual success rate of 55.82%, reaching a peak of 73.20%.

Date: Thursday, 10/June/2021
12:00am - 12:55amDeveloper track session 2

Deploying a Serverless Application as a Docker Image

Terrence W Brady

California Digital Library, United States of America

A serverless application consists of a defined runtime (such as Ruby 2.7 or Python 3) and a package of code. This presentation will illustrate 3 types of code packages of different complexity and will describe the challenges of bundling more complicated dependencies such as binary files. In December 2020, the AWS Lambda serverless architecture began to support Docker images as a deployment package. This packaging approach makes it possible to run tests against a code package before it is deployed. This presentation will demonstrate the test and deployment process for a serverless application within the Merritt digital preservation system. Other repository teams hosting their applications on AWS may discover some useful patterns for testing and deploying serverless code to AWS Lambda.
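To make the idea of testing a code package before deployment concrete, here is a minimal sketch in Python. It is not code from the Merritt system: the handler body, the event field `name` and the check function are assumptions for illustration. It only relies on the `handler(event, context)` entry-point shape that AWS Lambda uses for Python functions.

```python
import json


def handler(event, context):
    """Entry point in the shape AWS Lambda expects for Python functions."""
    # "name" is a hypothetical event field used only for this illustration.
    name = (event or {}).get("name", "world")
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"hello {name}"}),
    }


def run_local_check():
    """Call the handler directly, as a test step could do before deployment."""
    response = handler({"name": "merritt"}, None)
    assert response["statusCode"] == 200
    assert json.loads(response["body"]) == {"message": "hello merritt"}
    return response


if __name__ == "__main__":
    print(run_local_check())
```

The same check could be run inside the built container image, so the package is exercised in the environment it will be deployed with.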

Live Demo: Create a Developer Workspace Challenge

Hardy Joseph Pottinger

California Digital Library, University of California, Office of the President, United States of America

Creating a developer workspace, a place in which you can write and test your code, can seem a daunting task. Sure, someone might have done it before, and they might even share their notes with you, but following those notes and arriving in the same place is like following a treasure map and expecting to get rich. There’s a better way, and I’m willing to prove it. This demo aims to show the process one goes through to craft a development environment with pretty much any tool, but will use Lando to conserve time. Lando is a way to bootstrap a useable development environment. It's built on Docker, and is like Docker-Compose, but much more transparent. It handles the boring details and helps you actually start developing with code in your favorite IDE, and all the services you need in Docker. It's all about making the life of a developer easier. It won't give you an application container you can deploy, but you will develop a deeper understanding of how all the pieces fit together, and you'll have a tool which can deploy those pieces to dev/stage/prod if you wish. The audience will participate in this demo.

Building Scalable Serverless Digital Repositories using Amplify Open-source Framework

Yinlin Chen, Tingting Jiang, Lee Hunter

Virginia Tech, United States of America

We develop digital repositories to disseminate different types of digital content and to promote the principles of open access. Our goal is not to provide access to information to just one individual institution or group but to anyone and everyone. With that ambitious goal in mind, how do we enable our digital repositories to have the high availability and flexible scaling capabilities necessary to face unforeseen demand? How do we utilize only the resources we need without the waste of overprovisioning resources? These are some of the most challenging issues that we face.

To achieve our goals, we moved from the traditional monolithic, server-based approach to serverless cloud infrastructure. By leveraging the services that AWS provides we have been able to boost the performance of our repositories 10x compared to that of our previous implementation that was hosted in-house. Our repositories now automatically scale up and down to meet any kind of traffic demands, without our intervention.

In this talk, we will demonstrate our scalable serverless digital repositories and show the opportunities to explore and reflect on the ways that repositories enable openness for all.

Constructing a Repository Test Strategy Built on Docker Containers

Terrence W Brady

California Digital Library, United States of America

The Merritt Digital Preservation system comprises a dozen microservices and supporting services. Our team found that it was not cost effective to maintain and patch a fleet of servers to support a development environment. Our solution was to replace our development server environment with a stack of Dockerized services. Once the stack was containerized, we discovered that we were able to create 3 variants of our development stack with different persistence strategies for database content and cloud storage. With these variants, the team has been able to support a variety of testing scenarios.

12:00am - 12:55amPresentations session 8

Repositories, are you ready to ROR?

Maria Gould2, Liz Krznarich1, Daniella Lowenberg2

1DataCite; 2California Digital Library

Repositories have been strong adopters of open persistent identifiers (PIDs), and, in many cases, they have come to rely on PIDs as an integral part of their systems. While open PIDs for content and people have existed for some time, no open identifier for research organizations has existed...until now! With the launch of the Research Organization Registry (ROR), it’s now possible to unambiguously connect organizations to content and people using an open, community-led organization registry. This talk will provide an introduction to ROR and its services, explore an example implementation in the Dryad Digital Repository, and discuss future opportunities/needs for ROR adoption among repositories.
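As a small illustration of handling organization identifiers in a repository, the sketch below normalizes and checks the shape of a ROR ID in Python. It assumes the pattern described in ROR's documentation (a leading 0, six characters from a restricted alphanumeric set, and two trailing digits); it does not verify the checksum or look the identifier up in the registry, and the function name is hypothetical.

```python
import re

# Assumed ROR ID shape: "0" + 6 characters (no i, l, o, u) + 2 digits,
# optionally prefixed with the ror.org domain and a URL scheme.
ROR_PATTERN = re.compile(
    r"^(?:https?://)?(?:ror\.org/)?(0[a-hj-km-np-tv-z0-9]{6}[0-9]{2})$"
)


def extract_ror_id(value):
    """Return the bare ROR ID if the value has the expected shape, else None."""
    match = ROR_PATTERN.match(value.strip().lower())
    return match.group(1) if match else None


print(extract_ror_id("https://ror.org/03yrm5c26"))
print(extract_ror_id("not-a-ror-id"))
```

A repository could store the bare identifier and render the full `https://ror.org/` URL at display time, which keeps affiliation metadata consistent across records.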

Next Generation Library Publishing project: Integrating open-source IR and publishing solutions

Catherine Mitchell1, Katherine Skinner2, Kristen Ratan3

1California Digital Library, University of California; 2Educopia Institute; 3Stratos

Commercial publishers, platform-builders, and service providers derive enormous profits within the scholarly communication industry through aggressive and opaque business practices that are often at odds with the values that drive academic research and scholarship. Academy-led and campus-based alternatives exist, including a growing range of open access library publishers, but they need more robust, flexible, and interoperable tools and workflows to provide competitive scholarly publishing services to editors and scholars. The library publishing community seeks open-source, community-governed solutions and a modular architecture that can mix existing and new technologies.

Through the Next Generation Library Publishing project (2019-22, funded by Arcadia Fund), Educopia, California Digital Library (CDL) and Strategies for Open Science (Stratos), in collaboration with COAR, LYRASIS, and Longleaf Services, aim to advance the role of the institution in scholarly communication via the following deliverables:

- Targeted technology projects to fill gaps and share data between existing open-source platforms (DSpace, OJS, and Janeway)

- A values & principles framework for the evaluation of vendors and technology partners

- A catalog of open-source tools and platforms available for scholarly publishing

- Mission-aligned service providers to host and manage this open infrastructure for library publishers and IR managers

Join us to learn more about this project!

Dataset Search: An open source tool to support data discovery, reuse, and analytics

Sara Mannheimer, Jason A. Clark, James Espeland, Jakob Schultz, Rhonda Borland, Kyle Hagerman, Daniel Laden

Montana State University, United States of America

A number of recent projects focus on indexing research data repositories, using various strategies. Montana State University (MSU) Library aims to bring together the ideas of these existing projects, as well as some innovations, to encourage discovery and reuse of datasets from MSU researchers. MSU librarians gave a presentation at OR2018 in which we walked through an early prototype of the MSU Dataset Search tool. This presentation will update the Open Repositories community on the completed tool, launched in January 2020. The goals of the tool are: (1) promote discovery for datasets from MSU; (2) present MSU research data in one central index; (3) support discovery for restricted data; and (4) showcase research data as a legitimate research product that can be cited, reused, shared, and analyzed—including data dashboards that provide new transparency and visualizations of research data in our community. We will also present a technical discussion of how the software works and design decisions that we considered. We will finish the session by discussing lessons learned, Search Engine Optimization efforts, the potential for this tool to be implemented at other small and mid-sized institutions, and national data discovery initiatives.

8:00am - 8:55amPresentations session 9

Plan S and Repository PIDs

Adam Vials Moore1, Sally Rumsey1,2, James Kerwin3, Martin Wolf3

1Jisc, United Kingdom; 2cOAlition S; 3University of Liverpool

To embed the wider importance of building a connected infrastructure for open science, the technical requirements for a repository to be compliant with Plan S include each funded deposited work having its own persistent identifier. For a Version of Record this will typically be supplied by the publisher (most often in the form of a DOI). In this presentation we will discuss the technical and policy considerations of how to add persistent identifiers to Author Accepted Manuscripts. This ensures that they are also fully available for discovery and citation, and can act as first-order participants in the open research ecosystem.

Experience in Moving Toward An Open Repository For All

Tyng-Ruey Chuang1, Cheng-Jen Lee1, Chia-Hsun Wang1, Yu-Huang Wang2

1Academia Sinica, Taiwan; 2Independent Scholar

We report on our experience in building a domain-specific research data repository and in moving it toward an open repository for all to deposit datasets.

Plan S and Open Access via repositories: Unity in diversity

Johan Rooryck1, Sally Rumsey2

1Leiden University, cOAlition S; 2Jisc, cOAlition S

cOAlition S comprises research funding organizations that have agreed to adopt the set of principles that form Plan S. As autonomous entities, when aligning their policies to Plan S, each may differ in detail of how it is implemented. Each funder is at liberty to express a preference for one route to Open Access over another, and individual funders may have differing preferences for deposit in a repository.

cOAlition S encourages a range of solutions towards Open Access that encourage innovation, allow for geographical, financial, & cultural preferences & differences, and that can influence the broader open access scholarly landscape. cOAlition S encourages diversity in business models to achieve OA, recognizes journal and repository routes to OA, and accepts community norms rather than specifying technical standards for repositories. It has also taken steps to ensure researcher choice of publication venue by adopting its Rights Retention Strategy, enabling a means to publish in any journal using the repository option (thereby encouraging diversity of journals), whilst meeting funder OA requirements.

This presentation will illuminate how diversity in OA solutions, supported and encouraged by cOAlition S, is brought together under a single unified aim – immediate OA with an open licence.

8:00am - 8:55amPresentations session 10

Results from the OpenAIRE Call for Innovation: Enrich local data via the OpenAIRE Graph

Andrea Bollini1, Susanna Mornati1, Giuseppe Digilio1, Jordan Piščanc2, Michele Artini3, Claudio Atzori3

14Science, Italy; 2University of Trieste, Italy; 3ISTI-CNR, Italy

In the context of the OpenAIRE Open Innovation programme, 4Science proposed the project named “Enrich local data via the OpenAIRE Graph” to develop two data products for the repository community, and more specifically for the DSpace and DSpace-CRIS platforms. The solutions were built on top of the OpenAIRE Research Graph and the OpenAIRE Broker Service to enrich the data locally available and identify new data of interest, reducing the friction of deposit.

The Open Innovation programme selects innovative projects in the field of Open Science to develop products and services linked to scholarly works, repositories, data management, OpenAIRE infrastructure and OpenAIRE services.

The 4Science proposal was awarded in 2020 and went through a careful process of design, validation, prototyping and finalization of the solution, in continuous dialogue with the OpenAIRE team and with feedback gathered from the repository community. In Feb 2021 the project was completed after a pilot phase with the University of Trieste, and the implemented solution was released as free open-source software.

This presentation aims to show the developed solution and discuss how collaboration between large initiatives such as OpenAIRE and repository platforms can enrich both.

ORCID and OpenAIRE Compliance for DSpace

Courtney Earl Matthews1, William Roy1, Andrea Bollini2, L. Andrea Pascarelli2, Susanna Mornati2, Kathleen Shearer3, Pierre Lasou4

1Queen’s University Library, Canada; 24Science, Italy; 3COAR; 4Université Laval

Members of the Canadian Association of Research Libraries’ Open Repositories Working Group (CARL-ORWG) have a common goal of making research outcomes generated at their universities openly available to the global knowledge commons. In 2018 a subset of CARL-ORWG led by Queen’s University pooled their resources and hired 4Science to develop code to make aggregation from current DSpace versions (5 & 6) into OpenAIRE possible. In discussions with 4Science it was proposed and decided that this development work would include a patch for adding ORCIDs to the required OAI-PMH feed. This presentation will provide background on this completed work, including the principles and goals for open research shared by CARL members and 4Science and the details of the ORCID patch. DSpace is the most popular open source repository platform in the world, and this implementation will benefit the vast global community using the latest versions of DSpace, besides providing guidance and inspiration to other communities.
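For readers unfamiliar with the harvesting side, the following minimal sketch shows how an aggregator might form an OAI-PMH `ListRecords` request in Python. The verb and the `metadataPrefix`, `set` and `from` arguments come from the OAI-PMH protocol; the base URL and the set name `openaire` are assumptions for illustration, and the sketch does not reproduce the patch described in the talk.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None, from_date=None):
    """Build an OAI-PMH ListRecords request URL (no network access is made)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec      # selective harvesting by set
    if from_date:
        params["from"] = from_date    # incremental harvesting by datestamp
    return f"{base_url}?{urlencode(params)}"


# Hypothetical repository endpoint and set name.
url = list_records_url("https://repository.example.org/oai/request", set_spec="openaire")
print(url)
```

An aggregator would then fetch this URL, follow any resumption tokens in the response, and read identifiers such as ORCID iDs from the returned metadata records.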

How to ensure “good” data? Quality assurance by research data repositories

Maxi Kindling, Dorothea Strecker, Vivien Petras, Yi Wang

Humboldt-Universität zu Berlin, Berlin School of Library and Information Science, Germany

Data quality is of high interest to all stakeholder groups in research data sharing. Nevertheless, data quality assurance by research data repositories is a little explored field. This contribution examines the role of research data repositories in data quality assurance based on a mixed-methods study carried out by the project re3data COREF. The main focus is on the presentation of the results of a survey among repository operators and a framework for data quality assurance by research data repositories.

9:00am - 9:55am24x7 session 4

Ready-made or tailor-made? Seeking seamless depositing solutions for multi (LMIC) country qualitative data

Moni Choudhury1, Hani Salim1,2, Hana Mahmood3, Dhiraj Agarwal4, Tathagata Bhattacharjee4,5, John Norrie1, Sanjay Juvekar4

1University of Edinburgh, United Kingdom; 2Universiti Putra Malaysia, Malaysia; 3Neoventive Solutions, Pakistan; 4KEMHRC, Pune, India; 5LSHTM, United Kingdom

The NIHR-RESPIRE collaboration spans four South Asian low- and middle-income countries (LMICs), Bangladesh, India, Malaysia and Pakistan, and is hosted by the University of Edinburgh. One of the deliverables of RESPIRE is to deposit and share research data in an open-source repository. We explored whether all RESPIRE data could be deposited into one open repository. Our methodology was organic but included: retrospective review of all RESPIRE projects’ proposals and/or protocols; remote and face-to-face discussions with the RESPIRE research project teams about data management and ready-made data depositing solutions, including repositories at the University of Edinburgh. We piloted both quantitative and qualitative data submissions into a preferred open repository: Edinburgh DataShare. Quantitative data is relatively straightforward to de-identify and can be made available in open repositories, but qualitative data is more challenging. The data from a RESPIRE PhD project highlighted that raw qualitative data could not be deposited openly due to the sensitive nature of the data and the current lack of guidance on de-identification of raw data such as images. Other raw data include original audio recordings, verbatim and translated transcripts. Similarly, discussions at the RESPIRE annual scientific meeting highlighted that qualitative data presented this challenge across all RESPIRE partners.

Interoperability as a service

Tamsin Margaret Burland, John Kaye, Paul Stokes, Howard Williams

Jisc, United Kingdom

As the use of digital research systems technology continues to grow and evolve, Research Organisations are increasingly struggling with issues of systems incompatibility and manual re-keying of information. As part of its work to build an integrated repository and digital preservation service, Jisc has built an open interoperability framework based on an open and extensible data model and open API. This framework already supports integrated workflows among a number of repositories, commercial CRISs and digital preservation services. In this talk, we will discuss how this framework could be offered as a service to institutions to provide workflow integration with other research systems, including those used to manage research grants and contracts, ethics, research impact and web content.

Would auto-translation of metadata enhance discovery and impact of research data?

Paul L. S. Stokes, Tamsin Burland, John Kaye, Howard Williams

Jisc, United Kingdom

It’s widely accepted that good quality metadata is a significant factor in discovery and reuse of digital objects. Because of the way data standards and infrastructure have evolved, much of the metadata currently in circulation is in English and/or based upon standards formulated in English. This reduces discovery, impact (and potentially reuse) in areas of the world where English is not widely spoken. The converse is also true when it comes to the impact of non-English digital objects and metadata on the English-speaking world. There are many auto-translation tools available that lend themselves fairly well to the translation of words and short passages of text, such as keywords/phrases and abstracts. This presentation explores the potential of (and difficulties associated with) incorporating such tools in deposit and preservation workflows.
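One way such a tool could slot into a deposit workflow is sketched below in Python. Everything here is hypothetical: the record structure, the `add_translation` step and the stub translator stand in for whatever metadata schema and translation service a repository actually uses. The sketch keeps the original metadata untouched and flags translated values as machine generated.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class MetadataRecord:
    """Hypothetical deposit record with language-keyed translations."""
    language: str
    title: str
    keywords: List[str]
    translations: Dict[str, dict] = field(default_factory=dict)


def add_translation(record: MetadataRecord, target_language: str,
                    translate: Callable[[str, str, str], str]) -> MetadataRecord:
    """Attach a translated copy of short metadata fields to the record."""
    record.translations[target_language] = {
        "title": translate(record.title, record.language, target_language),
        "keywords": [translate(k, record.language, target_language)
                     for k in record.keywords],
        # Flag the values so curators and users can tell them from originals.
        "machine_translated": True,
    }
    return record


def stub_translate(text: str, source: str, target: str) -> str:
    """Placeholder for a real translation service call."""
    return f"[{source}->{target}] {text}"


record = MetadataRecord("en", "Air quality data", ["asthma", "pollution"])
add_translation(record, "es", stub_translate)
print(record.translations["es"])
```

Passing the translator in as a function keeps the workflow independent of any particular translation tool, so the tool can be swapped or disabled per collection.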

Lessons learnt in setting up a "one person repository"

Ravi Murugesan

Auroville, India

Librarians in the developing world often work in resource-constrained contexts without substantial IT support, yet they may be tasked with implementing institutional repositories. Mature open source applications do exist for this purpose - notably EPrints and DSpace - so the issue is not about developing a repository from scratch. Instead it is about selecting a suitable open source application, understanding what costs are involved, carrying out (or supervising) the installation, and of course, depositing items. In 2019, I began to do this work for my institution in south India. I used EPrints to set up a repository that is hosted on a competitively priced cloud server (on a plan that costs less than 4 US Dollars per month). I carried out the installation by myself and I am also the only person in charge of adding items to the repository. My experience leads me to believe that a one person repository (inspired by the concept of one person library) is indeed possible on a small scale. I hope to convince librarians that they can describe and minimize the IT support and funds they need if they are looking to get started with an institutional repository.

Pavia Digital Library: enhancing and supporting interoperability within the Cultural Heritage Domain with DSpace-GLAM

Gabriele Rossini2, Paolo Nassi2, Roberto Canevari2, Massimo Aurelio2, Claudio Cortese1, Emilia Groppo1, Andrea Bollini1, Riccardo Fazio1, Matteo Perelli1, Francesco Pio Scognamiglio1

14Science, Italy; 2Università degli Studi di Pavia, Italy

The digital cultural heritage of the University of Pavia is characterized by its size and variety: archival materials, ancient books, museum objects and documents related to the history and activity of the University make up one of the most important bodies of heritage for the study of modern and contemporary Italian literature and history.

Until now, these materials could not be explored in an integrated way within a Digital Library. The differences in the data models and metadata standards adopted made it impossible to guarantee interoperability between the different cultural resources. These problems have now been overcome through the use of DSpace-GLAM, an extension of DSpace specifically structured for cultural heritage management. Starting from the Pavia University case study, the presentation will illustrate how, after mapping different data structures onto the flexible DSpace-GLAM data model, it is possible not only to navigate through the pages of the various documents, but also to study the historical and geographical context of the digital objects, exploring people, events and places related to them, thereby moving the application from a Digital Library to a Digital Humanities Platform.

Breaking language barriers

Bram Luyten

Atmire, Belgium

Early localization support has been a factor in the global uptake of the DSpace repository platform. As great as it is to see regional communities really make DSpace their own, it makes it very clear that support for specific localization use cases still needs to be added to the core DSpace platform.

This presentation will highlight localization improvements and multi-language support, both in the core DSpace 7 platform, as well as in Atmire's Open Repository implementation of DSpace 7.

A National PID Landscape and Beyond

Adam Vials Moore1, Monica Duke1, Balviar Notay1, Christopher Brown1, Alice Meadows2, Josh Brown2

1Jisc, United Kingdom; 2MoreBrains Cooperative

The information landscape for infrastructure that captures and exposes scholarly communications and the associated individuals, organisations and connected entities has developed over the last several years. A set of persistent identifiers (PIDs) allows participants and their interactions and connections to be consistently captured and passed around within the infrastructure. In this presentation we look at five priority persistent identifiers that are important for bringing about a connected web of open and accessible scholarly information to enable high-quality science, and at a national community of practice we are facilitating around this area: the Research Identifier National Coordinating Committee (RINCC).

9:00am - 9:55amPresentations session 11

DSpace-CRIS 7 is here!

Andrea Bollini, Mykhaylo Boychuk, Pasquale Cavallo, Claudio Cortese, Giuseppe Digilio, Riccardo Fazio, Damiano Fiorenza, Luca Giamminonni, Corrado Lombardi, Alessandro Martelli, Luigi Andrea Pascarelli, Matteo Perelli, Francesco Pio Scognamiglio, Susanna Mornati

4Science, Italy

Open Repositories 2021 will mark the momentum of the DSpace and DSpace-CRIS 7 release. This presentation will demonstrate the key features of DSpace-CRIS, explaining how it differs from a plain DSpace in providing a full-fledged open source CRIS system promoting open science.

Indeed, beyond allowing data to be recorded about any kind of entity and their relations, DSpace-CRIS focuses on end-to-end functionality to support business processes and expectations such as:

- the self management of researcher profiles with a full bidirectional ORCID integration;

- project, funding and awards management providing the ability to track and organize them over time adding more details and information when needed;

- simplified administration, improved data quality and reduced administrative burden;

- extended interoperability supporting the richest exchange formats, such as OpenAIRE CERIF.

DSpace-CRIS 7 is a production-grade, ready-to-use CRIS solution under real usage and data load, with early adopters also at national scale.

DSpace-CRIS 7 is not just a revamp of the features already available in previous versions: taking the opportunities offered by the new modern architecture, it provides completely new features and integrations.

The migration path from previous versions will be discussed.

DSpace 7 - Enhanced Submission & Workflow

Andrea Bollini, Giuseppe Digilio, Claudio Cortese

4Science, Italy

After more than 4 years of development, thousands of commits and hours spent developing the new version, DSpace 7 is finally close to being released and is expected to be presented at Open Repositories.

One of the core components [1] of DSpace is the submission and workflow process that was largely redesigned and improved in the new version.

The presentation will provide a deep dive into the new Enhanced Submission and Workflow features of DSpace 7, including how to configure, customize & use them and how they differ from DSpace 6 and below. It will show how to take advantage of the offered flexibility, with examples from customizations and extensions developed by early adopter projects.

This presentation is a renewed version of a similar presentation held at OR2019 [2], updated to reflect the final status of the features for DSpace 7 and to include details about the transition from the Bibliographic Transformation Engine (BTE) [3] used in previous DSpace versions to an updated Live Import Framework [4], as well as inspirational examples from running projects.

DSpace 7 - Configurable Entities

Lieven Droogmans, Ben Bosman

Atmire, Belgium

DSpace 7 has been extended with the possibility of “Configurable Entities” in response to a growing need for describing more types of objects and relations between objects as well as compound objects. Examples include authors, projects, datasets, grants, monographs and lecture series.

The new Configurable Entities feature will be presented, along with new concepts in DSpace 7 such as relations between items and virtual metadata.

Defining an entity model through configuration is made possible without using Java classes for the specific entities. To achieve this, the concept starts from the current DSpace Item object and extends it, allowing institutions to keep using DSpace with standard items. The entities in a custom entity model are items that can be typed, and relations between items of different types can be created. Several entity models can be defined and can exist alongside one another in one repository.

Finally, this talk will briefly touch on the next steps for future versions of DSpace.

This presentation is an update to the presentation “DSpace 7 - The Power of Configurable Entities” presented at OR2019 as the work has now been finalized and the features announced at OR2019 have now been implemented and can be demonstrated.
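The idea of entities as typed items connected by configured relations can be sketched conceptually as follows. This is a Python illustration, not DSpace's Java API or configuration format: the class names, the relation name and the type labels are all hypothetical, and the model is held in memory only.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Item:
    """A plain item; entity_type stays None for standard, untyped items."""
    item_id: str
    metadata: Dict[str, str]
    entity_type: Optional[str] = None


@dataclass(frozen=True)
class RelationType:
    """A configured relation between two entity types."""
    name: str
    left_type: str
    right_type: str


@dataclass
class EntityModel:
    relation_types: Dict[str, RelationType] = field(default_factory=dict)
    relations: List[Tuple[str, str, str]] = field(default_factory=list)

    def define(self, relation_type: RelationType) -> None:
        self.relation_types[relation_type.name] = relation_type

    def relate(self, name: str, left: Item, right: Item) -> None:
        """Link two items, rejecting pairs whose types do not fit the relation."""
        relation_type = self.relation_types[name]
        expected = (relation_type.left_type, relation_type.right_type)
        if (left.entity_type, right.entity_type) != expected:
            raise ValueError(f"{name} expects {expected}")
        self.relations.append((name, left.item_id, right.item_id))

    def related_ids(self, item: Item, name: str) -> List[str]:
        return [r for (n, l, r) in self.relations if n == name and l == item.item_id]


model = EntityModel()
model.define(RelationType("isAuthorOfPublication", "Person", "Publication"))
person = Item("p1", {"name": "Example Author"}, "Person")
publication = Item("pub1", {"title": "Example Article"}, "Publication")
model.relate("isAuthorOfPublication", person, publication)
print(model.related_ids(person, "isAuthorOfPublication"))
```

Because the entity types and relation types are data rather than classes, a different entity model can be defined by changing what is passed to `define`, which mirrors the configuration-driven approach the abstract describes.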

12:00pm - 12:55pmNetworking session
1:00pm - 1:55pmKeynote: Dr Bianca Amaro
2:00pm - 2:55pmPanel: Dataverse Community & CoreTrustSeal: Certifying generalist data repositories

Katherine Mika1, Sonia Barbosa1, Philipp Conzett2, Robert R. Downs3, Jonathan Crabtree4, Ceilyn Boyd1, Merce Crosas1

1Harvard University; 2UiT The Arctic University of Norway; 3Center for International Earth Science Information Network (CIESIN), Columbia University; 4Odum Institute, University of North Carolina Chapel Hill

CoreTrustSeal Certification is an important tool that helps researchers and practitioners evaluate the trustworthiness of a dataset and a data repository, yet its certification model can be challenging for generalist repositories to meet. This panel will discuss strategies for meeting certification standards for Trustworthy Data Repositories (TDR) across a variety of repositories with different methods of appraisal, curation, preservation, and organization. The audience will be encouraged to add to the discussion and invited to provide feedback on the concepts presented by the panelists.

2:00pm - 2:55pm Presentations session 12

Open for all. Reusable for whom? A review of what data reusers want and how data repositories can deliver

Lisa R Johnston1, Ixchel M Faniel2, Katie Wissel3

1University of Minnesota, United States of America; 2OCLC Research, United States of America; 3New York University

Understanding how data reusers seek and evaluate potential data for reuse will aid data curators, data managers, and developers in the open repository field. We will review past studies of data reusers, specifically a qualitative study of 105 researchers from three disciplinary communities: quantitative social science, archaeology, and zoology. The study identified 12 types of context information that data reusers mention needing when deciding whether to reuse data. Next, we will use the context types to create a feature set and assess how data repositories provide the needed context information to users. Finally, using findings from our assessment, we will showcase desirable features in use to prototype the design of a reuser-oriented data repository that developers can use to improve their data repository interface.

Open for all but for how long? Roles, responsibilities and accountabilities in the preservation of research data

Amy Currie, William Kilbride

Digital Preservation Coalition, United Kingdom

In the open science community, digital preservation aligns closely with the FAIR principles and is delivered, albeit unevenly, through infrastructures comprising technology (i.e., repositories), know-how (i.e., staff) and ‘know why’ (such as policy).

In line with the conference sub-theme of ‘Supporting open scholarship and cultural heritage’, this presentation will explore and describe the strengths and weaknesses of the open science community at the outset of the EOSC Association. It draws from the recent ‘FAIR Forever’ study, commissioned by the EOSC Sustainability Working Group and funded by the EOSC Secretariat to establish the strengths and weaknesses of digital preservation capability in open science in Europe.

The FAIR Forever study involved three stages of research: a desk-based assessment of the EOSC vision, interviews with representatives of EOSC stakeholders, and focus groups comprised of digital preservation specialists and data managers in research repositories.

This presentation will explore and describe the study’s key findings relating to digital preservation capacity within EOSC and for the research community more broadly. The presenters will make recommendations for coordinated actions by repositories, researchers, and stakeholders to better ensure the long-term preservation of—and access to—research data in all its forms.

The Entity-Relation Metamodel from Repositories to Aggregators - The case of LA Referencia and RCAAP jointship Project

José Carvalho1, Lautaro Matas2, Washington Segundo3, Paulo Graça4, Paulo Lopes4

1University of Minho, Portugal; 2LA Referencia; 3IBICT; 4FCT|FCCN

This proposal demonstrates the work developed at the harvester level to incorporate the concept of entities coming from repositories. To achieve this, technical interoperability measures and guidelines have been implemented that also guarantee the coexistence of repositories with and without entities, as well as of different types of aggregation processes and different metadata profiles. The objective of these developments is to support a complete representation of the repository data model at the harvester level and to provide value-added services for all harvested content based on an open infrastructure.

3:00pm - 4:00pm Closing & Ideas challenge

Conference: OR2021