Open Repositories 2026
Online | 8 - 11 June 2026
Conference Agenda
Session
Developer Track: DSpace 2 (Automated Metadata and Full Text Population)
Presentations
Fill The Gap – Automated retrieval of full text from emerging open APIs
University of Galway, Ireland; Atmire

Populating an open repository with high-quality, consistent metadata is a substantial task. The challenge becomes even harder when records also need to be enriched with the corresponding full text at scale, particularly when authors are not involved in deposit workflows.

In 2025, the University of Galway and Atmire developed and deployed a DOI-driven workflow to enrich metadata-only repository records with open full text links. The tool queries multiple open services using the DOI, selects the most credible full text candidate, and records both provenance and outcomes to support review and reporting. In production, this approach identified and attached thousands of full text PDFs with minimal manual intervention, while surfacing cases that require follow-up due to redirects, inconsistent landing pages, or unclear licensing signals.

The implementation is designed to be extensible, with additional sources and local policy rules added as needed. The session will demonstrate the Google Apps Script and Google Sheets version, describe key design trade-offs (accuracy, coverage, validation, and rate limiting), and share an approach that other repository teams can adapt to their own infrastructure. Currently supported sources include OpenAIRE, Unpaywall, CORE, and OpenAlex.

Reducing Barriers: Automating Metadata Extraction in Submission Forms for DSpace Repositories
KEEP Solutions, Portugal

As digital repositories evolve at the intersection of people, practice, and emerging technologies, the burden of manual metadata entry remains a significant barrier to the timely dissemination of open research. This paper presents a novel integration for the DSpace platform designed to streamline the submission process through automated metadata extraction.
The proposed functionality leverages an external API powered by Artificial Intelligence (AI) to analyze uploaded documents in real time. By identifying and mapping key bibliographic data directly from the file content, the system automatically populates submission forms, reducing human error and cognitive load for depositors.

Central to this development are two critical considerations: interoperability and privacy. The architecture utilizes a flexible API framework that allows the repository to request services from various external providers, ensuring the system remains adaptable to future technological shifts. Furthermore, the integration is built with a "privacy-by-design" approach, ensuring that sensitive file data is handled securely during the AI analysis phase.

By automating the "practice" of data entry, this feature moves us closer to an "Open to All" ecosystem where researchers can focus on dissemination rather than administration, ultimately fostering a more efficient and inclusive repository environment.
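
As a rough illustration of the idea in this abstract, the sketch below shows one way a provider-agnostic extraction step could pre-fill a submission form. It is not the authors' implementation: the `MetadataExtractor` interface, the `FIELD_MAP` table, the provider field names, the `prefill_form` helper, and the rule that depositor-entered values are never overwritten are all assumptions made for this example. Only the target field names (`dc.title`, `dc.contributor.author`, `dc.date.issued`, `dc.description.abstract`) come from the standard DSpace metadata registry.

```python
from typing import Protocol


class MetadataExtractor(Protocol):
    """Hypothetical interface: any external provider that turns file bytes
    into a flat dict of bibliographic fields can be plugged in."""

    def extract(self, file_bytes: bytes) -> dict[str, object]: ...


# Illustrative mapping from (assumed) provider field names to DSpace fields.
FIELD_MAP = {
    "title": "dc.title",
    "authors": "dc.contributor.author",
    "publication_date": "dc.date.issued",
    "abstract": "dc.description.abstract",
}


def prefill_form(
    extractor: MetadataExtractor,
    file_bytes: bytes,
    existing: dict[str, list[str]],
) -> dict[str, list[str]]:
    """Return a copy of the form with empty fields filled from the extractor.

    Assumption: values already entered by the depositor always win.
    """
    form = {field: list(values) for field, values in existing.items()}
    extracted = extractor.extract(file_bytes)
    for source_key, dspace_field in FIELD_MAP.items():
        value = extracted.get(source_key)
        if not value:
            continue  # provider returned nothing usable for this field
        if form.get(dspace_field):
            continue  # keep what the depositor typed
        values = value if isinstance(value, list) else [value]
        form[dspace_field] = [str(v) for v in values]
    return form


class StubExtractor:
    """Stand-in for a real provider so the example runs offline."""

    def extract(self, file_bytes: bytes) -> dict[str, object]:
        return {
            "title": "A Sample Paper",
            "authors": ["Doe, Jane", "Roe, Richard"],
            "publication_date": "2025-03-01",
            "keywords": ["unmapped", "ignored"],
        }


form = prefill_form(
    StubExtractor(),
    b"%PDF-1.7 placeholder bytes",
    {"dc.title": ["Depositor's own title"]},
)
```

Because the form logic depends only on the `extract` method, swapping providers would mean supplying a different extractor object. Fields the provider returns but `FIELD_MAP` does not list (here, `keywords`) are simply ignored.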