General track 5: Research data repositories
Wednesday, 28/Jun/2017:
3:30pm - 5:00pm

Session Chair: David Minor, UC San Diego
Location: Ballroom C
Hilton Brisbane

3:30pm - 4:00pm

An Open Repository Solution for UK Research Data Management Shared Service

Nevelina Aleksandrova, Richard Jones

Cottage Labs, United Kingdom

Since the introduction of strongly worded research data mandates in the UK, Jisc has driven a process of assembling a core group of organisations which can offer all or part of a Research Data Shared Service (RDSS) at a national level. This presentation is from the perspective of one of those organisations. Cottage Labs is a software development company and service provider in open access, and has entered an open source repository platform (Hydra/Fedora) into the RDSS framework, where it will be developed and integrated with other key scholarly infrastructure systems (such as aggregated discovery services). This presents a great opportunity for Hydra to gain traction in Europe, and also for the community to benefit from both the software and the data modelling work that will happen in the coming two years. The presentation will discuss the background of the shared service, its motivations and how it came to exist, and look at the work under way to provide such an ambitious piece of national infrastructure.

4:00pm - 4:30pm

The Canadian Federated Research Data Repository: A Modestly Large Collaboration

Alexander Garnett (1), Todd Trann (2)

(1) Simon Fraser University, Canada; (2) University of Saskatchewan, Canada

While open repository platforms have continued to move forward, there are still notable gaps in cross-repository and cross-institution discovery, as well as in handling terabyte-scale datasets that are common in the natural sciences. The Canadian Federated Research Data Repository (FRDR) is a collaboration between Canadian university libraries, the national high-performance computing provider *Compute Canada*, and the commercial cloud storage vendor *Globus* that addresses these gaps. With participating university partners in Canada, we have developed a new metadata harvester, a new discovery interface, and a new repository backend, and we are developing a service model for a sustainable national platform. We have also done preliminary work to integrate Archivematica into our platform, to automate digital preservation processing on ingested content. Although this project differs from many open repository platforms in that it is not designed to be deployed locally by libraries and contains some closed-source components, we present it as an example of scaling our services to a national level.

4:30pm - 5:00pm

Launching a researcher-focused data repository at Caltech using the Invenio 3 platform

Thomas Morrell

California Institute of Technology, United States of America

The vast majority of digital data associated with scientific research are not accessible online. While there are many challenges associated with making research data openly accessible, one significant challenge is the usability and long-term availability of storage services. Open institutional repositories have the potential to support data preservation and sharing of valuable raw and processed data from local research efforts. However, research data are inherently heterogeneous and require researcher involvement to accurately describe the nature of the deposited data files. We used researcher-focused design principles to develop a data repository on the Invenio 3 platform with TIND. These principles included automating the deposit process as much as possible, employing standard metadata to support discoverability and future applications, and providing API access so the repository can power other visualization and analysis services. The repository includes DOI minting to support data citation, ORCID identifiers to facilitate credit attribution, and GitHub integration to encourage software archiving. The newly launched repository captures research data that might otherwise be lost due to poor storage and organization practices, and enables researchers, the library, and the Caltech Archives to develop tools and preservation strategies around this valuable resource.