Conference Agenda

Session Overview
24x7 presentations 1: Research and non-publications repositories, Open Science
Wednesday, 28/Jun/2017:
11:00am - 12:30pm

Session Chair: Natasha Simons, Griffith University
Location: Queen's Ballroom
Hilton Brisbane


OpenAIRE-Connect: Open Science as a Service for repositories and research communities

Paolo Manghi1, Pedro Principe2, Anthony Ross-Hellauer3, Natalia Manola4

1CNR-ISTI, Italy; 2University of Minho, Portugal; 3University of Göttingen, Germany; 4University of Athens, Greece

OpenAIRE-Connect fosters transparent evaluation of results and facilitates reproducibility of science for research communities by enabling a scientific communication ecosystem that supports the exchange of artefacts, packages of artefacts, and links between them across communities and across content providers. To this end, OpenAIRE-Connect will introduce and implement the concept of Open Science as a Service (OSaaS) on top of the existing OpenAIRE infrastructure, by delivering out-of-the-box, on-demand deployable tools in support of Open Science. OpenAIRE-Connect will realize and operate two OSaaS services. The first will enable research communities to (i) publish research artefacts (packages and links), and (ii) monitor their research impact. The second will engage and mobilize content providers, and offer them services enabling notification-based exchange of research artefacts, to support their transition towards Open Science paradigms. Both services will be delivered on demand according to the OSaaS approach and will hence be reusable by different disciplines and providers, each with different practices and maturity levels, so as to favor a shift towards a uniform cross-community and cross-content-provider scientific communication ecosystem.

RDM skills training at the University of Oslo

Elin Stangeland

University of Oslo, Norway

In this presentation I will talk about the skills development program for research data management (RDM) that we are developing at the University of Oslo (UiO) in Norway.

IIIF Community Activities and Open Invitation


International Image Interoperability Framework (IIIF) Consortium, United States of America

The International Image Interoperability Framework (IIIF) is a growing community of libraries, museums, software firms, and digital image repositories working to define, develop, cultivate, and document shared technologies that support interoperability for web-based image delivery. Digital images provide access to an increasing number of cultural heritage materials on the web (such as manuscripts, newspapers, books, paintings, photographs, and sheet music). However, many of these resources have been locked into bespoke websites that are often challenging to maintain and offer limited functionality for end users. IIIF provides a solution to this problem, utilizing linked data and shared application programming interfaces (APIs) to enable enhanced functionality, data portability, and sustainability for digital image repositories. As a community-driven initiative, IIIF relies on discussion and input from individuals and institutions involved in digital image repositories. This 24x7 talk will give a brief introduction to IIIF, with a focus on current community activities and ways to get involved.
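As an illustration of the shared APIs mentioned in this abstract, the IIIF Image API addresses an image through a URL of the form {identifier}/{region}/{size}/{rotation}/{quality}.{format}. The following is a minimal sketch of building such a URL; it is not taken from the talk. The helper name `iiif_image_url` and the server address `https://example.org/iiif` are assumptions made for illustration, and the size keyword `max` is the one accepted by versions 2.1 and 3.0 of the Image API.

```python
from urllib.parse import quote


def iiif_image_url(base, identifier, region="full", size="max",
                   rotation="0", quality="default", fmt="jpg"):
    """Build an IIIF Image API request URL (hypothetical helper).

    The path follows the Image API pattern:
    {base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}
    """
    # The identifier must be percent-encoded, including any slashes.
    encoded_id = quote(identifier, safe="")
    return f"{base.rstrip('/')}/{encoded_id}/{region}/{size}/{rotation}/{quality}.{fmt}"


# Whole image at the largest size the server offers
# (example.org is a placeholder, not a real IIIF server).
full_image = iiif_image_url("https://example.org/iiif", "ms 12/f1")

# A 512-pixel-wide rendering of the top-left 1000x1000 pixel region.
detail = iiif_image_url("https://example.org/iiif", "ms 12/f1",
                        region="0,0,1000,1000", size="512,")
```

Because every compliant server understands the same URL pattern, a viewer written against one repository can display images from another without custom integration, which is the portability the abstract refers to.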

Data Management and Archival Needs of the Patagonian Right Whale Program Data

Harish Maringanti, Daureen Nesdill, Victoria Rowntree

University of Utah, United States of America

Librarians at academic institutions are usually focused on current research output, which tends to be electronic in nature. However, researchers who have been working for decades often have multiple filing cabinets of legacy data in print and on slides, which are not accessible to the research community. In the biodiversity and ecosystem fields, historical information plays a critical role, and studies require the analysis of trends, adaptations, and long-term relationships. Legacy data, therefore, can be as important as current data. A significant problem is that legacy data are not yet in digital formats.

At the University of Utah, a collaborative research project between the Marriott Library and the Department of Biology resulted in the Library digitizing a subset of one such historical and unique dataset - the Patagonian Right Whale Program dataset. Aerial photographic surveys of the population have been conducted annually since 1971 and are the study's primary source data, comprising over 80,000 slides. The majority of the slides have never been digitized, and many are more than 40 years old. This project aimed to digitize 3,000 slides and investigate the archiving needs of the Patagonian Right Whale Program data. In this presentation, we will share updates on the project and discuss next steps.

Repository driven by the data journal: real practices from China Scientific Data

Lili Zhang, Jianhui Li, Yanfei Hou

Computer Network Information Center, Chinese Academy of Sciences, People's Republic of China

As a data journal, China Scientific Data publishes data papers and hosts the associated datasets in a repository for access and reuse, which is central to promoting scholarly data communication. Based on more than a year of operational practice, this analysis focuses on the essentials of the repository, especially the difficulties driven by data publication.

The lightning talk begins by outlining the whole publishing workflow and identifying the role and responsibilities of the repository within it. The data repository is mainly responsible for making datasets findable, accessible, intelligible and reusable. Within the publishing work, the publication of massive data, complex data and dynamic data flows raises difficult issues. China Scientific Data suggests that solutions may include on-demand sample publication, publication of certain data infrastructure, and data version maintenance. Post-publication tasks are also difficult, such as maintaining the long-term accessibility and reuse of datasets. A sustained operational mode is therefore required: the organizational mode chosen should keep published data available at all times, while a business model that balances repository finances calls for extending the service chain to provide further value-added products and services.

RDM and the IR: Don’t Reuse and Recycle - Reimplement

Dermot Frost1, Rebecca Grant2

1Trinity College Dublin, Ireland; 2National Library of Ireland

Research Data Management (RDM) has become the buzzword du jour in the repository world in the last few years. Following mandates from funding agencies internationally (for example the European Commission, the EPSRC (UK) and the National Science Foundation (US)), institutions are scrambling to react and provide RDM support to their researchers. While some institutions have developed specialised RDM workflows and services, others have taken the pragmatic approach of re-using their existing publications repository for RDM. In this talk we will present some of the issues associated with this approach. The FAIR Data principles aim to make data Findable, Accessible, Interoperable, and Re-usable; processes such as ingesting a zip file with some basic metadata into a traditional IR do not meet FAIR standards. This approach also has significant technical challenges, as the scale and complexity of research data are not appropriately modeled in a publications repository and could impact on the operation of the repository for the purpose for which it was originally intended.

Developing a university-wide integrated Data Management Planning system

Rebecca Deuble1, Andrew Janke1, Helen Morgan1, Nigel Ward2

1University of Queensland, Brisbane, Australia; 2Queensland Cyber Infrastructure Foundation, Brisbane, Australia

The University of Queensland (UQ) has instigated a project which aims to construct, pilot and refine an integrated Data Management Planning (iDMP) system that can auto-provision data storage following the completion of a data management plan (DMP). The Queensland Cyber Infrastructure Foundation (QCIF) has agreed to auto-provision and allocate a minimum of 1TB of storage space for each project that completes a DMP. This unmanaged storage (i.e. day-to-day working space) will be editable by any collaborator (UQ or non-UQ) who is authorized by the UQ lead chief investigator of the project. It will be accessible via a web interface (for all authorized users) and via computer desktops for authorized UQ researchers. Additionally, the iDMP system will provide workflows to allow for the migration of unmanaged data to managed datasets that can be linked to publications via the institutional repository. For the pilot testing, RHD students and their supervisors will be recruited, and their feedback will be sought to ensure that the system meets their research and discipline-specific needs. This presentation will discuss and demonstrate how this new system will work, and explain the benefits to researchers, the university and the wider community.
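The provisioning and access rules described in this abstract can be sketched roughly as follows. This is only an illustrative model under stated assumptions, not the iDMP implementation: the class `Project` and the functions `provision_storage`, `authorize` and `access_channels` are hypothetical names, and the only figures taken from the abstract are the 1TB minimum and the web/desktop access split.

```python
from dataclasses import dataclass, field

MIN_ALLOCATION_TB = 1  # minimum allocation stated in the abstract


@dataclass
class Project:
    """Hypothetical record for a research project in the iDMP system."""
    lead_ci: str                      # UQ lead chief investigator
    dmp_completed: bool = False
    storage_tb: int = 0
    # Maps each authorized collaborator to True (UQ) or False (non-UQ).
    collaborators: dict = field(default_factory=dict)


def provision_storage(project, requested_tb=0):
    """Allocate storage only once the data management plan is complete."""
    if not project.dmp_completed:
        raise ValueError("storage is provisioned only after the DMP is completed")
    project.storage_tb = max(MIN_ALLOCATION_TB, requested_tb)
    return project.storage_tb


def authorize(project, approver, user, is_uq):
    """Only the lead chief investigator may authorize collaborators."""
    if approver != project.lead_ci:
        raise PermissionError("only the lead chief investigator can authorize users")
    project.collaborators[user] = is_uq


def access_channels(project, user):
    """Web access for all authorized users; desktop access for UQ users only."""
    if user not in project.collaborators:
        return []
    return ["web", "desktop"] if project.collaborators[user] else ["web"]


# Example walk-through with made-up names.
demo = Project(lead_ci="lead", dmp_completed=True)
allocated = provision_storage(demo)
authorize(demo, "lead", "uq_student", is_uq=True)
authorize(demo, "lead", "external_partner", is_uq=False)
```

In this sketch the completed DMP acts as the gate for storage, which mirrors the incentive described above: researchers receive working space as a direct result of planning their data management.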

Extending the boundaries of Open Repositories: an integrated communication system for cultural heritage in smart cities

Christina Birdie1, Nabonita Guha2

1Xavier Institute of Management & Entrepreneurship, Electronics City Phase II, Bangalore 560100, Karnataka, India; 2Jawaharlal Nehru Centre for Advanced Scientific Research, Jakkur Post, Bangalore 560064, Karnataka, India

Smart city projects have just begun in India, with over 90 cities identified under the initiative. India, a country rich in cultural heritage, has an enormous volume of data in this domain that the newer model of cities can draw on. Hence, this paper describes an open data repository model for Indian smart cities to showcase their cultural heritage and to develop web services useful for the tourism industry.

Some use cases of relevant web services are listed. These use cases are the focal point for identifying different data sources. The major challenge in collecting data from diverse sources was the conversion and standardization of the data. Hence, the paper also proposes a uniform format for converting data from handwritten media into a digitally reusable format. The proposed repository prototype shows how data can be standardized and interlinked to develop web services useful for people living in smart cities.

Preserving and reusing high-energy-physics data analyses

Sünje Dallmeier-Tiessen, Robin Lynnette Dasler, Pamfilos Fokianos, Jiří Kunčar, Artemis Lavasa, Annemarie Mattmann, Diego Rodríguez Rodríguez, Tibor Šimko, Anna Trzcinska, Ioannis Tsanaktsidis

CERN, Switzerland

The revalidation, reuse and reinterpretation of data analyses require access to the original virtual environments, datasets and software that were used to produce the original scientific result. The CERN Analysis Preservation pilot project is developing a set of tools that support particle physics researchers in preserving the knowledge around analyses so that capturing, sharing, reusing and reinterpreting data becomes easier. In this talk, we shall focus in particular on reusing a preserved analysis. We describe a system that makes it possible to instantiate the preserved analysis workflow on the computing cloud, paving the way for researchers to revalidate and reinterpret research data even many years after the original publication.