The aim of this CEC is to introduce open and reproducible workflows for data handling and thereby foster digital literacy skills. To this end, we will introduce OpenRefine (https://openrefine.org/), a powerful, free, open-source tool. OpenRefine lets you explore and understand large datasets and clean them, i.e. remove inconsistencies. It also lets you quickly get started with enriching a dataset with data from web services. All steps are documented, so workflows can be replicated and shared with others.

First, we will teach the basic functions of OpenRefine: data import, the layout of the user interface, faceting and filtering, and undoing and redoing steps of the workflow. Then we will introduce GREL (General Refine Expression Language), the programming language used in OpenRefine to transform data. Finally, we will introduce working with Application Programming Interfaces (APIs): how do they work, what should you pay attention to, and how can you use them from within OpenRefine? The example APIs are the CrossRef API and the E-utilities API provided by the NCBI, namely ESearch and ESummary. The example data are bibliographic records of scientific articles, which we will enrich: we will show how to fetch metadata such as the journal title based on a DOI, the PubMed ID based on a DOI, and the Publication Types recorded in PubMed based on a PubMed ID.
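To give a rough idea of what the three lookups involve outside of OpenRefine, the following Python sketch builds the request URLs and pulls the relevant fields out of the JSON responses. It is an illustration under assumptions, not the course material: the function names are hypothetical, the `[AID]` (article identifier) field tag is one possible way to search PubMed by DOI, and the JSON field names (`container-title`, `idlist`, `pubtype`) reflect our reading of the public API responses and should be checked against the current documentation.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen

CROSSREF_WORKS = "https://api.crossref.org/works/"
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/"


def crossref_url(doi):
    """URL of the CrossRef record for a DOI."""
    # quote() leaves "/" unescaped by default, so the DOI stays readable.
    return CROSSREF_WORKS + quote(doi)


def esearch_url(doi):
    """ESearch URL that looks up a DOI in PubMed.

    Assumption: the DOI is searched with the [AID] field tag.
    """
    params = {"db": "pubmed", "term": doi + "[AID]", "retmode": "json"}
    return EUTILS_BASE + "esearch.fcgi?" + urlencode(params)


def esummary_url(pmid):
    """ESummary URL for a single PubMed ID."""
    params = {"db": "pubmed", "id": pmid, "retmode": "json"}
    return EUTILS_BASE + "esummary.fcgi?" + urlencode(params)


def journal_title(crossref_response):
    """First journal title in a parsed CrossRef response, or None."""
    titles = crossref_response.get("message", {}).get("container-title", [])
    return titles[0] if titles else None


def pubmed_id(esearch_response):
    """PubMed ID from a parsed ESearch response.

    Returns None unless exactly one record matched, so that an
    ambiguous search is never silently taken as a match.
    """
    ids = esearch_response.get("esearchresult", {}).get("idlist", [])
    return ids[0] if len(ids) == 1 else None


def publication_types(esummary_response, pmid):
    """Publication Types listed for a PubMed ID in a parsed ESummary response."""
    record = esummary_response.get("result", {}).get(str(pmid), {})
    return record.get("pubtype", [])


def fetch_json(url):
    """Fetch a URL and parse the body as JSON (performs a network request)."""
    with urlopen(url, timeout=30) as response:
        return json.load(response)
```

A lookup would then chain the pieces, for example `journal_title(fetch_json(crossref_url(doi)))`. When calling real services, pay attention to their usage policies: NCBI, for instance, asks clients without an API key to stay at or below three requests per second, so a pause between requests is advisable when processing many rows.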
ID: 202 / CEC3: 1
Continuing Education (CEC) Session
Introduction to data handling and fetching data from web services with OpenRefine
1 University Library, University of Augsburg, Germany; 2 University Library, University of Regensburg, Germany
Biography and Bibliography
Evamaria is subject librarian for medicine and head of the medical library at Augsburg University Library, Germany. She is part of the library’s research data management and teaching teams. Since 2018, she has organized and taught different Library Carpentry workshops.
Michaela has been a librarian at the medical library at Augsburg University Library, Germany, since 2018. Her focus is on information literacy.
Helge is subject and liaison librarian for medicine at the University Library of Regensburg, Germany. He enjoys teaching and has a special interest in systematic searching, systematic reviews and other forms of evidence synthesis, as well as research methodology. He is an EAHIL council member for Germany.