Conference Agenda

Digital Infrastructures
Friday, 20/Mar/2020:
11:20am - 12:50pm

Session Chair: Veronika Laippala
Location: Hall B

Short Paper (10+5min)

CLARIN in Latvia: from the Preparatory Phase to the Construction Phase and Operation

Inguna Skadina, Ilze Auziņa, Normunds Grūzītis, Arturs Znotiņš

The Institute of Mathematics and Computer Science, University of Latvia, Latvia

High-quality language resources and tools for natural language processing are key elements for research in digital humanities (DH). Several research infrastructures, e.g., CLARIN and DARIAH, provide access to digital research objects for researchers across Europe. Although these are pan-European research infrastructures, the availability of content and the readiness of a particular node vary from country to country. This paper presents the current status of the CLARIN research infrastructure in Latvia: the key language resources and tools identified, the development of the technical infrastructure, collaboration with DH researchers, and initiatives on user involvement and education. Although it was an active participant in the CLARIN initiative during its preparatory phase, Latvia joined CLARIN ERIC only four years after its establishment. This four-year gap puts Latvia’s node in a construction phase, while in many countries CLARIN is already operational. Although many Latvian language resources and tools are not yet included in the CLARIN repository, researchers in Latvia can already benefit from the language resources and tools of other CLARIN ERIC members through single sign-on.

Short Paper (10+5min)

Linked Open Data Infrastructure for Digital Humanities in Finland

Eero Hyvönen

University of Helsinki (HELDIG) and Aalto University, Finland

This paper presents an overview of "Linked Open Data Infrastructure for Digital Humanities in Finland (LODI4DH)", a joint initiative of Aalto University, Department of Computer Science, and University of Helsinki (UH), HELDIG Centre for Digital Humanities, for creating a centralized national data infrastructure and Linked Data services for open science. The services enable publication and utilization of datasets for data-intensive Digital Humanities (DH) research in structured, standardized formats via open interfaces. LODI4DH is based on a large national collaboration network and on software created during a long line of national DH projects between UH and Aalto since 2002. These projects produced several in-use infrastructure prototypes, such as the ONKI and Finto ontology service now at the National Library of Finland, the Linked Data Finland platform, and the "Sampo" series of semantic portals that test and demonstrate the usability of the approach. Thus far, these systems have had millions of end-users on the Web, suggesting a high potential for utilizing the technology and Linked Data infrastructure.

Long Paper (20+10min)

Evaluating a DH Tool; the First 18 Months of the Gale Digital Scholar Lab and the Future of Academic/Corporate Partnerships

Christopher Michael Houghton

Gale, A Cengage Company, US

This paper will discuss lessons learned in the first 18 months after the release of Gale Digital Scholar Lab, a ground-breaking tool designed to make digital scholarship methods more accessible and vastly reduce the time needed to run digital humanities projects. By taking a global view of the users of the Lab, this paper will illustrate regional trends in use and highlight the key lessons from researchers and academics around the world.

Since 2011, Gale has been working with academics globally to provide access to the OCR and metadata of its world-famous digital archives, including ECCO (Eighteenth Century Collections Online) and the Times Digital Archive.

Since 2014, following the decision to make this data available more formally on drives, Gale has kept in touch with many of the researchers in receipt of this data to understand their projects and, ideally, the challenges they face in using this data in digital humanities projects.

As a result of this research, Gale identified three common challenges faced by researchers around the world when conducting digital humanities projects. Firstly, the time taken to bring together a significant corpus, clean it and prepare it for analysis often stretched to many months and proved prohibitive for many researchers. Secondly, hosting data was an expensive and labour-intensive process, requiring significant institutional infrastructure that proved to be an obstacle for many. Finally, learning the coding languages necessary to create analytical tools was a challenge for many, especially when considered in the framework of the undergraduate classroom.

Subsequently, Gale began building a tool to meet and, ideally, mitigate these challenges. Creating a tool that would be as useful to researchers in Beijing as to those in Birmingham proved to be a significant undertaking, and took over four years of development, including one year of active development, at a cost of $2 million.

In September 2018, we released Gale Digital Scholar Lab, a cloud-hosted text and data mining environment that brings up to 166 million pages (to date) of Gale’s leading digital archives together with powerful text mining and natural language processing tools. With the aim of drastically reducing the time needed to construct a research corpus, clean large sets of data, customise and run analyses and teach sophisticated digital scholarship methods, Gale Digital Scholar Lab proved to be an extremely popular product.

The launch of the Lab proved a significant evolution in Gale’s relationship with academia, as we found ourselves more frequently partnering with academics on projects related to digital humanities. One area of common focus involved collaborating on pedagogies and working together to construct curricula to widen the teaching of digital humanities, with a specific focus on the undergraduate classroom. Increasingly, institutions around the world looked to Gale to assist them in using the Gale Digital Scholar Lab to teach digital methods to humanities students. As we did not wish to insert ourselves unnecessarily into the academic process, this proved a great opportunity to collaborate with leading institutions on methods of using the Lab, as part of a suite of tools and techniques, to spread digital humanities methods throughout the HSS department. To this end, Gale began employing academics to collaborate with institutions on creating curricula and teaching.

Alongside this, there proved to be significant and frequent opportunities to partner with academics to create open tools that could be adapted for inclusion into the Lab. This not only allowed us to support valuable research, but also to ensure that tools were created that allowed all users of Gale digital archives to make discoveries and explore them in new and potentially interesting ways.

Working in these new, collaborative ways with academics proved to be both stimulating and challenging for those of us at Gale. It has been particularly noteworthy that the rise in Gale data being used in digital humanities has caused us to ask questions about the OCR, metadata, structure, provenance and framing of archives. There is no question that digital humanities has challenged the way in which we present archival material and has changed the way we think about putting archives together and presenting them for research.

This paper will break down the first 18 months’ usage of the Lab globally, highlighting regional trends and tendencies. Allied to this, the paper will discuss the most common requests for future development and explain Gale’s ongoing commitment to evolving the Lab to meet the needs of the global DH community by presenting the development roadmap. By discussing the various partnerships and collaborations, we will show Gale’s commitment to growing and amplifying digital humanities research and supporting the values of openness, breaking down barriers and furthering the cause of humanities and social science research.

Short Paper (10+5min)

Studying Transnational Digital Spaces: Methodological Vistas and Challenges

Anastasia A. Ivanova

Saint Petersburg University, Saint Petersburg, Russian Federation

This paper presents and compares two ongoing studies of transnational digital spaces that apply partly similar, partly different research methods to the analysis of different empirical phenomena: transnational migration and a transhumanist network. We outline the research questions and research tools of each study and then present reflections on the vistas and challenges of doing social science research online.