Terttu Nevalainen - Terttu is the Director of the VARIENG Research Unit. Her research interests include historical sociolinguistics, variation studies, corpus linguistics and digital humanities. She is one of the designers and compilers of the Helsinki Corpus of English Texts and of the Corpus of Early English Correspondence.

Turo Hiltunen - Turo Hiltunen works as a senior lecturer in English at the Department of Languages, University of Helsinki, Finland. His research interests include corpus linguistics and register analysis, grammatical and phraseological variation in scientific English past and present, the compilation of specialised corpora, and democratisation in the context of parliamentary discourse.

Aatu Liimatta - Aatu Liimatta is researching register variation on social media for his PhD. His interests include register and functional variation, computational methods, and big linguistic data.

Title: Introducing the VARIENG research unit in Helsinki


Our presentation falls into three parts. It begins with Terttu’s brief introduction to the Research Unit for Variation, Contacts and Change in English (VARIENG), its past, present and future, and its current open-access resources. In addition, the talk presents two short case studies touching on the methodology of corpus linguistics from different perspectives. The first one (by Turo) discusses how the availability of massive text archives holds great promise for corpus linguistic work but may also present considerable methodological challenges for users (see e.g. Hiltunen, McVeigh, and Säily 2017). In focus here are some specific problems related to the diachronic British Library Newspapers database, and how those problems might be addressed in the context of register analysis and the study of linguistic variation. The second case study (by Aatu) looks into the role of text length in register-internal variation by analyzing a big data sample of social media comments. Different registers seem to exhibit different kinds of internal functional variation by text length (as shown by the illustration).

About The Language Technology and Data Analysis Laboratory (LADAL) Webinars 2021

The Language Technology and Data Analysis Laboratory (LADAL) is a school-based support infrastructure for computational humanities research established and maintained by the UQ School of Languages and Cultures. The LADAL is part of the ARDC Australian Text Analytics Platform (ATAP), which represents a nation-wide attempt to foster computational skills in HASS. It collaborates and shares expertise with several Australian and international centres, institutions, researchers, and experts.

The LADAL consists of a specialist computing lab for language-based computational and experimental work (the Computational and Experimental Workshop) and an online virtual lab (the LADAL website). The LADAL website offers self-guided study materials and hands-on tutorials on topics relating to digital tools, computational methods for data extraction and processing, data visualization, and statistical analyses of language data. It also provides links to further resources and short descriptions of digital tools relevant for digital HASS research. In addition, the LADAL offers face-to-face consultations and specialized workshops. SLC researchers are encouraged to contact LADAL staff for advice and guidance on matters relating to digital research tools, data visualization, various statistical procedures, and text analytics. As such, the LADAL offers pathways to new research possibilities in HASS with a focus on computational quantitative text analytics.