Dr Martin Schweinberger

Researcher biography
Martin Schweinberger uses big data and computational methods to explore the messy, fascinating reality of how people actually talk—including all the swear words, filler words, and informal expressions that traditional language education overlooks. As a Lecturer in Applied Linguistics at the University of Queensland, he bridges the gap between computer science and linguistics to understand how language evolves in our digital age.
Uncovering Hidden Language Patterns
Much of Martin's research focuses on the language phenomena that schools don't teach but that permeate everyday conversation. He analyzes massive datasets to study vulgarity and swearing patterns, as well as discourse markers—those ubiquitous filler words like "like," "you know," "well," and "I mean" that pepper our speech. By applying statistical methods to real-world language use, he reveals how these supposedly "incorrect" forms of expression actually follow sophisticated social and linguistic rules.
His work also tracks how language changes over time and varies between different social settings, using computational tools to identify patterns that would be impossible to detect through traditional research methods alone.
Building Australia's Language Data Future
Martin is Director of the Language Technology and Data Analysis Laboratory (LADAL), a free upskilling platform for language data science with hundreds of thousands of users worldwide. He is also a key figure in the Language Data Commons of Australia (LDaCA), one of Australia's major research infrastructure projects, where he is helping build the digital infrastructure that will support language research across the country. LDaCA has received substantial funding to create accessible tools and resources that allow researchers to analyze text and speech data more effectively.
Championing Research Transparency
Beyond his linguistic research, Martin advocates for reproducibility and transparency in humanities and social science research. He provides guidance on how language researchers can adopt more rigorous, open research practices—addressing a growing concern about the reliability of academic findings across disciplines.
Martin's international visibility is reflected in his leadership roles: he serves as Vice-President Professional of the International Society for the Linguistics of English (ISLE) and sits on the board of the International Computer Archive of Modern and Medieval English (ICAME), one of the oldest and most reputable societies for corpus linguistics. These positions demonstrate his commitment to advancing computational language research on a global scale.
Potential topics for supervision
I would be particularly interested in supervising theses on the following topics:
Sociolinguistics / Language Variation and Change / World Englishes
- General extenders
- Terms of address and salutations
- Discourse particles and markers
- Vulgarity
- Adjective amplification
Learner Language / Applied Linguistics / Corpus Phonetics / Learner Corpus Research
- Vowel production among L1 speakers and learners of English
- Voice onset time among L1 speakers and learners of English
- Fluency and pauses in learner and L1 speech
- Accent and intelligibility / comprehension
Text Analytics / Digital Humanities / Corpus Linguistics
- Applications of word embeddings in the language sciences
- Comparison of different association / keyness measures
Featured projects | Duration |
---|---|
Language Data Commons of Australia (LDaCA) and Australian Text Analytics Platform (ATAP) ARDC Co-investment Project | 2021–2028 |