Dirk Geeraerts is a professor emeritus of linguistics at the University of Leuven, Belgium. His main research area involves the overlapping fields of lexical semantics, lexicology, and lexicography, with a theoretical focus on cognitive linguistics and a descriptive focus on lexical variation and change. A prominent member of the first generation of cognitive linguists, he played an instrumental role in the international expansion of cognitive linguistics, as the founder of the journal Cognitive Linguistics and as the editor (with Hubert Cuyckens) of the Oxford Handbook of Cognitive Linguistics. From 1995 to 2005, he was editor-in-chief, with T. den Boon, of the Van Dale Groot Woordenboek van de Nederlandse Taal. His publications include the following monographs:
1985 Paradigm and Paradox. Leuven: Leuven University Press
1994 (with S. Grondelaers and P. Bakema) The Structure of Lexical Variation. Berlin: Mouton de Gruyter
1997 Diachronic Prototype Semantics. Oxford: OUP
2000 (with S. Grondelaers and D. Speelman) Convergentie en divergentie in de Nederlandse woordenschat. Amsterdam: Meertens
2006 Words and Other Wonders. Papers on Lexical and Semantic Topics. Berlin: Mouton de Gruyter
2010 Theories of Lexical Semantics. Oxford: OUP
2017 Conceptual Structure and Conceptual Variation. Shanghai: Foreign Language Education Press
2018 Ten Lectures on Cognitive Sociolinguistics. Leiden: Brill
2024 (with D. Speelman, K. Heylen, M. Montes, S. De Pascale, K. Franco and M. Lang) Lexical Variation and Change. A Distributional Semantic Approach. Oxford: OUP
Lexicography and theories of lexical semantics (and the other way round)
In an attempt to provide some background for the overall theme of the Euralex 21 conference, the talk explores the dialectic relationship between lexical-semantic theory and lexicographical practice, with specific attention to the plurality of theories of word meaning. The talk takes a historical perspective. As a first step, for each of the major theoretical approaches to word meaning that have successively emerged in the course of the last 150 years (historical-philological semantics, structuralist and neostructuralist semantics, and cognitive semantics), the talk considers what its contribution to lexicography has been. Specifically, the various types of inspiration provided by cognitive linguistics (like prototypicality, frame semantics, metaphor research, and construction grammar) will be highlighted. As a second step, reversing the perspective, the talk looks at the stimuli that went out from dictionary projects to semantic theory. In particular, it will be emphasized that lexicography gave a major impetus to the development of corpus linguistics, which itself was instrumental in the emergence of usage-based theories of meaning and language. As a third step, the talk analyzes how these two lines of development – from lexical theory to lexicography, and from lexicography to lexical theory – are woven together in the current situation, with specific attention to distributional semantics and corresponding vector-based methods.
Tony Veale is the outgoing chair of the International Association for Computational Creativity, and the author of several monographs on the topic of creative language generation, including Exploding The Creativity Myth: The Computational Foundations of Linguistic Creativity (Bloomsbury, 2012), Twitterbots: Making Machines That Make Meaning (with Mike Cook; MIT Press, 2017) and Your Wit Is My Command: Building AIs with a Sense of Humor (MIT Press, 2021). He has researched the crossover between AI and language for three decades in academia and in industry. He teaches Computational Creativity and Generative AI as an associate professor in UCD’s school of Computer Science.
YOU TALK FUNNY! ONE DAY ME TALK FUNNY TOO: Investigating the capacity of Large Language Models for Humour
Recent developments in the scaling and training of large language models (LLMs) have led to a dramatic change in how the public views Artificial Intelligence. No longer the vaguely aspirational preserve of science fiction stories, AI is now expected to work, and not just in the laboratory but in a wide range of consumer products. Yet as AI outperforms people on tasks that were once considered yardsticks of human intelligence, one area of human experience still holds out, for now at least: our very human sense of humour. This is not for want of trying, as this talk will show. There is good reason for computer science to take humour seriously. By building computer systems with a sense of humour, capable of appreciating the jokes of human users or even of generating jokes of their own, we can turn academic theories into practical realities that amuse, explain, provoke, and delight. The writer Clive James once pronounced that one should not trust anyone lacking a sense of humour, even, indeed, to post a letter, for what is humour but our sense of equanimity and poise in the face of the unpredictable when common sense has been pushed to the brink? My talk will describe where researchers are on this road to more humorous machines, and explore how we might go further towards giving LLMs a robustly human funny bone. The talk will also cover related issues such as acceptability and value alignment in LLMs, since humour often pushes the bounds of what is socially acceptable in polite company.
Kory Stamper has been a lexicographer for twenty-six years. Her career began at Merriam-Webster, where she was trained in lexicography by E. Ward Gilman, and continued at Cambridge Dictionaries and Dictionary.com, where she is currently the Senior Editor of Lexicography leading a team of lexicographers and thesaurists. She has written dictionaries and thesauruses for native speakers of English and English-language learners, and specializes in the analysis and reorganization of lexical data inside dictionaries and thesauruses with the goal of presenting salient lexicographical information more easily to a wide variety of users. Her writing on language and lexicography for general audiences has appeared in The New York Times, the Guardian, the Washington Post, and the Times Literary Supplement, and in her best-selling book Word By Word: The Secret Life of Dictionaries. She is the current president of the Dictionary Society of North America, and a member of the Language Council of the Miami Nation of Indiana, where she practices relational lexicography in the revitalization and recording of the Eastern Myaamia dialect.
Case Studies in the Successes and Limits of Frame Semantics in Practical Lexicography
Practical, commercial lexicography in the United States, in particular, is a field that relies heavily on tradition, and it has been loath to abandon the tried-and-true methods of corpus creation, analysis, and defining that have been established since the time of Murray. Yet frame semantics has provided a broader lens through which the practical lexicographer can view meaning, and its integration (though slow) into the practice of lexicography has yielded defining methods that are more user-oriented and that give the lexicographer tools to move beyond their own unconscious or implicit biases–something that is increasingly important in successful modern lexicography. But technological and social changes in the last several decades–the ease with which mis- and disinformation moves into the mainstream, the rise of generative AI and the regular presentation of generated text as natural language, the proliferation of varieties of English accessible to the lexicographer that are sometimes themselves removed from context, and the changing ways in which online dictionaries are used–have presented difficulties to the practical lexicographer who seeks to integrate frame semantics deeper into their practice. This paper will present case studies on the successes of integrating frame semantics into lexicographical practice, and the current challenges that lexicographers face when the “frame” itself is illusory, shifting, or debated.
Tiago Timponi Torrent is a Cognitive Linguist working on Multimodal Natural Language Processing within the framework of Frame Semantics and Construction Grammar. He is the head of the FrameNet Brasil Computational Linguistics Lab, PI of ReINVenTA – Research and Innovation Network for Video and Text Analysis of Multimodal Objects – and Professor of the Graduate Program in Linguistics at the Federal University of Juiz de Fora, Brazil. He is a Research Productivity Grantee of the Brazilian National Council for Scientific and Technological Development (CNPq), and winner of the 2021 edition of the Technology in Linguistic Research Award of the Brazilian Linguistics Association (ABRALIN). Tiago Torrent served as a Guest Professor at the Department of Swedish, Multilingualism and Language Technology at the University of Gothenburg. He is one of the co-authors of Copilots for Linguists: AI, Constructions, and Frames.
Possible Futures for Semantic Lexical Resources in the Age of Artificial Intelligence
Large Language Models (LLMs) have been dominating the discussion fora on language technology for at least the past seven years. As much as LLMs have spurred progress in NLP, recent research has shown that their performance seems to reach a limit which cannot be overcome with more training data. Therefore, hybrid approaches combining LLMs and Language Resources have been gaining momentum. In this talk I explore possible futures for research in semantic lexical resources in combination with LLMs and AI techniques. As examples of possible research paths, I discuss the application of the FrameNet model to the development of a tool for identifying territories prone to suffer from gender-based violence, as well as to the growing field of multimodal NLP.
Lana Hudeček is a Senior Research Fellow (equivalent to Full Professor) at the Institute for the Croatian Language. She serves as the principal investigator and head of the Croatian Web dictionary – Mrežnik project. This project was initially funded by the Croatian Science Foundation from 2017 to 2021, later became an internal project of the Institute for the Croatian Language from 2021 to 2023, and in 2024, received funding from the European Union within the NextGenerationEU program. Between 2007 and 2011, she led the Croatian Normative Desk Dictionary project (Hrvatski normativni jednosveščani rječnik), funded by the Ministry of Science and Education. This project culminated in the creation of two dictionaries: The First School Dictionary of the Croatian Language (Prvi školski rječnik hrvatskoga jezika), published in 2008, and The School Dictionary of the Croatian Language (Školski rječnik hrvatskoga jezika), published in 2012. Both are also available online at rjecnik.hr. Her primary areas of interest include lexicography, terminology, standard language, and language planning. She has collaborated on numerous international and Croatian projects and is the author or co-author of many monographs and papers. Her contributions have been recognized through six awards received in collaboration with co-authors.
Croatian Web Dictionary – MREŽNIK
From 2017 until 2021, the Croatian Web Dictionary – Mrežnik was a project of the Croatian Science Foundation; from 2022 until now, it has been an internal project of the Institute for the Croatian Language, and from 2024, it will be funded by the EU program NextGenerationEU. The project goals are to compile an e-dictionary of the Croatian language that is online, free, corpus-based, monolingual, hypertext, searchable, normative, and based on the contemporary results of e-lexicography and computational linguistics. Mrežnik consists of three modules: for adult native speakers of Croatian, schoolchildren, and non-native speakers of Croatian. It will be the central meeting point of the existing language resources of the Institute of Croatian Language and Linguistics but also of all language resources created within the project. Croatian Web Dictionary – Mrežnik is conceived as a dynamic dictionary that will be further compiled and edited even after the end of the NextGenerationEU project, as it is a long-term project of the Institute for the Croatian Language. The Mrežnik project was launched primarily because in 2016, at the time of the project application, Croatia was still one of the countries that did not have an online dictionary of their national language compiled according to the rules of contemporary e-lexicography. The need for extensive scientific research in e-lexicography was also recognized, i.e., getting to know the theory and practice of creating e-dictionaries and the possibilities that new dictionary platforms offer. Mrežnik is compiled taking into account semantic relations and the systematic nature of language. The systematic nature of the dictionary can be seen in almost all areas: accentuation of entry words, the selection and accentuation of forms in the grammatical block, the definition of words that belong to closed grammatical and semantic groups, etc.
The two essential computer tools for compiling this three-module dictionary are Sketch Engine, a corpus query system (loaded with the corpora) to support language analysis, and TLex, a dictionary writing system. Word Sketches are specially adapted to the needs of the project and are based on a Sketch Grammar developed for it. In 2022, a part of the dictionary (A – F) was exported from TLex to both the web application (https://rjecnik.hr/mreznik/) and the CLARIN European science infrastructure repository (clarin.si repository and the github.com public data management system). The presentation will focus on the corpora and wordlist(s), normative and pragmatic aspects of Mrežnik, micro- and macrostructure of Mrežnik, and the place of grammar in Mrežnik. The fact that Mrežnik is the first gamified Croatian web dictionary and the first dictionary with recorded pronunciation will be stressed. The comparison of the three modules will also be addressed, and it will be shown that the user was always at the center of all lexicographic decisions.