Research and Publications
Here you can find my papers and other academia-related projects I’ve been a part of. You can see the talks and presentations I’ve given here.
First Author Papers
-
Ricardo Muñoz Sánchez, David Alfter, Simon Dobnik, Maria Irena Szawerna, Elena Volodina. “Jingle BERT, Jingle BERT, Frozen All the Way: Freezing Layers to Identify CEFR Levels of Second Language Learners Using BERT”. NLP4CALL 2024. (link, slides)
-
Ricardo Muñoz Sánchez, Simon Dobnik, Elena Volodina. “Harnessing GPT to Study Second Language Learner Essays: Can We Use Perplexity to Determine Linguistic Competence?”. BEA 2024 Workshop, co-located with NAACL 2024. (link, slides, and poster)
-
Ricardo Muñoz Sánchez. “When Hieroglyphs Meet Technology: A Linguistic Journey through Ancient Egypt Using Natural Language Processing”. LT4HALA 2024 Workshop, co-located with LREC-COLING 2024. (link, slides)
-
Ricardo Muñoz Sánchez, Simon Dobnik, Maria Irena Szawerna, Therese Lindström Tiedemann, Elena Volodina. “Did the Names I Used within My Essay Affect My Score? Diagnosing Name Biases in Automated Essay Scoring”. CALD-Pseudo Workshop, co-located with EACL 2024. (link, slides)
-
Ricardo Muñoz Sánchez*, Eric Johansson*, Shakila Tayefeh*, Shreyash Kad*. “A First Attempt at Unreliable News Detection in Swedish”. Rest-UP 2 Workshop, co-located with LREC 2022. (link, slides)
-
Seraphina Goldfarb-Tarrant*, Rebecca Marchant*, Ricardo Muñoz Sánchez*, Mugdha Pandya*, Adam Lopez. “Intrinsic Bias Metrics Do Not Correlate with Application Bias”. ACL-IJCNLP 2021. (link)
* equal contributions
Other Papers
-
Arianna Masciolini, Andrew Caines, Orphée De Clercq, Joni Kruijsbergen, Murathan Kurfalı, Ricardo Muñoz Sánchez, Elena Volodina, Robert Östling. “The MultiGEC-2025 Shared Task on Multilingual Grammatical Error Correction at NLP4CALL”. NLP4CALL 2025, co-located with NoDaLiDa/Baltic-HLT 2025. (link, shared task website)
-
Maria Irena Szawerna, Simon Dobnik, Ricardo Muñoz Sánchez, Xuan-Son Vu, Elena Volodina. “The Devil’s in the Details: the Detailedness of Classes Influences Personal Information Detection and Labeling”. NoDaLiDa/Baltic-HLT 2025. (link)
-
Maria Irena Szawerna, Simon Dobnik, Therese Lindström Tiedemann, Ricardo Muñoz Sánchez, Xuan-Son Vu, Elena Volodina. “Pseudonymization Categories across Domain Boundaries”. LREC-COLING 2024. (link)
-
Maria Irena Szawerna, Simon Dobnik, Ricardo Muñoz Sánchez, Therese Lindström Tiedemann, Elena Volodina. “Detecting Personal Identifiable Information in Swedish Learner Essays”. CALD-Pseudo Workshop, co-located with EACL 2024. (link)
-
Dimitrios Kokkinakis, Ricardo Muñoz Sánchez, Mia-Marie Hammarlin. “Scaling-up the Resources for a Freely Available Swedish VADER (svVADER)”. NoDaLiDa 2023. (link)
-
Dimitrios Kokkinakis, Ricardo Muñoz Sánchez, Sebastianus Bruinsma, Mia-Marie Hammarlin. “Investigating the Effects of MWE Identification in Structural Topic Modelling”. 19th Workshop on Multiword Expressions, co-located with EACL 2023. (link, slides)
Posters and Non-Archival Presentations
-
Maria Irena Szawerna, Simon Dobnik, Ricardo Muñoz Sánchez, Elena Volodina. “Swedish Learner Essays Revisited: Further Insights into Detecting Personal Information”. SLTC 2024. (abstract)
-
Ricardo Muñoz Sánchez, Simon Dobnik, Therese Lindström Tiedemann, Maria Irena Szawerna, Elena Volodina. “Name Biases in Automated Essay Assessment”. ICOS 28, 2024. (abstract, poster)
Other Publications
- Arianna Masciolini, Andrew Caines, Orphée De Clercq, Joni Kruijsbergen, Murathan Kurfalı, Ricardo Muñoz Sánchez, Elena Volodina, Robert Östling, Kais Allkivi, Špela Arhar Holdt, Ilze Auzin̦a, Roberts Darģis, Elena Drakonaki, Jennifer-Carmen Frey, Isidora Glišic, Pinelopi Kikilintza, Lionel Nicolas, Mariana Romanyshyn, Alexandr Rosen, Alla Rozovskaya, Kristjan Suluste, Oleksiy Syvokon, Alexandros Tantos, Despoina-Ourania Touriki, Konstantinos Tsiotskas, Eleni Tsourilla, Vassilis Varsamopoulos, Katrin Wisniewski, Aleš Žagar, Torsten Zesch. “An overview of Grammatical Error Correction for the twelve MultiGEC-2025 languages” in GU-ISS Forskningsrapporter från Institutionen för svenska, flerspråkighet och språkteknologi (2025). (link)
- A publication meant to provide some context for the MultiGEC-2025 shared task. We show that most GEC papers in the ACL Anthology focus solely on English, leaving many languages with little to no support.
- Stian Rødven-Eide, Ricardo Muñoz Sánchez. “Detecting fake papers with the latent algorithm for recursive search” in Elena Volodina, Dana Dannélls, Aleksandrs Berdicevskis, Markus Forsberg, Shafqat Virk (Eds.) LIVE and LEARN - Festschrift in honor of Lars Borin (2022). (link)
- As part of a festschrift for Lars Borin’s 65th birthday celebration, another PhD colleague and I wrote a tongue-in-cheek paper about a system that detects fake papers but flags its own paper as a fake one.
Language Resources
- Arianna Masciolini, Andrew Caines, Orphée De Clercq, Joni Kruijsbergen, Murathan Kurfalı, Ricardo Muñoz Sánchez, Elena Volodina, Robert Östling, Kais Allkivi-Metsoja, Špela Arhar Holdt, Ilze Auzin̦a, Roberts Darģis, Elena Drakonaki, Jennifer-Carmen Frey, Isidora Glišic, Pinelopi Kikilintza, Lionel Nicolas, Mariana Romanyshyn, Alexandr Rosen, Alla Rozovskaya, Kristjan Suluste, Oleksiy Syvokon, Alexandros Tantos, Despoina-Ourania Touriki, Konstantinos Tsiotskas, Eleni Tsourilla, Vassilis Varsamopoulos, Katrin Wisniewski, Aleš Žagar, Torsten Zesch. “MultiGEC” (2025). (link)
- The MultiGEC dataset is meant for Multilingual Grammatical Error Correction. It contains 12 European languages (Czech, English, Estonian, German, Greek, Icelandic, Italian, Latvian, Russian, Slovene, Swedish and Ukrainian). For more information, check the dedicated website.
Blog Posts
- Ricardo Muñoz Sánchez, Arianna Masciolini. “Språkbanken Students at LREC-COLING 2024” in the Språkbanken Text blog (June 10th, 2024). (link)
- The students from Språkbanken Text were heavily represented at LREC-COLING 2024, both on-site and on-line. Here we talk about our experiences during the conference.
- Maria Irena Szawerna, Ricardo Muñoz Sánchez. “The Lions, the Words, and the Workshops: Språkbanken Text at EACL 2024” in the Språkbanken Text blog (April 11th, 2024). (link)
- Maria and I talk about our experience at EACL 2024.
- Tanja Seppälä, Ricardo Muñoz Sánchez, Márton András Tóth. “Neighbours with different practices - Doctoral education in Finland and in Sweden in the linguistics” in Kielingua (December 10th, 2021). (link)
- A blog post in which we compare our experiences as PhD students in Finland and in Sweden.
Dissertations
-
MSc dissertation project: “Exploring the Relationship Between Intrinsic and Extrinsic Bias Metrics in Spanish Word Embeddings” (2020). Advisors: Seraphina Goldfarb-Tarrant and Adam Lopez. (link)
-
BSc dissertation project: “Maximal Chains of Isomorphic Substructures of the Random Graph” (2018). Advisor: David Meza Alcántara. (link)
Proceedings Editorial Team
-
Ricardo Muñoz Sánchez, David Alfter, Elena Volodina, Jelena Kallas (Eds.). “Proceedings of the 14th Workshop on Natural Language Processing for Computer Assisted Language Learning”. University of Tartu Library. Workshop Co-Located with NoDaLiDa/Baltic-HLT 2025. Tallinn, Estonia. (link)
-
Elena Volodina, David Alfter, Simon Dobnik, Therese Lindström Tiedemann, Ricardo Muñoz Sánchez, Maria Irena Szawerna, Xuan-Son Vu (Eds.). “Proceedings of the Workshop on Computational Approaches to Language Data Pseudonymization (CALD-pseudo 2024)”. Association for Computational Linguistics. Workshop Co-Located with EACL 2024. St. Julian’s, Malta. (link)
Master’s Thesis Supervision
- Tom Södahl. “Don’t Mention the Norm” (2024). Advisors: Ricardo Muñoz Sánchez and Elena Volodina. (link)
Co-Organizing
-
NLP4CALL 2025 Workshop, co-located with NoDaLiDa/Baltic-HLT 2025. Main organizer. (link)
-
MultiGEC-2025 Shared Task, co-located with NLP4CALL 2025 and NoDaLiDa/Baltic-HLT 2025. Co-organizer. (link)
-
Privacy and AI: Towards a Trustworthy Ecosystem (AITrust) Workshop, co-located with WASP-HS 2024. Organizing co-chair. (link)
-
CALD-Pseudo Workshop, co-located with EACL 2024. Organizing co-chair. (link)
-
Workshop on ethics for research and teaching in natural language processing (2024). Local organizer. (link)
-
Open House at Mormor Karl’s (2024). Co-organizer. (link)
-
Kickoff for Grandma Karl’s (2023). Co-organizer. (link)
Volunteering
-
NAACL 2024 (link)
-
LREC-COLING 2024 (link)
-
Huminfra Conference 2024 (link)
-
ACL-IJCNLP 2021, online (link)
-
NAACL 2021, online (link)
Reviewing
I have reviewed for the following venues:
- RESOURCEFUL Workshop: 2025
- NLP4CALL Workshop: 2024, 2025
- NoDaLiDa/Baltic-HLT: 2025
- BEA Workshop: 2024
- CALD-Pseudo Workshop: 2024
- EMNLP: 2023, 2024
- RaPID Workshop: 2022, 2024