Ricardo’s PhD Project


NLP for Second Language Learning as a Case Study for Bias and Fairness in AI

Here you can find a bit more information about my PhD topic and things related to it, including previous seminars and the papers that I plan to include in my thesis. For more information about my project, feel free to check out my website.

A Brief Blurb

From the abstract of my halfway seminar:

Algorithmic accountability has been an ever-expanding field in AI for the last decade. Two heavily interlinked topics in this field are bias and fairness. The former studies how (unwanted) social biases are reflected in machine learning systems and how they affect their behavior, while the latter studies how these biases affect the downstream applications of such systems.

Despite all of the potential interactions with other subfields of NLP, most bias and fairness research is still done in relative isolation (rather than focusing on how it interacts with these other fields) and within a relatively limited set of notions of bias.

The first part of this talk will focus on how we can explore semantic and grammatical representations of biases in language models. Later on, I will explain how I have been exploring NLP for second language learning as a case study of how bias can be examined within a specific field of NLP. I will then talk about other forays I have made throughout my PhD into different areas of algorithmic accountability and the connections they have with my current topic. Finally, the talk will close with some of the future directions in which I would like to take my research.

Seminars and Presentations

  • Final Seminar
    • Date: February 23rd, 2026 (planned)
    • Opponent: Beáta Megyesi
  • Halfway Seminar
    • Title: “From Algorithms to Classrooms: NLP for Second Language Learning as a Case Study for Bias and Fairness in AI”
    • Date: November 18th, 2024
    • You can find the slides for the presentation here
  • Idea Seminar
    • Title: “Using the Flow of Information to Detect False News”
    • Date: January 23rd, 2023
    • You can find the slides for the presentation here

Papers Included in the Thesis

  • Tom Södahl Bladsjö, Ricardo Muñoz Sánchez. “Introducing MARB — A Dataset for Studying the Social Dimensions of Reporting Bias in Language Models”. 6th Workshop on Gender Bias in Natural Language Processing, co-located with ACL 2025. (link)

  • Ricardo Muñoz Sánchez, Simon Dobnik, Elena Volodina. “Harnessing GPT to Study Second Language Learner Essays: Can We Use Perplexity to Determine Linguistic Competence?”. BEA 2024 Workshop, co-located with NAACL 2024. (link)

  • Ricardo Muñoz Sánchez, David Alfter, Simon Dobnik, Maria Irena Szawerna, Elena Volodina. “Jingle BERT, Jingle BERT, Frozen All the Way: Freezing Layers to Identify CEFR Levels of Second Language Learners Using BERT”. NLP4CALL 2024. (link)

  • Ricardo Muñoz Sánchez, Simon Dobnik, Maria Irena Szawerna, Therese Lindström Tiedemann, Elena Volodina. “Did the Names I Used within My Essay Affect My Score? Diagnosing Name Biases in Automated Essay Scoring”. CALD-Pseudo Workshop, co-located with EACL 2024. (link)

  • Upcoming paper on the effects of L1 on CEFR classification (in progress)