


19th BEA 2024: Mexico City, Mexico
- Ekaterina Kochmar, Marie Bexte, Jill Burstein, Andrea Horbach, Ronja Laarmann-Quante, Anaïs Tack, Victoria Yaneva, Zheng Yuan (eds.): Proceedings of the 19th Workshop on Innovative Use of NLP for Building Educational Applications, BEA 2024, Mexico City, Mexico, June 20, 2024. Association for Computational Linguistics 2024, ISBN 979-8-89176-100-1
- Nicy Scaria, Suma Dharani Chenna, Deepak N. Subramani: How Good are Modern LLMs in Generating Relevant and High-Quality Questions at Different Bloom's Skill Levels for Indian High School Social Science Curriculum? 1-10
- Felix Stahlberg, Shankar Kumar: Synthetic Data Generation for Low-resource Grammatical Error Correction with Tagged Corruption Models. 11-16
- Kostiantyn Omelianchuk, Andrii Liubonko, Oleksandr Skurzhanskyi, Artem N. Chernodub, Oleksandr Korniienko, Igor Samokhin: Pillars of Grammatical Error Correction: Comprehensive Inspection Of Contemporary Approaches In The Era of Large Language Models. 17-33
- Siyan Li, Teresa Shao, Julia Hirschberg, Zhou Yu: Using Adaptive Empathetic Responses for Teaching English. 34-53
- Donya Rooein, Paul Röttger, Anastassia Shaitarova, Dirk Hovy: Beyond Flesch-Kincaid: Prompt-based Metrics Improve Difficulty Classification of Educational Texts. 54-67
- Masamune Kobayashi, Masato Mita, Mamoru Komachi: Large Language Models Are State-of-the-Art Evaluator for Grammatical Error Correction. 68-77
- Alexander Kwako, Christopher Michael Ormerod: Can Language Models Guess Your Identity? Analyzing Demographic Biases in AI Essay Scoring. 78-86
- Victoria Yaneva, King Yiu Suen, Le An Ha, Janet Mee, Milton Quranda, Polina Harik: Automated Scoring of Clinical Patient Notes: Findings From the Kaggle Competition and Their Translation into Practice. 87-98
- Scott A. Crossley, Perpetual Baffour, Mihai Dascalu, Stefan Ruseti: A World CLASSE Student Summary Corpus. 99-107
- Nischal Ashok Kumar, Andrew S. Lan: Improving Socratic Question Generation using Data Augmentation and Preference Optimization. 108-118
- Marie Bexte, Andrea Horbach, Lena Schützler, Oliver Christ, Torsten Zesch: Scoring with Confidence? - Exploring High-confidence Scoring for Saving Manual Grading Effort. 119-124
- Michiel De Vrindt, Anaïs Tack, Renske Bouwer, Wim Van Den Noortgate, Marije Lesterhuis: Predicting Initial Essay Quality Scores to Increase the Efficiency of Comparative Judgment Assessments. 125-136
- Ahatsham Hayat, Bilal Khan, Mohammad Hasan: Improving Transfer Learning for Early Forecasting of Academic Performance by Contextualizing Language Models. 137-148
- Stefano Bannò, Hari Krishna Vydana, Kate M. Knill, Mark J. F. Gales: Can GPT-4 do L2 analytic assessment? 149-164
- Charles Koutcheme, Nicola Dainese, Arto Hellas: Using Program Repair as a Proxy for Language Models' Feedback Ability in Programming Education. 165-181
- Michael Ilagan, Beata Beigman Klebanov, Jamie N. Mikeska: Automated Evaluation of Teacher Encouragement of Student-to-Student Interactions in a Simulated Classroom Discussion. 182-198
- Luisa Ribeiro-Flucht, Xiaobin Chen, Detmar Meurers: Explainable AI in Language Learning: Linking Empirical Evidence and Theoretical Concepts in Proficiency and Readability Modeling of Portuguese. 199-209
- Nils-Jonathan Schaller, Yuning Ding, Andrea Horbach, Jennifer Meyer, Thorben Jansen: Fairness in Automated Essay Scoring: A Comparative Analysis of Algorithms on German Learner Essays from Secondary Education. 210-221
- Alexander Scarlatos, Wanyong Feng, Andrew S. Lan, Simon Woodhead, Digory Smith: Improving Automated Distractor Generation for Math Multiple-choice Questions with Overgenerate-and-rank. 222-231
- Kevin Stowe, Benny Longwill, Alyssa Francis, Tatsuya Aoyama, Debanjan Ghosh, Swapna Somasundaran: Identifying Fairness Issues in Automatically Generated Testing Content. 232-250
- Masato Mita, Keisuke Sakaguchi, Masato Hagiwara, Tomoya Mizumoto, Jun Suzuki, Kentaro Inui: Towards Automated Document Revision: Grammatical Error Correction, Fluency Edits, and Beyond. 251-265
- Matthew Durward, Christopher Thomson: Evaluating Vocabulary Usage in LLMs. 266-282
- Maja Stahl, Leon Biermann, Andreas Nehring, Henning Wachsmuth: Exploring LLM Prompting Strategies for Joint Essay Scoring and Feedback Generation. 283-298
- Dominik Glandorf, Detmar Meurers: Towards Fine-Grained Pedagogical Control over English Grammar Complexity in Educational Text Generation. 299-308
- Imran Chamieh, Torsten Zesch, Klaus Giebermann: LLMs in Short Answer Scoring: Limitations and Promise of Zero-Shot and Few-Shot Approaches. 309-315
- Kosuke Doi, Katsuhito Sudoh, Satoshi Nakamura: Automated Essay Scoring Using Grammatical Variety and Errors with Multi-Task Learning and Item Response Theory. 316-329
- Martha Shaka, Diego Carraro, Kenneth N. Brown: Error Tracing in Programming: A Path to Personalised Feedback. 330-342
- Ho Hung Lim, John Lee: Improving Readability Assessment with Ordinal Log-Loss. 343-350
- Benjamin Paddags, Daniel Hershcovich, Valkyrie Savage: Automated Sentence Generation for a Spaced Repetition Software. 351-364
- Tianwen Li, Zhexiong Liu, Lindsay Clare Matsumura, Elaine Wang, Diane J. Litman, Richard Correnti: Using Large Language Models to Assess Young Students' Writing Revisions. 365-380
- Santiago Berruti, Arturo Collazo, Diego Sellanes, Aiala Rosá, Luis Chiruzzo: Automatic Crossword Clues Extraction for Language Learning. 381-390
- Abigail Gurin Schleifer, Beata Beigman Klebanov, Moriah Ariely, Giora Alexandron: Anna Karenina Strikes Again: Pre-Trained LLM Embeddings May Favor High-Performing Learners. 391-402
- Dan Carpenter, Wookhee Min, Seung Y. Lee, Gamze Ozogul, Xiaoying Zheng, James C. Lester: Assessing Student Explanations with Large Language Models Using Fine-Tuning and Few-Shot Learning. 403-413
- Ricardo Muñoz Sánchez, Simon Dobnik, Elena Volodina: Harnessing GPT to Study Second Language Learner Essays: Can We Use Perplexity to Determine Linguistic Competence? 414-427
- Kevin P. Yancey, Andrew Runge, Geoffrey T. LaFlair, Phoebe Mulcaire: BERT-IRT: Accelerating Item Piloting with BERT Embeddings and Explainable IRT Models. 428-438
- Yuning Ding, Julian F. Lohmann, Nils-Jonathan Schaller, Thorben Jansen, Andrea Horbach: Transfer Learning of Argument Mining in Student Essays. 439-449
- Allison Bradford, Kenneth Steimel, Brian Riordan, Marcia C. Linn: Building Robust Content Scoring Models for Student Explanations of Social Justice Science Issues. 450-458
- Beata Beigman Klebanov, Michael Suhan, Tenaha O'Reilly, Zuowei Wang: From Miscue to Evidence of Difficulty: Analysis of Automatically Detected Miscues in Oral Reading for Feedback Potential. 459-469
- Victoria Yaneva, Kai North, Peter Baldwin, Le An Ha, Saed Rezayi, Yiyun Zhou, Sagnik Ray Choudhury, Polina Harik, Brian Clauser: Findings from the First Shared Task on Automated Prediction of Difficulty and Response Time for Multiple-Choice Questions. 470-482
- Sebastian Gombert, Lukas Menzel, Daniele Di Mitri, Hendrik Drachsler: Predicting Item Difficulty and Item Response Time with Scalar-mixed Transformer Encoder Models and Rational Network Regression Heads. 483-492
- Ana-Cristina Rogoz, Radu Tudor Ionescu: UnibucLLM: Harnessing LLMs for Automated Prediction of Item Difficulty and Response Time for Multiple-Choice Questions. 493-502
- Mariano Felice, Zeynep Duran Karaoz: The British Council submission to the BEA 2024 shared task. 503-511
- Anaïs Tack, Siem Buseyne, Changsheng Chen, Robbe D'hondt, Michiel De Vrindt, Alireza Gharahighehi, Sameh Metwaly, Felipe Kenji Nakano, Ann-Sophie Noreillie: ITEC at BEA 2024 Shared Task: Predicting Difficulty and Response Time of Medical Exam Questions with Statistical, Machine Learning, and Language Models. 512-521
- Okan Bulut, Guher Gorgun, Bin Tan: Item Difficulty and Response Time Prediction with Large Language Models: An Empirical Analysis of USMLE Items. 522-527
- Rishikesh Fulari, Jonathan Rusert: Utilizing Machine Learning to Predict Question Difficulty and Response Time for Enhanced Test Construction. 528-533
- Gummuluri Venkata Ravi Ram, Ashinee Kesanam, Anand Kumar M: Leveraging Physical and Semantic Features of text item for Difficulty and Response Time Prediction of USMLE Questions. 534-541
- George Dueñas, Sergio Jimenez, Geral Mateus Ferro: UPN-ICC at BEA 2024 Shared Task: Leveraging LLMs for Multiple-Choice Questions Difficulty Prediction. 542-550
- Mehrdad Yousefpoori-Naeim, Shayan Zargari, Zahra Hatami: Using Machine Learning to Predict Item Difficulty and Response Time in Medical Tests. 551-560
- Hariram Veeramani, Surendrabikram Thapa, Natarajan Balaji Shankar, Abeer Alwan: Large Language Model-based Pipeline for Item Difficulty and Response Time Estimation for Educational Assessments. 561-566
- Álvaro Rodrigo, Sergio Moreno-Álvarez, Anselmo Peñas: UNED team at BEA 2024 Shared Task: Testing different Input Formats for predicting Item Difficulty and Response Time in Medical Exams. 567-570
- Matthew Shardlow, Fernando Alva-Manchego, Riza Batista-Navarro, Stefan Bott, Saúl Calderón Ramírez, Rémi Cardon, Thomas François, Akio Hayakawa, Andrea Horbach, Anna Hülsing, Yusuke Ide, Joseph Marvin Imperial, Adam Nohejl, Kai North, Laura Occhipinti, Nelson Perez-Rojas, Nishat Raihan, Tharindu Ranasinghe, Martin Solis-Salazar, Sanja Stajner, Marcos Zampieri, Horacio Saggion: The BEA 2024 Shared Task on the Multilingual Lexical Simplification Pipeline. 571-589
- Taisei Enomoto, Hwichan Kim, Tosho Hirasawa, Yoshinari Nagai, Ayako Sato, Kyotaro Nakajima, Mamoru Komachi: TMU-HIT at MLSP 2024: How Well Can GPT-4 Tackle Multilingual Lexical Simplification? 590-598
- Sandaru Seneviratne, Hanna Suominen: ANU at MLSP-2024: Prompt-based Lexical Simplification for English and Sinhala. 599-604
- Benjamin Dutilleul, Mathis Debaillon, Sandeep Mathias: ISEP_Presidency_University at MLSP 2024 Shared Task: Using GPT-3.5 to Generate Substitutes for Lexical Simplification. 605-609
- Petru Cristea, Sergiu Nisioi: Archaeology at MLSP 2024: Machine Translation for Lexical Complexity Prediction and Lexical Simplification. 610-617
- Ignacio Sastre, Leandro Alfonso, Facundo Fleitas, Federico Gil, Andrés Lucas, Tomás Spoturno, Santiago Góngora, Aiala Rosá, Luis Chiruzzo: RETUYT-INCO at MLSP 2024: Experiments on Language Simplification using Embeddings, Classifiers and Large Language Models. 618-626
- Dhiman Goswami, Kai North, Marcos Zampieri: GMU at MLSP 2024: Multilingual Lexical Simplification with Transformer Models. 627-634
- Anaïs Tack: ITEC at MLSP 2024: Transferring Predictions of Lexical Difficulty from Non-Native Readers. 635-639
