


22nd MSR 2025: Ottawa, ON, Canada
- 22nd IEEE/ACM International Conference on Mining Software Repositories, MSR@ICSE 2025, Ottawa, ON, Canada, April 28-29, 2025. IEEE 2025, ISBN 979-8-3315-0183-9

- Yi-Hung Chou, Yiyang Min, April Yi Wang, James A. Jones:

Learning from Mistakes: Understanding Ad-hoc Logs through Analyzing Accidental Commits. 1-13 - Xhulja Shahini, Jone Bartel, Klaus Pohl:

On the calibration of Just-in-time Defect Prediction. 14-26 - Dingbang Wang, Zhaoxu Zhang, Sidong Feng, William G. J. Halfond, Tingting Yu:

An Empirical Study on Leveraging Images in Automated Bug Report Reproduction. 27-38 - Shrey Tiwari, Serena Chen, Alexander Joukov, Peter Vandervelde, Ao Li, Rohan Padhye:

It's About Time: An Empirical Study of Date and Time Bugs in Open-Source Python Software. 39-51 - Emanuela Guglielmi, Andrea D'Aguanno, Rocco Oliveto, Simone Scalabrino:
Enhancing Just-In-Time Defect Prediction Models with Developer-Centric Features. 52-62 - Md Nakhla Rafi, An Ran Chen, Tse-Hsun Peter Chen, Shaohua Wang:

Revisiting Defects4J for Fault Localization in Diverse Development Scenarios. 63-75 - Dylan Callaghan, Bernd Fischer:
Mining Bug Repositories for Multi-Fault Programs. 76-80 - Piotr Przymus, Mikolaj Fejzer, Jakub Narebski, Radoslaw Wozniak, Lukasz Halada, Aleksander Kazecki, Mykhailo Molchanov, Krzysztof Stencel:

HaPy-Bug - Human Annotated Python Bug Resolution Dataset. 81-85 - Ahmed Adnan, Antu Saha, Oscar Chaparro:
SPRINT: An Assistant for Issue Report Management. 86-90 - Piotr Przymus, Thomas Durieux:

Wolves in the Repository: A Software Engineering Analysis of the XZ Utils Supply Chain Attack. 91-102 - Sabato Nocera, Sira Vegas, Giuseppe Scanniello, Natalia Juristo:

Software Composition Analysis and Supply Chain Security in Apache Projects: an Empirical Study. 103-115 - Anas El Hounsri, Daniel Garijo:

Good practice versus reality: A landscape analysis of Research Software metadata adoption in European Open Science Clusters. 116-128 - Zhuang Liu, Xing Hu, Jiayuan Zhou, Xin Xia:

From Industrial Practices to Academia: Uncovering the Gap in Vulnerability Research and Practice. 129-141 - Gunnar Kudrjavets:

Patch Me If You Can - Securing the Linux Kernel. 142-143 - Mahmoud Jahanshahi, David Reid, Adam McDaniel, Audris Mockus:

OSS License Identification at Scale: A Comprehensive Dataset Using World of Code. 144-148 - Chavhan Sujeet Yashavant, MitrajSinh Chavda, Saurabh Kumar, Amey Karkare, Angshuman Karmakar:

SCRUBD: Smart Contracts Reentrancy and Unhandled Exceptions Vulnerability Dataset. 149-153 - Chaomeng Lu, Tianyu Li, Toon Dehaene, Bert Lagaisse:
ICVul: A Well-labeled C/C++ Vulnerability Dataset with Comprehensive Metadata and VCCs. 154-158 - Youness Hourri, Alexandre Decan, Tom Mens:

A Dataset of Contributor Activities in the NumFocus Open-Source Community. 159-163 - Luis Soeiro, Thomas Robert, Stefano Zacchiroli:

Wild SBOMs: a Large-scale Dataset of Software Bills of Materials from Public Code. 164-168 - Bikash Saha, Nanda Rani, Sandeep Kumar Shukla:

MaLAware: Automating the Comprehension of Malicious Software Behaviours using Large Language Models (LLMs). 169-173 - Imen Jaoua, Oussama Ben Sghaier, Houari A. Sahraoui:

Combining Large Language Models with Static Analyzers for Code Review Generation. 174-186 - Oussama Ben Sghaier, Martin Weyssow, Houari A. Sahraoui:

Harnessing Large Language Models for Curated Code Reviews. 187-198 - Nafisa Ahmed, Hin Chi Kwok, Mohammad Hamdaqa, Wesley K. G. Assunção:

SMATCH-M-LLM: Semantic Similarity in Metamodel Matching With Large Language Models. 199-210 - Nathalia Nascimento, Everton Guimarães, Sai Sanjna Chintakunta, Santhosh Anitha Boominathan:

How Effective are LLMs for Data Science Coding? A Controlled Experiment. 211-222 - Daniele Bifolco, Pietro Cassieri, Giuseppe Scanniello, Massimiliano Di Penta, Fiorella Zampetti:
Do LLMs Provide Links to Code Similar to What They Generate? A Study with Gemini and Bing CoPilot. 223-235 - Chunhua Liu, Hong Yi Lin, Patanamon Thongtanunam:

Too Noisy To Learn: Enhancing Data Quality for Code Review Comment Generation. 236-248 - Kyi Shin Khant, Hong Yi Lin, Patanamon Thongtanunam:

Should Code Models Learn Pedagogically? A Preliminary Evaluation of Curriculum Learning for Real-World Software Engineering Tasks. 249-254 - Samuel Abedu, Laurine Menneron, SayedHassan Khatoonabadi, Emad Shihab:

RepoChat: An LLM-Powered Chatbot for GitHub Repository Question-Answering. 255-259 - George Lake, Minhaz F. Zibran:

Analyzing Dependency Clusters and Security Risks in the Maven Central Repository. 260-264 - Md. Fazle Rabbi, Arifa Islam Champa, Rajshakhar Paul, Minhaz F. Zibran:
Chasing the Clock: How Fast Are Vulnerabilities Fixed in the Maven Ecosystem? 265-269 - Costain Nachuma, Md Mosharaf Hossan, Asif Kamal Turzo, Minhaz F. Zibran:

Decoding Dependency Risks: A Quantitative Study of Vulnerabilities in the Maven Ecosystem. 270-274 - Md Shafiullah Shafin, Md. Fazle Rabbi, S. M. Mahedy Hasan, Minhaz F. Zibran:

Faster Releases, Fewer Risks: A Study on Maven Artifact Vulnerabilities and Lifecycle Management. 275-279 - Barisha Chowdhury, Md. Fazle Rabbi, S. M. Mahedy Hasan, Minhaz F. Zibran:
Insights into Dependency Maintenance Trends in the Maven Ecosystem. 280-284 - Courtney Bodily, Eric Hill, Andreas Kramer, Leslie Kerby, Minhaz F. Zibran:

Insights into Vulnerability Trends in Maven Artifacts: Recurrence, Popularity, and User Behavior. 285-289 - Md. Fazle Rabbi, Rajshakhar Paul, Arifa Islam Champa, Minhaz F. Zibran:
Understanding Software Vulnerabilities in the Maven Ecosystem: Patterns, Timelines, and Risks. 290-294 - Baltasar Berretta, Augustus Thomas, Heather Guarnera:

Dependency Update Adoption Patterns in the Maven Software Ecosystem. 295-299 - Taha Draoui, Faten Jebari, Chawki Ben Slimen, Munjaap Uppal, Mohamed Wiem Mkaouer:
Analyzing Vulnerability Overestimation in the Maven Ecosystem. 300-303 - Mehedi Hasan Shanto, Muhammad Asaduzzaman, Manishankar Mondal, Shaiful Alam Chowdhury:

Dependency Dilemmas: A Comparative Study of Independent and Dependent Artifacts in Maven Central Ecosystem. 304-308 - Mina Shehata, Saidmakhmud Makhkamjonoov, Mahad Syed, Esteban Parra:

Cascading Effects: Analyzing Project Failure Impact in the Maven Central Ecosystem. 309-313 - Haruhiko Yoshioka, Sila Lertbanjongngam, Masayuki Inaba, Youmei Fan, Takashi Nakano, Kazumasa Shimari, Raula Gaikovina Kula, Kenichi Matsumoto:

Do Developers Depend on Deprecated Library Versions? A Mining Study of Log4j. 314-318 - Hidetake Tanaka, Kazuma Yamasaki, Momoka Hirose, Takashi Nakano, Youmei Fan, Kazumasa Shimari, Raula Gaikovina Kula, Kenichi Matsumoto:

Mining for Lags in Updating Critical Security Threats: A Case Study of Log4j Library. 319-323 - Nabhan Suwanachote, Yagut Shakizada, Yutaro Kashiwa, Bin Lin, Hajimu Iida:
On the Evolution of Unused Dependencies in Java Project Releases: An Empirical Study. 324-328 - Piotr Przymus, Mikolaj Fejzer, Jakub Narebski, Krzysztof Rykaczewski, Krzysztof Stencel:

Out of Sight, Still at Risk: The Lifecycle of Transitive Vulnerabilities in Maven. 329-333 - Nkiru Ede, Jens Dietrich, Ulrich Zülicke:
Popularity and Innovation in Maven Central. 334-338 - Yogya Gamage, Nadia Gonzalez Fernandez, Martin Monperrus, Benoit Baudry:

Software Bills of Materials in Maven Central. 339-343 - Ehtisham Ul Haq, Song Wang, Robert S. Allison:

The Ripple Effect of Vulnerabilities in Maven Central: Prevalence, Propagation, and Mitigation Challenges. 344-348 - Corey Yang-Smith, Ahmad Abdellatif:

Tracing Vulnerabilities in Maven: A Study of CVE lifecycles and Dependency Networks. 349-353 - Kazi Amit Hasan, Jerin Yasmin, Huizi Hao, Yuan Tian, Safwat Hassan, Steven H. H. Ding:

Understanding Abandonment and Slowdown Dynamics in the Maven Ecosystem. 354-358 - Saviour Owolabi, Francesco Rosati, Ahmad Abdellatif, Lorenzo De Carli:

Characterizing Packages for Vulnerability Prediction. 359-363 - Sadman Jashim Sakib, Muhammad Asaduzzaman, Curtis Bright, Cole Morgan:
Understanding the Popularity of Packages in Maven Ecosystem. 364-368 - Damien Jaime, Joyce El Haddad, Pascal Poizat:

Navigating and Exploring Software Dependency Graphs Using Goblin. 369-371 - Adèle Desmazières, Roberto Di Cosmo, Valentin Lorentz:

50 Years of Programming Language Evolution through the Software Heritage looking glass. 372-383 - Ghazal Sobhani, Israat Haque, Tushar Sharma:

It Works (only) on My Machine: A Study on Reproducibility Smells in Ansible Scripts. 384-395 - Tien Nguyen, Waris Gill, Muhammad Ali Gulzar:

Are the Majority of Public Computational Notebooks Pathologically Non-Executable? 396-407 - Suraj Bhatta, Frank Kendemah, Ajay Kumar Jha:

Understanding Test Deletion in Java Applications. 408-420 - Alix Decrop, Sara Eraso, Xavier Devroey, Gilles Perrouin:
A Public Benchmark of REST APIs. 421-433 - Bruna Falcucci, Felipe Gomide, André C. Hora:

What Do Contribution Guidelines Say About Software Testing? 434-438 - Chamindra de Silva, Daniel Izquierdo-Cortazar:

Measuring InnerSource Value. 439-440 - Kaihang Jiang, Bihui Jin, Pengyu Nie:

CoUpJava: A Dataset of Code Upgrade Histories in Open-Source Java Repositories. 441-445 - Ilham A. Qasse, Mohammad Hamdaqa, Björn Þór Jónsson:

EvoChain: A Framework for Tracking and Visualizing Smart Contract Evolution. 446-450 - Kunal Suresh Pai, Premkumar T. Devanbu, Toufique Ahmed:

CoDocBench: A Dataset for Code-Documentation Alignment in Software Maintenance. 451-455 - Vahid Haratian, Pouria Derakhshanfar, Vladimir Kovalenko, Eray Tüzün:
RefExpo: Unveiling Software Project Structures through Advanced Dependency Graph Extraction. 456-460 - Quentin Le Dilavrec, Andy Zaidman:

HyperAST: Incrementally Mining Large Source Code Repositories. 461-464 - Fabio Salerno, Ali Al-Kaswan, Maliheh Izadi:
How Much Do Code Language Models Remember? An Investigation on Data Extraction Attacks Before and After Fine-tuning. 465-477 - Mohammad Talal Jamil, Shamsa Abid, Shafay Shamail:

Can LLMs Generate Higher Quality Code Than Humans? An Empirical Study. 478-489 - Jiho Shin, Clark Tang, Tahmineh Mohati, Maleknaz Nayebi, Song Wang, Hadi Hemmati:

Prompt Engineering or Fine-Tuning: An Empirical Assessment of LLMs for Code. 490-502 - Timur Galimzyanov, Sergey Titov, Yaroslav Golubev, Egor Bogomolov:
Drawing Pandas: A Benchmark for LLMs in Generating Plotting Code. 503-507 - Daniel Rodríguez-Cárdenas, Alejandro Velasco, Denys Poshyvanyk:

SnipGen: A Mining Repository Framework for Evaluating LLMs for Code. 508-512 - Andrei Bogdan, Mauricio Verano Merino, Ivano Malavolta:
The Ecosystem of Open-Source Music Production Software - A Mining Study on the Development Practices of VST Plugins on GitHub. 513-525 - Toufique Ahmed, Premkumar T. Devanbu, Christoph Treude, Michael Pradel:

Can LLMs Replace Manual Annotation of Software Engineering Artifacts? 526-538 - Md Shamimur Rahman, Zadia Codabux, Chanchal K. Roy:

Investigating the Understandability of Review Comments on Code Change Requests. 539-551 - Matteo Vaccargiu, Sabrina Aufiero, C. Ba, Silvia Bartolucci, Richard G. Clegg, Daniel Graziotin, Rumyana Neykova, Roberto Tonelli, Giuseppe Destefanis:

Mining a Decade of Event Impacts on Contributor Dynamics in Ethereum: A Longitudinal Study. 552-563 - Emanuela Guglielmi, Gabriele Bavota, Nicole Novielli, Rocco Oliveto, Simone Scalabrino:
Is it Really Fun? Detecting Low Engagement Events in Video Games. 564-575 - Rio Kishimoto, Tetsuya Kanda, Yuki Manabe, Katsuro Inoue, Shi Qiu, Yoshiki Higo:

A Dataset of Software Bill of Materials for Evaluating SBOM Consumption Tools. 576-580 - Tomoki Nakamaru, Tomomasa Matsunaga, Tetsuro Yamazaki:

Jupyter Notebook Activity Dataset. 581-585 - Vivek Sarkar, Anemone Kampkötter, Ben Hermann:

CoPhi - Mining C/C++ Packages for Conan Ecosystem Analysis. 586-590 - Johannes Düsing, Jared Chiaramonte, Ben Hermann:

MARIN: A Research-Centric Interface for Querying Software Artifacts on Maven Repositories. 591-595 - Nicolas Hlad, Benoît Verhaeghe, Kilian Bauvent:

GitProjectHealth: an Extensible Framework for Git Social Platform Mining. 596-600 - Benoit Baudry, Erik Natanael Gustafsson, Roni Kaufman, Maria Kling:

MYRIAD PEOPLE Open Source Software for New Media Arts. 601-605 - Erfan Raoofian, Fatemeh H. Fard, Ifeoma Adaji, Gema Rodríguez-Pérez:

OpenMent: A Dataset of Mentor-Mentee Interactions in Google Summer of Code. 606-610 - Kalvin Eng, Abram Hindle:

Under the Blueprints: Parsing Unreal Engine's Visual Scripting at Scale. 611-615 - Anwar Ghammam, Dhia Elhaq Rzig, Mohamed Almukhtar, Rania Khalsi, Foyzul Hassan, Marouane Kessentini:

Build Code Needs Maintenance Too: A Study on Refactoring and Technical Debt in Build Systems. 616-628 - Ziyang Ye, Triet Huynh Minh Le, M. Ali Babar:
LLMSecConfig: An LLM-Based Approach for Fixing Software Container Misconfigurations. 629-641 - Mahi Begoug, Ali Ouni, Moataz Chouchen:

How Do Infrastructure-as-Code Practitioners Update Their Dependencies? An Empirical Study on Terraform Module Updates. 642-653 - Christoph Bühler, David Spielmann, Roland Meier, Guido Salvaneschi:
TerraDS: A Dataset for Terraform HCL Programs. 654-658 - Zhuoran Tan, Christos Anagnostopoulos, Jeremy Singer:

OSPtrack: A Labeled Dataset Targeting Simulated Execution of Open-Source Software. 659-663 - Euxane Tran-Girard, Laurent Bulteau, Pierre-Yves David:
CARDS: A collection of package, revision, and miscellaneous dependency graphs. 664-668 - Florent Moriconi, Thomas Durieux, Jean-Rémy Falleri, Raphaël Troncy, Aurélien Francillon:

GHALogs: Large-Scale Dataset of GitHub Actions Runs. 669-673 - Navid Bin Hasan, Md. Ashraful Islam, Junaed Younus Khan, Sanjida Senjik, Anindya Iqbal:
Automatic High-Level Test Case Generation using Large Language Models. 674-685 - Mahan Tafreshipour, Aaron Imani, Eric Huang, Eduardo Santana de Almeida, Thomas Zimmermann, Iftekhar Ahmed:

Prompting in the Wild: An Empirical Study of Prompt Evolution in Software Repositories. 686-698 - Ramtin Ehsani, Sakshi Pathak, Preetha Chatterjee:

Towards Detecting Prompt Knowledge Gaps for Improved LLM-guided Issue Resolution. 699-711 - Ahmad J. Tayeb, Sonia Haiduc:

Intelligent Semantic Matching (ISM) for Video Tutorial Search using Transformer Models. 712-724 - Negar Alizadeh, Boris Belchev, Nishant Saurabh, Patricia Kelbert, Fernando Castor:
Language Models in Software Development Tasks: An Experimental Analysis of Energy and Accuracy. 725-736 - Anisha Islam, Abram Hindle:

TriGraph: A Probabilistic Subgraph-Based Model for Visual Code Completion in Pure Data. 737-749 - Faiz Ahmed, Xuchen Tan, Folajinmi Adewole, Suprakash Datta, Maleknaz Nayebi:

Inferring Questions from Programming Screenshots. 750-755 - Jirat Pasuksmit, Wannita Takerngsaksiri, Patanamon Thongtanunam, Chakkrit Tantithamthavorn, Ruixiong Zhang, Shiyan Wang, Fan Jiang, Jing Li, Evan Cook, Kun Chen, Ming Wu:

Human-In-The-Loop Software Development Agents: Challenges and Future Directions. 756-757 - Madhurima Chakraborty, Peter Pirkelbauer, Qing Yi:

FormalSpecCpp: A Dataset of C++ Formal Specifications created using LLMs. 758-762 - Karthik Shivashankar, Antonio Martini:

PyExamine: A Comprehensive, Un-Opinionated Smell Detection Tool for Python. 763-774 - Julien Malka, Stefano Zacchiroli, Théo Zimmermann:

Does Functional Package Management Enable Reproducible Builds at Scale? Yes. 775-787 - Emna Ksontini, Meriem Mastouri, Rania Khalsi, Wael Kessentini:

Refactoring for Dockerfile Quality: A Dive into Developer Practices and Automation Potential. 788-800 - Seif Kosbar, Mohammad Hamdaqa:

Smells-sus: Sustainability Smells in IaC. 801-812 - Shaiful Alam Chowdhury, Hisham Kidwai, Muhammad Asaduzzaman:

Evidence is All We Need: Do Self-Admitted Technical Debts Impact Method-Level Maintenance? 813-825 - Aryan Boloori, Tushar Sharma:

DPy: Code Smells Detection Tool for Python. 826-830 - Mouna Dhaouadi, Bentley Oakes, Michalis Famelis:

CoMRAT: Commit Message Rationale Analysis Tool. 831-835 - Sergio Di Meglio, Luigi Libero Lucio Starace, Valeria Pontillo, Ruben Opdebeeck, Coen De Roover, Sergio Di Martino:

E2EGit: A Dataset of End-to-End Web Tests in Open Source Projects. 836-840 - Altino Alves, André C. Hora:

TestMigrationsInPy: A Dataset of Test Migrations from Unittest to Pytest. 841-845 - Idriss Abdelmadjid, Robert Dyer:

pyMethods2Test: A Dataset of Python Tests Mapped to Focal Methods. 846-850 - Mengzhen Li, Mattia Fazzini:

DataTD: A Dataset of Java Projects Including Test Doubles. 851-855 - Kaveh Shahedi, Maxime Lamothe, Foutse Khomh, Heng Li:
JPerfEvo: A Tool for Tracking Method-Level Performance Changes in Java Projects. 856-860















