Sebastian Schuster: Evaluating Coding Agents for Data Science and Machine Learning Research

13.10.2025 13:30

Our Data Science Talk on 13 October 2025 features Sebastian Schuster, Assistant Professor in Natural Language Processing, from the Research Group Data Mining and Machine Learning at the Faculty of Computer Science.

Monday, 13 October 2025 @ 13:30–14:30 CEST

On-site:

University of Vienna
Seminarraum 8 (OG01)
Kolingasse 14–16

1090 Vienna

Online:

https://univienna.zoom.us/j/67032386717?pwd=g8HOG2oRrWK6T5cvmRA7bv17QRzq72.1

Meeting ID: 670 3238 6717
Passcode: 440328

Evaluating Coding Agents for Data Science and Machine Learning Research

Abstract:

Agents based on Large Language Models (LLMs) have shown promise for performing sophisticated software engineering tasks autonomously. In addition, there has been progress in developing agents that can perform parts of the research pipeline in data science, machine learning, and the natural sciences. However, the ability of these agents to reliably produce code that yields accurate research results has not yet been adequately assessed.

In this talk, I will introduce a new benchmark called "REXBench" that evaluates the ability of LLM-based coding agents to autonomously implement novel research extensions. I will argue that research extensions are an ideal testing ground for evaluating such agents and explain how our benchmark circumvents common data contamination issues. I will also present results from evaluating nine recent LLM-based agents and discuss their implications for using LLM agents to write research code.

Bio:

Sebastian Schuster is an assistant professor at the Faculty of Computer Science at the University of Vienna, where he heads a WWTF-funded Vienna Research Group on natural language processing. His research focuses on evaluating large language models, developing sophisticated natural language understanding models, and using machine learning models to uncover processes involved in human language processing. Before returning to Vienna this year, he was an assistant professor at University College London and a postdoc at New York University and Saarland University. He holds a PhD in computational linguistics and an MS in computer science from Stanford University, and a BSc in computer science from the University of Vienna.