publications

My academic publications

2026

  1. arXiv
    Frontier AI Auditing: Toward Rigorous Third-Party Assessment of Safety and Security Practices at Leading AI Companies
    Miles Brundage, Noemi Dreksler, Aidan Homewood, Sean McGregor, Patricia Paskov, Conrad Stosz, Girish Sastry, A. Feder Cooper, George Balston, Steven Adler, Stephen Casper, Markus Anderljung, Grace Werner, Sören Mindermann, Vasilios Mavroudis, Ben Bucknall, Charlotte Stix, Jonas Freund, Lorenzo Pacchiardi, José Hernández-Orallo, Matteo Pistillo, Michael Chen, Chris Painter, Dean W. Ball, Cullen O’Keefe, Gabriel Weil, Ben Harack, Graeme Finley, Ryan Hassan, Scott Emmons, Charles Foster, Anka Reuel, Bri Treece, Yoshua Bengio, Daniel Reti, Rishi Bommasani, Cristian Trout, Ali Shahin Shamsabadi, Rajiv Dattani, Adrian Weller, Robert Trager, Jaime Sevilla, Lauren Wagner, Lisa Soder, Ketan Ramakrishnan, Henry Papadatos, Malcolm Murray, and Ryan Tovcimak
    2026

2025

  1. NeurIPS Workshop
    A Framework for the Categorisation of General-Purpose AI Models under the EU AI Act
    Lorenzo Pacchiardi, John Burden, Fernando Martínez-Plumed, José Hernández-Orallo, Emilia Gomez, and David Fernández-Llorca
    In NeurIPS 2025 Workshop on Regulatable ML, 2025
  2. arXiv
    No Answer Needed: Predicting LLM Answer Accuracy from Question-Only Linear Probes
    Iván Vicente Moreno Cencerrado, Arnau Padrés Masdemont, Anton Gonzalvez Hawthorne, David Demitri Africa, and Lorenzo Pacchiardi
    arXiv preprint arXiv:2509.10625, 2025
  3. TMLR
    Measuring Data Science Automation: A Survey of Evaluation Tools for AI Assistants and Agents
    Irene Testini*, José Hernández-Orallo, and Lorenzo Pacchiardi*
    Transactions on Machine Learning Research, 2025
  4. arXiv
    General Scales Unlock AI Evaluation with Explanatory and Predictive Power
    Lexin Zhou, Lorenzo Pacchiardi, Fernando Martínez-Plumed, Katherine M. Collins, Yael Moros-Daval, Seraphina Zhang, Qinlin Zhao, Yitian Huang, Luning Sun, Jonathan E. Prunty, Zongqian Li, Pablo Sánchez-García, Kexin Jiang Chen, Pablo A. M. Casares, Jiyun Zu, John Burden, Behzad Mehrbakhsh, David Stillwell, Manuel Cebrian, Jindong Wang, Peter Henderson, Sherry Tongshuang Wu, Patrick C. Kyllonen, Lucy Cheke, Xing Xie, and José Hernández-Orallo
    arXiv preprint arXiv:2503.06378, 2025
  5. ACL Findings
    PredictaBoard: Benchmarking LLM Score Predictability
    Lorenzo Pacchiardi, Konstantinos Voudouris, Ben Slater, Fernando Martínez-Plumed, José Hernández-Orallo, Lexin Zhou, and Wout Schellaert
    Findings of the Association for Computational Linguistics: ACL 2025, 2025
  6. IJCAI
    Paradigms of AI Evaluation: Mapping Goals, Methodologies and Culture
    John Burden*, Marko Tešić*, Lorenzo Pacchiardi*, and José Hernández-Orallo
    IJCAI 2025 Survey Track, 2025

2024

  1. arXiv
    Leaving the barn door open for Clever Hans: Simple features predict LLM benchmark answers
    Lorenzo Pacchiardi*, Marko Tešić*, Lucy G. Cheke, and José Hernández-Orallo
    arXiv preprint arXiv:2410.11672, 2024
  2. KDD Workshop
    100 instances is all you need: predicting the success of a new LLM on unseen data by testing on a few instances
    Lorenzo Pacchiardi, Lucy G. Cheke, and José Hernández-Orallo
    2024 KDD workshop on Evaluation and Trustworthiness of Generative AI Models, 2024
  3. ICLR 2024
    How to Catch an AI Liar: Lie Detection in Black-Box LLMs by Asking Unrelated Questions
    Lorenzo Pacchiardi*, Alex J Chan*, Sören Mindermann, Ilan Moscovitz, Alexa Y Pan, Yarin Gal, Owain Evans, and Jan Brauner
    The Twelfth International Conference on Learning Representations, 2024
  4. JMLR
    Probabilistic Forecasting with Generative Networks via Scoring Rule Minimization
    Lorenzo Pacchiardi, Rilwan Adewoyin, Peter Dueben, and Ritabrata Dutta
    Journal of Machine Learning Research, 2024
  5. EJS
    Generalized Bayesian likelihood-free inference
    Lorenzo Pacchiardi, Sherman Khoo, and Ritabrata Dutta
    Electronic Journal of Statistics, 2024

2022

  1. arXiv
    Likelihood-Free Inference with Generative Neural Networks via Scoring Rule Minimization
    Lorenzo Pacchiardi and Ritabrata Dutta
    arXiv preprint arXiv:2205.15784, 2022
  2. JMLR
    Score Matched Neural Exponential Families for Likelihood-Free Inference
    Lorenzo Pacchiardi and Ritabrata Dutta
    Journal of Machine Learning Research, 2022

2021

  1. PLOS Comp. Biol.
    Using Mobility Data in the Design of Optimal Lockdown Strategies for the COVID-19 Pandemic
    Ritabrata Dutta, Susana Gomes, Dante Kalise, and Lorenzo Pacchiardi
    PLOS Computational Biology, 2021
  2. JSS
    ABCpy: A High-Performance Computing Perspective to Approximate Bayesian Computation
    Ritabrata Dutta, Marcel Schoengens, Lorenzo Pacchiardi, Avinash Ummadisingu, Nicole Widmer, Pierre Künzli, Jukka-Pekka Onnela, and Antonietta Mira
    Journal of Statistical Software, 2021

2020

  1. Sankhya B
    Distance-Learning for Approximate Bayesian Computation to Model a Volcanic Eruption
    Lorenzo Pacchiardi, Pierre Künzli, Marcel Schoengens, Bastien Chopard, and Ritabrata Dutta
    Sankhya B, 2020