MLCommons Science Working Group AI Benchmarks Collection
This site curates a collection of AI benchmarks.
The main artifact is the PDF report:
- Report (PDF): benchmarks.pdf
If you cite this work, please reference the PDF (not the Markdown pages). A BibTeX entry is provided below.
@misc{www-las-mlcommons-benchmark-collection,
  author       = {Gregor von Laszewski and
                  Ben Hawks and
                  Marco Colombo and
                  Reece Shiraishi and
                  Anjay Krishnan and
                  Nhan Tran and
                  Geoffrey C. Fox},
  title        = {MLCommons Science Working Group AI Benchmarks Collection},
  url          = {https://mlcommons-science.github.io/benchmark/benchmarks.pdf},
  note         = {Online Collection: \url{https://mlcommons-science.github.io/benchmark/}},
  month        = jun,
  year         = 2025,
  howpublished = {GitHub}
}
For online browsing, we provide three views (each entry links to its detailed page):
- Cards view: the richest UI, with advanced filtering, tag-based quick filters, and interactive sorting controls.
- Table view: a compact table where you can toggle the visible columns and download the data as CSV or JSON (a loading sketch follows this list).
- List view: a straightforward alphabetical list of benchmark names.
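If you want to work with the exported table data programmatically, the minimal sketch below may help. It assumes the download was saved locally as benchmarks.csv or benchmarks.json, that the JSON export is a list of objects, and that a column named "name" exists; the actual filenames and column names depend on the export you saved.

```python
import csv
import json
from pathlib import Path

# Hypothetical filenames: the Table view lets you download the data as CSV or
# JSON, but the exact filename depends on what you saved locally.
CSV_PATH = Path("benchmarks.csv")
JSON_PATH = Path("benchmarks.json")


def load_benchmarks(path: Path) -> list[dict]:
    """Load an exported benchmark table into a list of row dictionaries."""
    if path.suffix == ".json":
        # Assumes the JSON export is a list of objects (one per benchmark).
        with path.open(encoding="utf-8") as f:
            return json.load(f)
    # CSV export: each row becomes a dict keyed by the column headers.
    with path.open(newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))


if __name__ == "__main__":
    rows = load_benchmarks(CSV_PATH if CSV_PATH.exists() else JSON_PATH)
    print(f"{len(rows)} benchmark entries")
    for row in rows[:5]:
        # "name" is an assumed column; inspect row.keys() for the real schema.
        print(row.get("name", row))
```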
Note: The Markdown pages are generated for web browsing and should not be cited. Please cite the PDF report above.
All pages are generated automatically; please don’t edit them directly. To propose changes or corrections, or to contribute, follow the guidelines in the project repository: https://github.com/mlcommons-science/benchmark.
For program-level improvements, contact Gregor von Laszewski at laszewski at gmail.com.