MLCommons Science Working Group AI Benchmarks Collection

This site curates a collection of AI benchmarks.
The main artifact is the PDF report: https://mlcommons-science.github.io/benchmark/benchmarks.pdf

If you cite this work, please reference the PDF (not the Markdown pages). A BibTeX entry is provided below.

@misc{www-las-mlcommons-benchmark-collection,
  author = {
    Gregor von Laszewski and 
    Ben Hawks and 
    Marco Colombo and
    Reece Shiraishi and
    Anjay Krishnan and
    Nhan Tran and
    Geoffrey C. Fox},
  title = {MLCommons Science Working Group AI Benchmarks Collection},
  url = {https://mlcommons-science.github.io/benchmark/benchmarks.pdf},
  note = {Online Collection: \url{https://mlcommons-science.github.io/benchmark/}},
  month = jun,
  year = {2025},
  howpublished = {GitHub}
}

For online browsing, we provide three views (each entry links to its detailed page):

  • Cards view: richest UI with advanced filtering, tag-based quick filters, and interactive sorting controls.
  • Table view: compact table where you can toggle visible columns and download the data as CSV or JSON.
  • List view: straightforward alphabetical list of benchmark names.
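The CSV and JSON exports from the Table view can be processed with standard tooling. The sketch below parses small inline samples with Python's standard library; the field names (`name`, `tags`) are assumptions for illustration, not the site's actual export schema.

```python
import csv
import io
import json

# Hypothetical JSON export: a list of benchmark entries.
# The keys "name" and "tags" are assumed, not the real schema.
sample_json = '[{"name": "ExampleBench", "tags": ["science"]}]'
entries = json.loads(sample_json)
names = [entry["name"] for entry in entries]

# The same data exported as CSV, read into dictionaries keyed by header.
sample_csv = "name,tags\nExampleBench,science\n"
rows = list(csv.DictReader(io.StringIO(sample_csv)))
```

In practice you would replace the inline strings with the downloaded export file; the parsing logic stays the same.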

Note: The Markdown pages are generated for web browsing and should not be cited. Please cite the PDF report above.

All pages are generated automatically; please do not edit them directly. To propose changes or corrections, follow the contribution guidelines in the project repository: https://github.com/mlcommons-science/benchmark.

For program-level improvements, contact Gregor von Laszewski at laszewski at gmail.com.