2021 Summary

The year 2021 marked the foundational period for the MLCommons Science Working Group, focused on establishing goals, identifying initial scientific benchmarks, and integrating the group's activities within the broader MLCommons ecosystem.

Key Initiatives

Defining Scientific Benchmarking

The group focused on distinguishing scientific ML benchmarks from traditional MLPerf benchmarks. A key realization was that in science, the final scientific outcome (model accuracy and the scientific progress it enables) often matters more than the time-to-convergence metrics typically tracked in general ML training.

Initial Benchmark Development

Early efforts were directed toward several key scientific domains:

- TEvolOp and STEMDL/EDiff: These served as primary examples of scientific benchmarks with clear goals.
- Cryo-EM: Exploration of cryo-electron microscopy datasets as a potential area for future benchmarking.
- SciML Benchmarks: Initial results and status updates were presented to establish a performance baseline.

Adoption of FAIR Principles

A significant portion of the year was dedicated to ensuring that scientific benchmarks adhere to the FAIR (Findable, Accessible, Interoperable, and Reusable) principles. This included:

- Metadata and Ontology: Discussions on using Schema.org and developing machine-readable metadata conventions.
- Standardized Logging: Efforts to adapt the MLPerf logging standards into a more streamlined form for science, focusing on science-specific metrics and system metadata.
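As an illustration of the machine-readable metadata discussed above, the sketch below builds a minimal Schema.org `Dataset` record in JSON-LD. It is only a sketch of the general approach: the dataset name, description, license URL, and keywords are hypothetical placeholders, not actual working-group artifacts.

```python
import json

def make_dataset_record(name, description, license_url, keywords):
    """Return a minimal Schema.org Dataset description as a JSON-LD dict.

    All argument values are supplied by the caller; this function only
    assembles them into the standard JSON-LD shape.
    """
    return {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "license": license_url,
        "keywords": keywords,
    }

# Illustrative usage with placeholder values.
record = make_dataset_record(
    name="example-science-benchmark-data",
    description="Illustrative training data for a scientific ML benchmark.",
    license_url="https://creativecommons.org/licenses/by/4.0/",
    keywords=["machine learning", "benchmark", "FAIR"],
)
print(json.dumps(record, indent=2))
```

Because the record is plain JSON-LD, it can be embedded in a dataset landing page, making the benchmark data findable by metadata-aware search tools.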

Infrastructure and Governance

System Access and Execution

The group worked to secure access to critical HPC resources, including the PEARL system and various ANL systems, to ensure that benchmarks could be executed on representative hardware.

MLCommons Integration

The group navigated the administrative requirements of MLCommons, including:

- Lab Membership: Evaluating membership levels and voting privileges.
- Legal Frameworks: Pursuing Contributor License Agreements (CLAs) and Memoranda of Understanding (MoUs) to facilitate the submission of code and benchmarks.

Summary of Goals

By the end of 2021, the group had established a clear direction: to create a set of science benchmarks that are not only performance-oriented but also scientifically meaningful, and that adhere strictly to open data and metadata standards.