March 9, 2022
Present
Tony Hey, Jeyan Thiyagalingam, Geoffrey Fox, Juri Papay, Gregg Barrett, Farzana Yasmin Ahmad, Aristeidis Tsaris, Murali Emani, Piotr Luszczek
Apologies: Arjun Shankar, Christine Kirkpatrick, Gregor von Laszewski
Tentative Agenda
- New member introductions
- Update on possible new benchmarks by Geoffrey
- Further comments on FAIR metadata (not done)
- Continuation of discussion of portability of benchmarks. How much effort is involved in deploying a benchmark?
- Clarification on what each benchmark is going to measure.
- AOB (Performance and Portability)
Update on possible new benchmarks
Geoffrey went through new benchmarks that could be added to our collection (slides: "Science WG of MLCommons", March 9, 2022).
- LLNL (Lawrence Livermore National Laboratory)
- FastML
- DOE-SBI project (Biophysics, Performance, Nano Engineering)
- RAL Projects
The Livermore ICF (inertial confinement fusion) surrogate benchmark should be available soon.
Current Benchmarks
- We agreed to complete our four benchmarks by May 22, 2022, to announce at the ISC BOF. ISC 2022 is held in Frankfurt, May 29 - June 2, 2022 (see the BOF sessions page on the ISC High Performance 2022 website). We need to add logging and deposit the data and reference models in the MLCommons repository. We also agreed to complete the paper "MLCommons Science Benchmarks" by May 2022.
- Murali reported that the ANL benchmark was ready, with a science metric.
Performance and Portability
- Juri presented some initial results on moving the STEM-DL benchmark from a Horovod-based implementation to PyTorch Lightning. The single-GPU performance of the PyTorch Lightning version is a concern: it is far slower than the Horovod-based implementation. Jeyan and Juri proposed to investigate this issue further.
- There was a discussion of various deep learning communication libraries: NCCL, Gloo, MPI, and Horovod.