
March 24, 2021

Present:

Geoffrey Fox, Aristeidis Tsaris, Christine Kirkpatrick, Gregg Barrett, Amit Ruhela, Junqi Yin, Arjun Shankar, Venkat Vishwanath, Tony Hey, Vibhatha Abeyakoon, Bala Desinghu, Gregor von Laszewski, Murali Emani, Jeyan Thiyagalingam, Poonam Yadav, Mohammad Lataifeh, Juri Papay, David Kanter

Agenda

  1. Any new member introductions
  2. Benchmark and multi-host execution Updates
  3. Presentation “How UK Exascale community sees benchmarking” by Jeyan Thiyagalingam, Exascale_Benchmarking_UK.pptx
  4. Any other business

Welcome to new members

  • Poonam Yadav from the CS Department, University of York, UK, joined the group for the first time.
  • Mohammad Lataifeh from the CS Department of the University of Sharjah also joined the group meeting today but initially had some audio problems; these were later fixed and he was able to introduce himself at the end of the meeting.
  • Juri Papay from the RAL SciML group introduced himself. He now had access to the benchmarks and the RAL PEARL DGX-2 system and expected to report significant progress at the next meeting.

Updates

  • Junqi (ORNL) and Murali (ANL) gave updates (MLPerf-Science-CANDLE.pdf) on their work with the EMDIFF/STEM DL and UNO (TF2 version) benchmarks.
  • Both benchmarks had now been run on the ThetaGPU system at ANL (24 NVIDIA DGX A100 nodes), on the Theta KNL machine, and on Summit at ORNL.
  • They showed some results, but more input is needed from the group on how and which results should be presented.
  • Jeyan (RAL) reported on the SciML Cloud benchmark. The SciML framework had been significantly updated and this had delayed the planned formal release of the SciML Benchmark suite – which includes the Cloud benchmark. He and Juri would give a full report at the next meeting.
  • Geoffrey Fox reported on progress with the Indiana Earthquake time series benchmark. He and Gregor were now in a position to run the benchmark on the RAL DGX-2 system.

Discussion

  • Venkat (ANL) suggested that the benchmark owners summarize the outputs each benchmark produces and classify them in terms of system performance and science. The goal of this group is to produce not just raw performance numbers but results that are useful to scientist users.
  • David Kanter, the Executive Director of MLCommons, had now joined the meeting and said a few words. He had seen our first meeting, with just Geoffrey, Jeyan and Tony, and was pleased to see that progress was being made. Tony summarized where we were with our four initial benchmarks and said that we were following the example of the HPC WG; the plan is to have a session at SC21 on the results of the Science WG benchmarks. He promised that the group would follow MLCommons procedures before publishing any results!

Presentation from Jeyan on the UK ExCALIBUR Exascale initiative

Slides are available: Exascale_Benchmarking_UK.pptx. This fascinating talk led to a discussion of available hardware platforms, including Argonne’s rich collection in the ALCF AI-Testbed and a Graphcore system at RAL.

Any Other Business

Bala Desinghu discussed the use of MLCube on XSEDE systems.

Action Items

ANL, ORNL, RAL, Indiana: Benchmark holders to prepare a written summary of their benchmark and of its expected outputs from both a science point of view (such as validation loss for UNO) and a technical point of view (such as throughput versus batch size). A shared Google doc, Benchmark Outputs, is available for collecting this information.