February 23, 2022

Present

Tony Hey, Geoffrey Fox, Gregor von Laszewski, Juri Papay, Gregg Barrett, Farzana Yasmin Ahmad, Arjun Shankar, Aristeidis Tsaris, Junqi Yin, Murali Emani, Cade Brown, Piotr Luszczek
Apologies: Jeyan Thiyagalingam, Christine Kirkpatrick

Tentative Agenda

  • New member introductions
  • Special presentation "FAIR Metadata and Launching Platform for Surrogates" by Cade Brown and Piotr Luszczek, University of Tennessee
  • Continuation of discussion of portability of benchmarks. How much effort is involved in deploying a benchmark?
  • Clarification on what each benchmark is going to measure (NOT DONE)
  • AOB

New Member Introductions

Cade Brown and Piotr Luszczek from the University of Tennessee introduced themselves. They work in the Innovative Computing Laboratory, led by Jack Dongarra. Cade Brown is a junior at UTK.

Presentation on FAIR Metadata and Launching Platform for Surrogates

Discussion of portability of benchmarks

  • This continued the theme of the previous topic, and we agreed that we must specify two separate artifacts:
  • FAIR metadata that links to performance results, data, and reference models
  • A set of containers supporting execution. These containers will have the same or similar specification files but a different instance on each hardware platform, owing to version inconsistencies and to differences in which container systems are supported and what software is installed on each system. The containers will support PyTorch, TensorFlow, Horovod, etc.
  • Note: Horovod tends to have the best performance due to its choice of MPI_Allreduce
  • Gregor noted, for example, that a compiled Python was 10% faster than the standard one at the University of Virginia
  • Porting between containers can be performed on a machine that supports multiple container systems (Docker, Singularity, Shifter)
  • Target hardware could be ThetaGPU (Argonne), Summit (ORNL), Pearl (RAL), AWS (supports Kubernetes), TACC/SDSC, and NERSC
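
The two artifacts agreed above could be recorded along the following lines. This is only a minimal sketch for discussion, not something the group adopted: the class names, field names, example.org URLs, system names, and version strings are all hypothetical placeholders.

```python
import json
from dataclasses import asdict, dataclass
from typing import Dict, List


@dataclass
class FairMetadata:
    """Artifact 1: FAIR metadata linking to results, data, and reference models."""
    benchmark: str
    performance_results: List[str]
    datasets: List[str]
    reference_models: List[str]


@dataclass
class ContainerInstance:
    """Artifact 2: one container instance built for one target system."""
    system: str                 # hypothetical system label
    container_system: str       # e.g. Docker, Singularity, Shifter
    frameworks: Dict[str, str]  # framework name -> version in this instance


def version_mismatches(containers: List[ContainerInstance]) -> List[str]:
    """Return the frameworks whose versions differ between instances."""
    seen: Dict[str, set] = {}
    for c in containers:
        for name, version in c.frameworks.items():
            seen.setdefault(name, set()).add(version)
    return sorted(name for name, versions in seen.items() if len(versions) > 1)


def build_record(meta: FairMetadata, containers: List[ContainerInstance]) -> dict:
    """Combine both artifacts into one JSON-serializable record."""
    return {
        "metadata": asdict(meta),
        "containers": [asdict(c) for c in containers],
        "version_mismatches": version_mismatches(containers),
    }


# All values below are placeholders for illustration only.
meta = FairMetadata(
    benchmark="example-benchmark",
    performance_results=["https://example.org/results/run-001"],
    datasets=["https://example.org/data/example-dataset"],
    reference_models=["https://example.org/models/reference-v1"],
)
containers = [
    ContainerInstance("system-a", "Singularity", {"pytorch": "1.10", "tensorflow": "2.7"}),
    ContainerInstance("system-b", "Shifter", {"pytorch": "1.9", "tensorflow": "2.7"}),
]
record = build_record(meta, containers)
print(json.dumps(record, indent=2))
```

Keeping the metadata separate from the per-system container instances mirrors the split discussed above: the metadata stays the same across machines, while the list of instances makes version inconsistencies between systems explicit.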