May 31, 2023
Present
Gregg Barrett, Geoffrey Fox, Juri Papay, Wesley Brewer, Gregor von Laszewski, Piotr Luszczek, Christine Kirkpatrick, Aristeidis Tsaris, Tom Gibbs
Tentative Agenda
- Any new members
- Mailing Lists for the announcement of benchmarks
- Status of new Benchmarks
- Using Benchmarking Data to Inform Decisions Related to Machine Learning Resource Efficiency (Continued) mlcommons_data_energy_usage_paper
- Benchmark Carpentry benchmark-carpentry
- AI Readiness of MLCommons Science (Continued) MLCommons Science FAIR Concept Paper
- AOB
We’re Live
- We discussed distribution of announcements
- Christine is submitting the story about the benchmarks here to NCSA ACCESS: Share Your News - Access
- SDSC MLCommons Science Working Group Invites Researchers to Run New Benchmarks
- This was also pushed by SDSC to HPCwire MLCommons Science Working Group Invites Researchers to Run New Benchmarks
Progress in UK
- Juri discussed a new book he is proposing on benchmarks with a human focus
- There was a recent benchmarking meeting in Leicester
- SciML at RAL released a new benchmark set
- Will interview contributors to the field
Foundation Models
- Geoffrey suggested generalizing benchmarks to Foundation models, as they have some similarities as representative exemplars for a field
- His ideas are sketched in FoundationModelsPatternsBenchmarks.pdf
- Can we make this a goal of the group? Note that our second NSF proposal was turned down. Geoffrey thinks a Foundation model focus would make it easier to write an exciting proposal.
- Networks such as UNet, LSTM, and Transformers are applicable across many fields. At the simplest, we could maintain a library of patterns that are tweaked and composed across many fields.
- Tom commented that we need to replace Linpack, but it is a single example. The situation is much worse for partial differential equation solvers, for which there is no universal (Foundation) model and which are very painful to support.
- Piotr said one could average over a set of codes
- Tom noted that AI is not well represented in the current DOE portfolio
- Tom is working with Bill Tang of the Princeton Plasma Physics Laboratory on a Fusion Digital Twin.
- He also noted that genomics analysis can start from a language-trained GPT-3; the only problem is the input sequence length, which can be 64,000 for genomics (there was a talk from Argonne at ISC on this)
- It was agreed that Foundation models could be an attractive funding focus
- Foundation models for subdomains were mentioned
- Action: discuss these ideas with Mike Norman, Dan Stanzione, and Manish Parashar.
- It was noted that Mike Norman has a new Cosmology code that could be a benchmark
Education and Workforce Development
- Christine discussed a recent NSF Dear Colleague letter: Request for Information on the Capacity of Institutions of Higher Education to Produce Graduates with Degrees, Certifications, and Relevant Skills Related to Artificial Intelligence (NSF 23-099)
- She suggested that we respond to this letter