June 12, 2024
Present
Geoffrey Fox, Juri Papay, Gregor von Laszewski, Gregg Barrett, Christine Kirkpatrick, Wes Brewer, Piotr Luszczek, Sujata Goswami, Victor Lu, Steve Farrell, Hector Hernandez Corzo, Javier Toledo, Sharma Lee, Shreeya Singh Dhakal
Apologies
Tom Gibbs, Jeyan Thiyagalingam
Tentative Agenda
- Any New Members Introduction
- Status of Papers
- Status of Benchmarks
- Science Foundation Models
- Any Other Business
New Members
- Sujata Goswami is at Oak Ridge National Laboratory https://www.ornl.gov/staff-profile/sujata-goswami https://www.linkedin.com/in/sujata-goswami/ See May 15 Introduction. Works on datasets, automatic provenance, and metadata anomalies.
- Javier Quetzalcoatl Toledo-Marin, Quantum Machine Learning Research Associate at TRIUMF (Canada's Particle Accelerator Centre), will consult on AI/ML development. He has experience in developing AI/ML surrogates for diffusion equations in multicellular models. He is currently developing a generative AI variational auto-encoder using quantum computer acceleration for the Kaggle calorimeter challenge.
- Sharma Lee is at Naval Research Lab, Laboratory for Computational Physics & Fluid Dynamics with benchmarking, HPC systems and computational fluid dynamics expertise.
- Shreeya Singh Dhakal is an Applied Scientist at DocuSign in Seattle and founder of Nepali Women in Computing. Degree from North Carolina State University. https://www.linkedin.com/in/shreeyya/
Benchmarks
- Geoffrey gave an update on the Earthquake benchmark (Time Series_for_Earthquake_Nowcasting.pdf), in which many (17) different time series models are compared for a variant of the MLCommons Science benchmark. We will try to add RWKV-TS, which Hernandez discussed in the talk at the last meeting.
- Wes noted "A 4D Hybrid Algorithm to Scale Parallel Training to Thousands of GPUs" on large-scale training.
- The best way to operate large-scale facilities is covered by the EE HPC WG: ODA Team; Gregg noted their operational data analytics (ODA) team.
Any Other Business
- Christine will look at paper in Overleaf and add paper
- We discussed including results from the digital twin for the Frontier replacement ExaDigiT at Oak Ridge, which has access to telemetry and an AR interface.
- There is an SC24 paper with AR interface and a study of energ
- Juri and Geoffrey talked to David Kanter the following day Jun 13, 2024. We described the value of the working group and asked about interactions with the AI Alliance
- Gregor and Wes discussed the OSMI CFD benchmark and the tools needed to run this benchmark
- https://www.linkedin.com/posts/arjunsuresh_github-mlcommonsck-mlcommons-cm-is-a-activity-6996949976035516416-rDdX/ describes the CM framework, with one of its architects, Arjun Suresh. Grigori Fursin has worked a lot on this and on inference tasks.
- HPE’s SmartSim was discussed; HPE software has been released with better NVIDIA drivers
- Gregor’s Cloudmesh system
- Need to loop over values of hyperparameters and configurations
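
The loop over hyperparameter values and configurations noted above could look roughly like the sketch below. It is a minimal illustration only, not the Cloudmesh, CM, or SmartSim interface: the parameter names, their values, and the run_benchmark placeholder are all hypothetical, and a real run would replace the placeholder with the actual launch command for the benchmark.

```python
from itertools import product

# Hypothetical search space; names and values are illustrative assumptions.
hyperparameters = {
    "learning_rate": [1e-3, 1e-4],
    "batch_size": [32, 64],
}
configurations = {
    "device": ["cpu", "gpu"],
}


def expand_grid(*spaces):
    """Return one dict per combination of values across all given spaces."""
    merged = {}
    for space in spaces:
        merged.update(space)
    keys = list(merged)
    value_lists = [merged[key] for key in keys]
    return [dict(zip(keys, values)) for values in product(*value_lists)]


def run_benchmark(params):
    """Placeholder for launching one benchmark run; it only echoes its input."""
    return {"params": params, "status": "not run"}


# One entry per combination: 2 learning rates x 2 batch sizes x 2 devices.
runs = expand_grid(hyperparameters, configurations)
results = [run_benchmark(params) for params in runs]

for result in results:
    print(result["params"])
```

Building the full list of combinations first, rather than nesting loops, keeps the set of planned runs inspectable before anything is launched and makes it easy to add another parameter.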