April 5, 2023
Present
Gregg Barrett, Geoffrey Fox, Juri Papay, Mallikarjun Shankar, Wesley Brewer, Aristeidis Tsaris, Tom Gibbs, Gregor von Laszewski, Piotr Luszczek, Christine Kirkpatrick
Apologies
Christian Herwig
Tentative Agenda
- Any new members: none
- Positioning of Science Benchmarks in MLCommons Update Web and Policy documents
- AI Readiness of MLCommons Science (Continued) https://docs.google.com/document/d/1NbL-VdkrY9jzPxveOys2RCK8TdEJ7O5wgnxjAgzK-rE/edit?usp=sharing NOT DISCUSSED
- Using Benchmarking Data to Inform Decisions Related to Machine Learning Resource Efficiency (Continued) https://docs.google.com/document/d/1gOKA8BnlJnsTAELWFSmL7Fl7kJej_UrNH-FVXbZFxGI/edit?usp=sharing
- Discussion of new Benchmarks (Continued) NOT DISCUSSED
- AOB
Next Meeting
The next meeting will be May 3, as the regularly scheduled date collides with the community meeting.
Positioning of Science Benchmarks in MLCommons
Juri and Gregor are working on improving our web resources to reflect the open-division-only plan. We discussed this further with David Kanter on April 7, and we hope to complete the document revision before the Community quarterly meeting on April 20.
Use of Benchmarks, Education
- Much of this week's discussion revolved around the use of benchmarks, especially by inexperienced students. This led to a discussion of the value of benchmarks in education.
- Gregor discussed work with Cloudmask and Cosmoflow (HPC working group)
- Many students are unprepared for practical issues, partly a consequence of courses that stress high-level use of Python. Jupyter notebooks are very powerful but encourage poor programming practices. Gregg noted that we need a course on proper Python. Some students have no idea about Unix file systems.
- The difference between HPC and commercial setups is non-trivial, and the professionals who typically respond to MLPerf challenges are at a different expertise level from many students. Aris agreed with this sentiment.
- Nevertheless, MLCommons benchmarks, which combine state-of-the-art software and algorithms, are attractive to the many students getting started in AI.
- Tom Gibbs noted NVIDIA Modulus https://developer.nvidia.com/modulus, featuring quality exemplars of the popular physics-informed neural networks. Like MLCommons' examples, they can be adapted from one application domain to another.
- Difficulties with the transfer of large datasets were discussed. Globus was considered the most reliable solution, though it had problems in some implementations; no better solution was mentioned.
- Arjun asked about the role of MLCube and the Best Practices working group; MLCube has been very successful for MedPerf https://www.medperf.org/
- Gregg mused that this confirms that the obstacles to AI in science are numerous and nontrivial!
- Wes recalled similar discussions from the era of component-based development and service-oriented architecture: building systems without foundational engineering.
- Wes noted the course https://missing.csail.mit.edu/ addressing some of these points.
- The concept of Benchmark Carpentry was proposed as important by Gregor and Christine
- Foundational courseware that covers some of these gaps: the "carpentry" model.
- Tom's point on libraries: the NVIDIA / Hugging Face approach.
- Cover Globus, timing, and profiling.
- See benchmark-carpentry
- TinyML courses with EdX and Coursera
- Ed Seidel's Wyoming initiatives for broad skills were noted.
- Some of these points can go in one of our papers
- Perhaps a Cybertraining proposal could be prepared
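The timing and profiling topics proposed above for Benchmark Carpentry can be illustrated with a short, self-contained Python sketch using only the standard library. The `workload` function here is a hypothetical stand-in for a benchmark step, not part of any MLCommons benchmark.

```python
# Minimal sketch of wall-clock timing and function-level profiling,
# the kind of material a "Benchmark Carpentry" module might cover.
# Uses only the Python standard library; `workload` is hypothetical.
import cProfile
import pstats
import time


def workload(n=200_000):
    """Stand-in for a benchmark step (hypothetical)."""
    return sum(i * i for i in range(n))


# Wall-clock timing with a monotonic, high-resolution clock.
start = time.perf_counter()
result = workload()
elapsed = time.perf_counter() - start
print(f"workload() took {elapsed:.4f} s")

# Function-level profiling to see where the time goes.
profiler = cProfile.Profile()
profiler.enable()
workload()
profiler.disable()
pstats.Stats(profiler).sort_stats("cumulative").print_stats(3)
```

`time.perf_counter()` is preferred over `time.time()` for intervals because it is monotonic; for real benchmarks, students would also repeat runs and report statistics rather than a single measurement.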
Using Benchmarking Data to Inform Decisions Related to Machine Learning Resource Efficiency
- https://docs.google.com/document/d/1gOKA8BnlJnsTAELWFSmL7Fl7kJej_UrNH-FVXbZFxGI/edit?usp=sharing
- Focus on this before the next meeting, as it is nearer completion.