January 12, 2022
Present
Lauren Moos, Aristeidis Tsaris, Gregg Barrett, Gregor von Laszewski, Farzana Yasmin Ahmad, Arjun Shankar, Jeyan Thiyagalingam, Tony Hey, Geoffrey Fox, Hai Ah Nam, Junqi Yin, David Kanter, Juri Papay
Apologies: Murali Emani, Christine Kirkpatrick
Tentative Agenda
- New member introductions
- 15-minute talk from Lauren Moos: HILPAE: Towards a sample-efficient mechanism for effective modeling and control for particle accelerators.
- Adhering to MLCommons Rules – Suggested Approach: Jeyan, Geoffrey, and Tony
- Current status of benchmarks – any new results
- Science Benchmarks – Revision II – Jeyan
- Funding Science WG activities – Jeyan & Geoffrey
- AOB
New member introductions
- Farzana Yasmin Ahmad - PhD student from UVA
- Lauren Moos from Special Circumstances, who went on to present an SBIR project for DOE labs
Presentation
15-minute talk from Lauren Moos: HILPAE: Towards a sample-efficient mechanism for effective modeling and control for particle accelerators. The talk covered a system to control and optimize accelerator performance. Most of the discussion was on how to make this an attractive DOE proposal. It was noted that “human in the loop” runs counter to the goal of AI-based control.
Adhering to MLCommons Rules – Suggested Approach
This item was led by Jeyan, following pre-meeting conversations among Jeyan, Geoffrey, and Tony
- Juri and Gregg will be stepping in to form a small sub-group to work on this
- Tony Hey suggested contacting Steve Farrell (in Murali's absence)
- Arjun suggested that Aris help with this initiative
- Arjun suggested dividing this into two aspects: our own policy, with scientific accuracy as the focus, versus basic compliance (and identifying the minimal required points)
- It was not clear how existing frameworks (SciMLBench, MLCube) could be used as part of the benchmark
- Hai Ah Nam reminded us of the training rules, both as modified by HPC (https://github.com/mlcommons/training_policies/blob/master/hpc_training_rules.adoc) and in the original training version (https://github.com/mlcommons/training_policies/blob/master/training_rules.adoc), along with the tiers of open/closed rules. We should perhaps start with these and propose rules that cover both our own benchmarks (e.g., they must measure accuracy as well as performance) and submissions by others
- Christine Kirkpatrick (offline) noted that she would still be happy to brief the group on her observations about the MLCommons benchmarks from a data perspective and what we might learn from them for the Science benchmarks.
Current status of benchmarks – any new results
Juri shared some results for the STFC benchmarks (mlcommons_12_01_2022.pptx). These covered two benchmarks, Cloudmask and Optical Damage, and looked at GPU compute, memory, and power use under different scenarios for the two benchmarks (Cloudmask made poor use of the GPU).
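For context, a minimal sketch of how GPU compute, memory, and power figures of this kind can be sampled alongside a benchmark run, using NVIDIA's NVML bindings (pynvml). This is an illustrative assumption, not the instrumentation Juri actually used:

```python
# Sample GPU utilization, memory, and power once per second via NVML.
# Illustrative only; assumes one NVIDIA GPU and the pynvml package.
import time
import pynvml

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)  # first GPU

for _ in range(10):  # take ten one-second samples
    util = pynvml.nvmlDeviceGetUtilizationRates(handle)    # % compute / memory activity
    mem = pynvml.nvmlDeviceGetMemoryInfo(handle)            # bytes used / total
    power = pynvml.nvmlDeviceGetPowerUsage(handle) / 1000   # milliwatts -> watts
    print(f"gpu={util.gpu}% mem_util={util.memory}% "
          f"mem={mem.used / 2**30:.1f}/{mem.total / 2**30:.1f} GiB "
          f"power={power:.0f} W")
    time.sleep(1)

pynvml.nvmlShutdown()
```

Logging samples like these during training is one way to show the kind of low-GPU-utilization scenario reported for Cloudmask.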
Funding Science WG activities
- Jeyan, following a conversation with Geoffrey, discussed funding the work described above (and its follow-on)
- Tony mentioned that MLCommons membership costs £15,000 per laboratory per year
- The benefits of this have to be justified (voting rights and input on the expenditure of funds)
- Action: We need to verify the potential benefits with MLCommons.
Science Benchmarks – Revision II
- Jeyan discussed this briefly
- Aris discussed the HPC working group's plans to define the process for adding a new benchmark, and their expectation of adding another one to bring the total to four
- Deferred to the next meeting
- Tony Hey mentioned that we should be careful about committing to more benchmarks.
- Arjun agreed that we need to understand the process better before committing to more.