December 2, 2025 (9:05 pm ET for Asia-USA)

Present

Dora Cheng, Gary Mazzaferro, Geoffrey Fox, Gregor von Laszewski, Hao Lu, Michael A., Piotr Luszczek, Satoshi Iwata, Victor Lu

Google Meet Notes

Note: this used the “make notes longer” option
MLC Science WG - 2025/12/03 01:59 GMT - Notes by Gemini
The meeting covered several key topics:

Benchmark Paper Update

The primary benchmark paper is "in principle finished" and has been refined using feedback from a Claude-generated report.
It is currently awaiting formal approval from Fermilab, which is required before publication on arXiv to ensure compliance with export control rules.
Gary Mazzaferro and new member Hao Lu offered to conduct a final review of the manuscript, with Gary aiming to complete his review by Monday.
The paper establishes a formal definition (a superset) for benchmarking that captures complex mixed AI and High-Performance Computing (HPC) tasks.
The group agreed that the current paper is the final version and any further work will be a follow-up paper, such as one focusing on database aspects led by Victor.
Time Series and LLMs Discussion
The discussion centered on time series systems like Salesforce's Merlion (which includes models like Moirai and ETSformer), TimeGPT (LLM-based, part of the NeuralForecast portal), and Amazon's Chronos (LLM architecture trained on numerical data).
Geoffrey Fox and Gregor von Laszewski agreed that LLMs generally do not predict time series well because they learn language structure rather than time structure.
Gary Mazzaferro is looking at time series neural networks for a cholera prediction application for sub-Saharan Africa, noting the complexity of combining time series with geospatial factors like waterway flows, animal migration, and weather conditions.
New Members and Introductions
Geoffrey Fox welcomed three new members: Hao Lu from Salesforce (Responsible AI team), Dora Cheng from Supermicro, and Michael A. from the state government of New York.
SNIA Podcast and Hardware Issues
Gary Mazzaferro is seeking a second panel guest for a SNIA web podcast on data management for HPC and AI applications.
Geoffrey Fox offered to forward the contact information of his "best student" who is working on data movement for LLMs in both training and inference.
Gary Mazzaferro highlighted an industry issue: low GPU memory utilization (around 60%) in current hardware architectures, and the need to motivate the PCI-SIG to add cache coherency signals to the PCI bus to address it.
Other Benchmarking Efforts
Gregor von Laszewski reported on a practical performance benchmark on YOLO/DarkNet with a Florida group, using the group's carpentry paper to formalize the benchmark and creating an MLPerf-compatible logging infrastructure. He plans to present on this work in the second week of January.
The next MLC Science WG meeting is scheduled for next Wednesday at 11:05 Eastern.

New Members

Hao Lu https://www.linkedin.com/in/lu-h/ works in Responsible AI at Salesforce in Bellevue, Washington. He is a Machine Learning Engineer developing models for automated content moderation of millions of TikTok videos uploaded daily.
Dora Cheng https://www.linkedin.com/in/dora-cheng-2675a3174/ works for Supermicro in New Taipei City, Taiwan.
Michael A. https://www.linkedin.com/in/dsbrain/ graduated from the University of Virginia Data Science program and now works with the state government of New York.

Discussion

The Carpentry paper https://www.overleaf.com/read/xrysvzdnyjgt#a2ff11
The Time Series Summary: Time Series Resources Summary Table