Abdulkareem Alsudais, Armstrong Foundjem, Christine Kirkpatrick, Gary Mazzaferro, Geoffrey Fox, Gregg Barrett, Gregor von Laszewski, Juri Papay, Iris Johnson, Iulia Ibanescu, Javier Toledo, Marco Colombo, Piotr Luszczek, Victor Lu, Wes Brewer
Paper Submission Status: Gregor von Laszewski reported one remaining LaTeX error (Unicode character U+0301, the combining acute accent) and three minor corrections needed from Armstrong Foundjem. Submission to arXiv is anticipated today once these are resolved.
Publication Discussion: The group decided a journal, possibly the Benchmarking Journal or High Performance Computing Journal, would be better than a conference due to the paper's length.
Future Focus ("Agentic AI for Science"): Geoffrey Fox, Armstrong Foundjem, and Gary Mazzaferro agreed that agentic AI for science should be the future focus for the FOSI group, with Geoffrey Fox planning to prepare materials. The topic was also connected to the CDMI standard and DOE efforts (Genesis, ModCon, AmSC).
Benchmarking & Challenges: Wes Brewer mentioned the recent SIM AI bench work using AI agents. Geoffrey Fox shared a surprising observation that AI has not yet made a significant contribution to QCD (quantum chromodynamics), suggesting that finding areas without AI improvements is valuable for benchmarking. The group also discussed the need to distinguish between LLMs and other machine learning technologies in benchmarking.
Agentic AI Details: The discussion covered the need for specialized vs. general-purpose agentic AIs and the concept of benchmarking agentic AI using digital twins with explicit explainability.
Next Steps:
Armstrong Foundjem will correct the three minor issues in the paper.
Geoffrey Fox will prepare materials on "agentic AI for science" and assist with future plans after Armstrong Foundjem shares a paper on the topic.
Wes Brewer agreed to ask Sumiandu Sarcar to present on DOE AI efforts and benchmarking challenges.
Gregor von Laszewski will notify the group once the paper is submitted to arXiv.
Baseline AI Capabilities (BASE) develops shared AI tools: multimodal reasoning front ends, agent-based data pipelines, evaluation harnesses, self-improving frameworks, safety/security protocols, and a jointly developed (with AmSC) core agentic framework to orchestrate workflows across HPC and cloud platforms. Together, these reduce duplication and accelerate MTs' adoption of AmSC services and APIs while prioritizing open source.