August 9, 2023

Present

Geoffrey Fox, Piotr Luszczek, Juri Papay, Feiyi Wang, Wes Brewer, Gregor von Laszewski, Christine Kirkpatrick, Mallikarjun Shankar, Gregg Barrett, Yuhan Rao

Tentative Agenda

Foundation Models

  • The meeting largely discussed this topic
  • Geoffrey Fox updated his previous presentation with information from the TPC Meeting August 2-3 at Argonne https://docs.google.com/presentation/d/1WdWaFyZ6JplDvXeV7aXIirUmUEZ4BPWR3_YTDfn9acg/edit?usp=sharing
  • Feiyi Wang noted that Oak Ridge had trained a Science LLM on 200 million science papers.
  • This used Frontier, while Argonne will use Polaris followed by Aurora
  • Yuhan Rao asked whether they had looked at Semantic Scholar, which hosts its corpus of published literature on AWS S3 - https://www.semanticscholar.org/
  • Oak Ridge is a partner in TPC
  • We discussed the need for descriptors of datasets to inform the foundation model pipeline
  • Yuhan (Douglas) Rao noted the relevance of Open Data Cube https://www.opendatacube.org/ as a scientific data API
  • We noted that foundation models allow a new perspective on Long Tail versus Big Science. A foundation model would be a resource where long-tail data could be accumulated with great value.
  • We discussed a mixture of experts, as in LLMs and as needed for different modalities of science data
  • Geoffrey had talked to Murali Emani at the Argonne TPC meeting, and Murali thought that the HPC group would be interested in TPC/Foundation models.

Other Items