
January 24, 2024

Present

Geoffrey Fox, Feiyi Wang, Senjuti Dutta, Gregg Barrett, Wes Brewer, Xavier Coubez, Murali Emani, Tom Gibbs, Armstrong Foundjem, Piotr Luszczek (Otter AI not given admittance, as advised by MLCommons)

Apologies

Jeyan Thiyagalingam, Christine Kirkpatrick

Tentative Agenda

New Members

  • Senjuti Dutta studies HCI and Machine Learning at the University of Tennessee, Knoxville. The other members introduced themselves to her.
  • Xavier Coubez reminded us that his work spans particle physics and computer science (deep learning).

Discussion

  • Wes Brewer noted the 2024 Symposium on Scientific Foundation Models at the Michigan Institute for Computational Discovery and Engineering (MICDE), April 2–3, 2024, Ann Arbor, MI.
  • Wes discussed work on HPC AI workflows with Shantenu Jha, the identification of motifs (patterns of AI use on HPC), and the need for benchmarks for these motifs.
  • OSMI is one example of such a motif.
  • Dejan Milojicic of HP Labs is interested.
  • It was suggested to ask Shantenu Jha to give a talk.
  • Feiyi Wang noted that six months ago he purchased a smaller testbed with 8 NVIDIA H100 GPUs, each with 80 GB of memory. This can be used to study smaller foundation models and fine-tuning.
  • Prepare a one-page application for advanced computing, asserting open source, to NCCS (National Center for Computational Sciences), with funding for collaborators.
  • PyTorch FSDP, Megatron, and Microsoft DeepSpeed are available.
  • Frontier had two earlier machines used to prepare for it.
  • There is also a Graphcore AI hardware system.
  • NSF NAIRR: "Democratizing the future of AI R&D: NSF to launch National AI Research Resource pilot" will use Summit+. Note that this announcement of the NAIRR Pilot came out today, from NSF, DOE, and other partners.
  • Such developments suggest that compute resources will not be the limiting factor
  • Piotr wanted a broad spectrum of benchmarks based on modern applications, including graph-based foundation models, since not all applications look like LLMs or images.
  • Feiyi noted that model size depends on the discipline: LLMs reach a trillion parameters, geospatial models a few billion, and the NASA-IBM Prithvi-100M about 100 million.
  • In discussing the state of the art, it was noted that most people only discuss successes, so when a method stops being discussed it may be a silent failure!
  • Geoffrey mentioned CNNs vs. Vision Transformers and MAE vs. contrastive loss as examples of unclear method comparisons.
  • A list of foundation models is at the Science FM Hub.
  • LLMs teach us that generalization and memorization are different.
  • In conclusion, Gregg noted Open Science at the nexus of methods, data, and testbeds.
  • Xavier followed up after the meeting with:
      • The comparison between CNNs and Vision Transformers by DeepMind [1]. He had not yet checked for follow-up papers but will do so; he is not sure whether this has been peer-reviewed (perhaps it was submitted to a journal and is under review), as the preprint is rather recent.
      • Regarding the lack of rigorous assessment of what models have learned in the field of NLP, he was referring to this talk [2], part of a larger workshop on the representation of language in brains and machines [3]. The speaker holds an ERC Starting Grant to study the emergence of language through the ALiEN project [4].
  • [1] ConvNets Match Vision Transformers at Scale, https://arxiv.org/pdf/2310.16764.pdf
  • [2] Marco Baroni, "On the Proper Role of Linguistically-Oriented Deep Net Analysis in Linguistic Theorizing", Collège de France
  • [3] The Representation of Language in Brains and Machines, workshop, June 24–25, 2021, Collège de France
  • [4] ALiEN, EU-funded 5-year project: Autonomous Linguistic Emergence in Neural Networks
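The scale discussion above (trillion-parameter LLMs vs. a few billion for geospatial models vs. ~100M for Prithvi, against an 8× H100 80 GB testbed) can be made concrete with a back-of-envelope memory estimate. This is a rough sketch only: the bytes-per-parameter figures assume standard mixed-precision Adam training state (fp16 weights and gradients plus fp32 master weights and optimizer moments) and ignore activations; they are illustrative assumptions, not measurements.

```python
# Rough memory estimate for fitting foundation-model training state on the
# 8x H100 (80 GB each) testbed mentioned in the minutes. Illustrative only.

def train_memory_gb(params: float, bytes_per_param: int = 2,
                    optimizer_bytes_per_param: int = 12) -> float:
    """Approximate Adam training footprint in GB: fp16 weights + fp16 grads
    (2 * bytes_per_param) plus fp32 master weights and Adam moments
    (~12 bytes/param), ignoring activations."""
    return params * (2 * bytes_per_param + optimizer_bytes_per_param) / 1e9

TESTBED_GB = 8 * 80  # aggregate HBM on the 8x H100 testbed

# Parameter counts at the scales discussed in the meeting (approximate):
models = {
    "Prithvi-100M (geospatial)": 100e6,
    "Few-billion-parameter model": 7e9,
    "Trillion-parameter LLM": 1e12,
}

for name, n_params in models.items():
    need = train_memory_gb(n_params)
    verdict = "fits" if need <= TESTBED_GB else "does not fit"
    print(f"{name}: ~{need:,.1f} GB training state; {verdict} in {TESTBED_GB} GB")
```

Under these assumptions the 100M-parameter model needs only ~1.6 GB of training state, a few-billion-parameter model fits comfortably across the eight GPUs (which is where sharding frameworks like the PyTorch FSDP, Megatron, and DeepSpeed options noted above come in), and a trillion-parameter model is far beyond the testbed.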