Index of Benchmarks

CURIE (Scientific Long-Context Understanding, Reasoning and Information Extraction)

Materials Science, High Energy Physics, Biology & Medicine, Chemistry, Climate & Earth Science • Accuracy

Avg rating: 3.33/5

FEABench (Finite Element Analysis Benchmark): Evaluating Language Models on Multiphysics Reasoning Ability

Mathematics • Solve time, Error norm

Avg rating: 3.83/5
