ARC-Challenge (AI2 Reasoning Challenge)

Date: 2018-03-14

Name: ARC-Challenge (AI2 Reasoning Challenge)

Domain: Computational Science & AI

Focus: Grade-school science with reasoning emphasis

Task Types: Multiple choice

Metrics: Accuracy

Models: GPT-4, Claude

AI/ML Motif: Reasoning & Generalization
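
Because every item is multiple choice and the metric is accuracy, scoring reduces to comparing a predicted answer label with the gold label. The following is a minimal sketch, not the benchmark's official evaluation code. The record layout (`question`, `choices` with parallel `text` and `label` lists, `answerKey`) is assumed to follow the Hugging Face dataset card; the sample record, `format_prompt`, and `accuracy` are hypothetical names and data introduced only for illustration.

```python
from typing import Dict, List

# Hypothetical record, assumed to mirror the dataset card layout.
# It is not taken from the actual dataset.
SAMPLE = {
    "question": "Which object is attracted to a magnet?",
    "choices": {
        "text": ["a wooden spoon", "an iron nail", "a glass cup", "a rubber band"],
        "label": ["A", "B", "C", "D"],
    },
    "answerKey": "B",
}


def format_prompt(record: Dict) -> str:
    """Render one record as a plain-text multiple-choice prompt."""
    lines = [record["question"]]
    for label, text in zip(record["choices"]["label"], record["choices"]["text"]):
        lines.append(f"{label}. {text}")
    lines.append("Answer:")
    return "\n".join(lines)


def accuracy(predictions: List[str], gold: List[str]) -> float:
    """Fraction of predicted labels that match the gold labels (case-insensitive)."""
    if len(predictions) != len(gold):
        raise ValueError("predictions and gold must have the same length")
    if not gold:
        return 0.0
    correct = sum(
        p.strip().upper() == g.strip().upper() for p, g in zip(predictions, gold)
    )
    return correct / len(gold)
```

A model's raw output would still need to be mapped to a single label before calling `accuracy`; how that mapping is done is left open here.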

Resources

Benchmark: Visit

Datasets: Hugging Face

Results: ARC-Solvers

Keywords

grade-school, science QA, challenge set, reasoning

Citation

Peter Clark, Isaac Cowhey, Oren Etzioni, Tushar Khot, Ashish Sabharwal, Carissa Schoenick, and Oyvind Tafjord. Think You Have Solved Question Answering? Try ARC, the AI2 Reasoning Challenge. arXiv:1803.05457v1, 2018. doi:10.48550/arXiv.1803.05457.

@article{allenai:arc,
  author    = {Peter Clark  and Isaac Cowhey and Oren Etzioni and Tushar Khot and
                Ashish Sabharwal and Carissa Schoenick and Oyvind Tafjord},
  title     = {Think you have Solved Question Answering? Try ARC, the AI2 Reasoning Challenge},
  journal   = {arXiv:1803.05457v1},
  year      = {2018},
  doi       = {10.48550/arXiv.1803.05457}
}

Ratings

Category: Rating (out of 5)

Software: 5.00
Code is available and well documented for evaluation.

Specification: 4.00
The task is clear, and the inputs, outputs, and their format are given on the dataset card.

Dataset: 5.00
Data is accessible, with instructions for downloading it via CLI tools. Splits are provided on Hugging Face.

Metrics: 5.00
All questions in the dataset are multiple choice, and each has a correct answer.

Reference Solution: 5.00
A reference solution is available and containerized.

Documentation: 5.00
The paper explains all necessary information.

Average rating: 4.83/5
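
The average is the unweighted mean of the six category ratings listed above. As a quick arithmetic check, the sketch below recomputes it; the dictionary and variable names are illustrative only.

```python
# Category ratings as listed in this entry.
ratings = {
    "Software": 5.00,
    "Specification": 4.00,
    "Dataset": 5.00,
    "Metrics": 5.00,
    "Reference Solution": 5.00,
    "Documentation": 5.00,
}

# Unweighted mean: 29 / 6 = 4.8333..., shown to two decimals.
average = sum(ratings.values()) / len(ratings)
summary = f"Average rating: {average:.2f}/5"
print(summary)  # Average rating: 4.83/5
```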

Radar plot: ARC-Challenge (AI2 Reasoning Challenge) ratings radar (figure not shown)