ARC-Challenge (AI2 Reasoning Challenge)

Date: 2018-03-14

Name: ARC-Challenge (AI2 Reasoning Challenge)

Domain: Computational Science & AI

Focus: Grade-school science with reasoning emphasis

Task Types: Multiple choice

Metrics: Accuracy

Models: GPT-4, Claude

AI/ML Motif: Reasoning & Generalization
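
The Task Types and Metrics fields above come down to scoring multiple-choice answers by accuracy. A minimal sketch of that computation follows, assuming each prediction and each answer key is a single option label such as "A" or "3". The function name `accuracy` and the sample labels are hypothetical and are not taken from the benchmark's own evaluation code.

```python
from typing import Sequence


def accuracy(predictions: Sequence[str], answer_keys: Sequence[str]) -> float:
    """Return the fraction of questions whose predicted label matches the key."""
    if len(predictions) != len(answer_keys):
        raise ValueError("predictions and answer_keys must have the same length")
    if not answer_keys:
        raise ValueError("cannot compute accuracy over zero questions")
    # Compare labels case-insensitively and ignore surrounding whitespace.
    correct = sum(
        p.strip().upper() == k.strip().upper()
        for p, k in zip(predictions, answer_keys)
    )
    return correct / len(answer_keys)


# Hypothetical labels for four questions (illustration only, not real data).
predictions = ["A", "c", "B", "D"]
answer_keys = ["A", "C", "D", "D"]
print(f"{accuracy(predictions, answer_keys):.2f}")  # prints 0.75
```

Normalizing case and whitespace before comparing is a design choice of this sketch, so that a model answering "c" is not penalized for formatting alone.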

Resources

Benchmark: Visit
Datasets: Hugging Face
Results: ARC-Solvers

Citation

  • Peter Clark, Isaac Cowhey, Oren Etzioni, Tushar Khot, Ashish Sabharwal, Carissa Schoenick, and Oyvind Tafjord. Think you have Solved Question Answering? Try ARC, the AI2 Reasoning Challenge. arXiv:1803.05457v1, 2018. doi:10.48550/arXiv.1803.05457.
@article{allenai:arc,
  author    = {Peter Clark  and Isaac Cowhey and Oren Etzioni and Tushar Khot and
                Ashish Sabharwal and Carissa Schoenick and Oyvind Tafjord},
  title     = {Think you have Solved Question Answering? Try ARC, the AI2 Reasoning Challenge},
  journal   = {arXiv:1803.05457v1},
  year      = {2018},
  doi       = {10.48550/arXiv.1803.05457}
}

Ratings

Category: Rating
Software: 5.00. Code is available and well documented for evaluation.
Specification: 4.00. Task is clear, and inputs/outputs are provided along with their format on the dataset card.
Dataset: 5.00. Data is accessible, with instructions for downloading it via CLI tools. Splits are provided on Hugging Face.
Metrics: 5.00. All questions in the dataset are multiple choice, and each has a correct answer.
Reference Solution: 5.00. Reference solution is available and containerized.
Documentation: 5.00. The paper explains all necessary information.
Average rating: 4.83/5
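
The average above is consistent with an unweighted mean of the six category ratings. The short check below assumes equal weighting, which the entry does not state explicitly. The dictionary name `ratings` is hypothetical.

```python
# Category ratings as listed in this entry.
ratings = {
    "Software": 5.00,
    "Specification": 4.00,
    "Dataset": 5.00,
    "Metrics": 5.00,
    "Reference Solution": 5.00,
    "Documentation": 5.00,
}

# Assumption: every category carries the same weight.
average = sum(ratings.values()) / len(ratings)
print(f"{average:.2f}/5")  # prints 4.83/5
```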

Radar plot

[Figure: ARC-Challenge (AI2 Reasoning Challenge) radar plot]
