SPIQA (Scientific Paper Image Question Answering)

Date: 2024-07-12

Name: SPIQA (Scientific Paper Image Question Answering)

Domain: Computational Science & AI

Focus: Multimodal QA on scientific figures

Task Types: Question answering, Multimodal QA, Chain-of-Thought evaluation

Metrics: Accuracy, F1 score

Models: Chain-of-Thought models, Multimodal QA systems

AI/ML Motif: Multimodal Reasoning
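
The entry lists Accuracy and F1 score as metrics but does not say how they are computed. As a minimal sketch, assuming exact-match accuracy and the token-overlap F1 commonly used for free-form QA answers (both are assumptions here, not the benchmark's confirmed scoring procedure), the two could be computed as follows. The function names `normalize`, `exact_match_accuracy`, and `token_f1` are hypothetical.

```python
from collections import Counter


def normalize(text: str) -> list[str]:
    """Lowercase and split on whitespace (a deliberately simple normalization)."""
    return text.lower().split()


def exact_match_accuracy(predictions: list[str], references: list[str]) -> float:
    """Fraction of predictions whose normalized tokens equal the reference's."""
    if not references:
        return 0.0
    hits = sum(normalize(p) == normalize(r) for p, r in zip(predictions, references))
    return hits / len(references)


def token_f1(prediction: str, reference: str) -> float:
    """Harmonic mean of token-level precision and recall (assumed convention)."""
    pred, ref = normalize(prediction), normalize(reference)
    if not pred or not ref:
        # Both empty counts as a match; exactly one empty counts as a miss.
        return float(pred == ref)
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)
```

For example, `token_f1("the loss decreases", "loss decreases")` has precision 2/3 and recall 1, giving an F1 of 0.8, while exact-match accuracy would score that pair as a miss.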

Resources

Benchmark: Visit
Datasets: Hugging Face

Citation

  • Shraman Pramanick, Rama Chellappa, and Subhashini Venugopalan. SPIQA: a dataset for multimodal question answering on scientific papers. 2024. URL: https://arxiv.org/abs/2407.09413.
@misc{pramanick2024spiqa,
  title={SPIQA: A Dataset for Multimodal Question Answering on Scientific Papers},
  author={Pramanick, Shraman and Chellappa, Rama and Venugopalan, Subhashini},
  year={2024},
  url={https://arxiv.org/abs/2407.09413}
}

Ratings

| Category | Rating | Reason |
| --- | --- | --- |
| Software | 0.00 | Not provided |
| Specification | 5.00 | Task administration clearly defined; prompt instructions explicitly given, no ambiguity in format or scope. |
| Dataset | 5.00 | Dataset is available (via paper/appendix), includes train/test/valid split. FAIR-compliant with minor gaps in versioning or access standardization. |
| Metrics | 5.00 | Uses quantitative metrics (Accuracy, F1) aligned with the task |
| Reference Solution | 2.00 | Multiple model results (e.g., GPT-4V, Gemini) reported; baselines exist, but full runnable code not confirmed for all. |
| Documentation | 5.00 | All information provided in paper |

Average rating: 3.67/5
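
The reported average is consistent with an unweighted mean of the six category ratings. The entry does not state the averaging rule, so the unweighted mean is an assumption; the short check below reproduces the arithmetic.

```python
# Category ratings copied from the Ratings section of this entry.
ratings = {
    "Software": 0.00,
    "Specification": 5.00,
    "Dataset": 5.00,
    "Metrics": 5.00,
    "Reference Solution": 2.00,
    "Documentation": 5.00,
}

# Assumed rule: a plain (unweighted) mean over all six categories.
average = sum(ratings.values()) / len(ratings)
formatted = f"{average:.2f}/5"
print(formatted)  # 22 / 6 rounds to 3.67
```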
