VisCoR-55K Dataset

Provides ~55K multimodal VQA items with matched contrastive pairs and model‑generated rationales across five categories (General, Reasoning, Math, Graph/Chart, OCR), enabling research on faithful visual reasoning and robustness. Train split: 54,844 examples; license unspecified—verify before use.

Introduction

Faithful visual reasoning requires not only correct answers but also evidence that a model’s chain of thought is grounded in the visual input. VisCoR‑55K supplies VQA examples, each with contrastive counterparts and synthesized rationales, so researchers can train vision‑language models and evaluate whether they reason for the right reasons rather than exploiting spurious cues.

What Sets It Apart
  • Contrastive counterparts: Each VQA sample is paired with carefully constructed contrastive examples designed to expose superficial shortcuts and probe model sensitivity to small, validity‑changing perturbations—useful for robustness and attribution studies.
  • Generated rationales (VC‑STaR): High‑quality model‑synthesized rationales accompany answers, enabling research on explanation alignment and rationale‑guided fine‑tuning without requiring fully manual rationale annotation.
  • Broad coverage & practical format: ~54.8K train examples across five categories (General, Reasoning, Math, Graph/Chart, OCR), stored in Parquet with image+text modalities for easy pipeline integration and batched processing.
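
The contrastive pairing described above suggests a stricter metric than plain accuracy: count a pair as correct only when the model answers both the original sample and its counterpart correctly. The Python sketch below illustrates that idea on in-memory records. The `Prediction` class, its `pair_id`, `gold`, and `predicted` fields, and the exact-match comparison are all assumptions made for illustration; they are not the dataset's actual schema or an official evaluation protocol.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    pair_id: str    # hypothetical key linking a sample to its contrastive counterpart
    gold: str       # reference answer
    predicted: str  # model answer


def _correct(p: Prediction) -> bool:
    # Assumed scoring rule: case-insensitive exact match after trimming whitespace.
    return p.predicted.strip().lower() == p.gold.strip().lower()


def accuracy(preds: list[Prediction]) -> float:
    """Fraction of individual items answered correctly."""
    if not preds:
        raise ValueError("preds must not be empty")
    return sum(_correct(p) for p in preds) / len(preds)


def pair_accuracy(preds: list[Prediction]) -> float:
    """Fraction of pairs in which every member is answered correctly."""
    if not preds:
        raise ValueError("preds must not be empty")
    pairs: dict[str, list[bool]] = {}
    for p in preds:
        pairs.setdefault(p.pair_id, []).append(_correct(p))
    return sum(all(flags) for flags in pairs.values()) / len(pairs)


# Toy data: the model repeats "3" for both members of pair "a".
preds = [
    Prediction("a", "3", "3"),
    Prediction("a", "4", "3"),
    Prediction("b", "cat", "Cat"),
    Prediction("b", "dog", "dog"),
]

print(accuracy(preds))       # 0.75
print(pair_accuracy(preds))  # 0.5
```

On the toy predictions, per-item accuracy is 0.75 while pair accuracy is 0.5, because the model gives the same answer to both members of pair `a`. A gap like this is one possible signal that a model is relying on shortcuts rather than the image content.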

Who It's For

Great fit if you are a researcher or engineer building or evaluating vision‑language models that need finer‑grained checks for reasoning faithfulness, counterfactual robustness, or explainability metrics. It’s especially useful for experiments that compare answer accuracy against explanation alignment or that fine‑tune models with rationale supervision.

Look elsewhere if you require fully human‑verified rationales or a dataset with an explicit, permissive license. VisCoR‑55K’s generated rationales may propagate model biases, and the Hugging Face card lists no license, so legal or production use requires additional clearance.

Where It Fits

VisCoR‑55K complements canonical VQA benchmarks (e.g., VQA, GQA, TextVQA) by emphasizing contrastive evaluation and rationale generation rather than raw answer scale. Use it to stress‑test whether improvements in accuracy also improve explanation fidelity, or to bootstrap rationale‑based supervision before collecting human rationales.

Information

  • Website: huggingface.co
  • Authors: 5551z, Zhiyu Pan, Yizheng Wu, Jiasheng Hua, Junyi Feng, Shaotian Yan, Bing Deng, Zhiguo Cao, Jieping Ye
  • Published date: 2026/04/24

Categories