Why This Matters
Reproducible ASR evaluation depends on stable, auditable test sets and safe data formats. This Hugging Face dataset packages the Open ASR Leaderboard's ESB test sets into Parquet (avoiding unsafe remote-loading scripts) and sorts splits by audio length so evaluations are deterministic and easier to batch or profile.
What Sets It Apart
- Safe, portable format: converting each test split to Parquet removes the need to run untrusted dataset-loading code on your machine, so CI systems and shared compute can ingest the files reliably.
- Length-sorted splits: samples are sorted by audio_length_s (typically descending), which simplifies batching strategies and makes long-form error analysis and stress-testing more reproducible across labs.
- Leaderboard-ready structure: includes the standard ESB test splits (LibriSpeech, Common Voice, VoxPopuli, TED-LIUM, GigaSpeech, SPGISpeech, Earnings-22, AMI) and points users to the ESB leaderboard workflow for submitting model predictions, reducing friction for fair comparisons.
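To illustrate why length-sorted splits simplify batching, here is a minimal sketch, assuming rows arrive in descending `audio_length_s` order as described above. The function name `batch_by_padded_duration`, the `max_padded_seconds` budget, and the sample rows are hypothetical and not part of the dataset or the leaderboard tooling; in practice the rows would come from the Parquet files rather than an in-memory list.

```python
from typing import Dict, Iterator, List


def batch_by_padded_duration(
    rows: List[Dict], max_padded_seconds: float
) -> Iterator[List[Dict]]:
    """Greedily group consecutive rows under a padded-duration budget.

    Assumes rows are sorted by audio_length_s in descending order, so the
    first row of each batch is its longest and sets the padded length.
    A single row longer than the budget still gets its own batch.
    """
    batch: List[Dict] = []
    for row in rows:
        if batch:
            longest = batch[0]["audio_length_s"]
            # Padded cost if this row were added: every sample is padded
            # to the longest one in the batch.
            if (len(batch) + 1) * longest > max_padded_seconds:
                yield batch
                batch = []
        batch.append(row)
    if batch:
        yield batch


# Hypothetical rows standing in for a length-sorted test split.
demo_rows = [{"audio_length_s": s} for s in (30.0, 20.0, 10.0, 5.0, 5.0, 1.0)]
demo_batches = [
    [r["audio_length_s"] for r in b]
    for b in batch_by_padded_duration(demo_rows, max_padded_seconds=60.0)
]
print(demo_batches)
```

Because the input order is fixed, the same budget always yields the same batches, which is what makes profiling runs comparable across machines.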
Who It's For and Tradeoffs
Great fit if you run ASR research or model evaluations and need deterministic test sets that are safe to load in automated pipelines. It is especially useful for benchmarking long-form or multilingual ASR systems and for teams that must avoid executing remote code in secure environments.
Look elsewhere if you need audio and transcriptions for large-scale model training (this package focuses on prepared test splits and diagnostics rather than full training corpora), or if you need ungated access to every split: Common Voice, GigaSpeech, and SPGISpeech still require completing their upstream access steps and accepting their terms.
Where It Fits
This dataset is positioned between raw corpus archives and evaluation tooling: it reduces friction when moving from dataset discovery to reproducible leaderboard evaluation. For end-to-end training pipelines, pair it with the upstream training splits from the original repositories; for evaluations, use it directly and submit predictions to the ESB leaderboard.
