Objective benchmark leaderboard from the OpenCompass community, scoring LLMs and LVLMs across 100+ datasets in five capability dimensions.
Elo-style leaderboard where millions of crowd votes rank AI chatbots via blind head-to-head “battle mode” comparisons.
CompassRank is the public leaderboard of the OpenCompass evaluation suite. It offers a reproducible, fully open pipeline that tests large language models and multimodal models on more than 70 benchmarks (about 400k questions) covering knowledge, reasoning, coding, mathematics, and instruction following.
All configs, datasets and reports are Apache-2.0 licensed; contributors can add new models or benchmarks via pull request.