
HealthBench Professional

Benchmark dataset for evaluating clinician-facing chat assistants: physician-authored conversations plus rubric items, use-case and difficulty labels, specialty metadata, and a built-in canary to reduce benchmark contamination. Hosted on Hugging Face under an MIT license.

Introduction


Information

Categories