LogoAIAny
Icon for item

Claude Mythos Distilled 25K

25,000 chat-formatted synthetic SFT examples distilled to emulate the reasoning style and agentic behavior of Anthropic's Claude Mythos, focused on cybersecurity, advanced coding, mathematical reasoning, and long-horizon agent tasks. Includes metadata for targeted curriculum fine-tuning and is Apache-2.0 licensed.

Introduction

Frontier models are often gated or costly; synthetic distillation datasets let practitioners transfer characteristic reasoning patterns into open or smaller weights without direct access to the original system. This dataset bundles 25k chat-style SFT examples with rich metadata to help instruction-tuning workflows approximate Mythos-style capabilities in focused areas (security, coding, formal reasoning, and agentic planning).

What Sets It Apart
  • Distilled voice and structure: examples are authored to reflect an autonomous, multi-step "Mythos-style" decomposition (numbered reasoning, risk matrices, detection heuristics) so fine-tuned models learn structured chain-of-thought-like outputs without exposing proprietary model outputs. This is useful when you need consistency in reply structure for downstream evaluation.
  • Balanced, high-signal categories: explicit splits (cybersecurity ~7k, advanced coding ~5.5k, reasoning ~3k, agentic planning ~3.5k, scientific analysis ~2.5k, general expert QA ~3.5k) allow curriculum or targeted upsampling during SFT/TRL training.
  • Trainer-friendly format and metadata: chat messages plus category, id, source, timestamp enable selective sampling, loss-masking on assistant tokens, and integration with TRL/Axolotl/standard Hugging Face trainers.
  • License and reproducibility: Apache-2.0 license and included generator script support commercial use and reproducible extensions.
Who It's For and Tradeoffs

Great fit if you want to bootstrap instruction-tuned models (Llama-family, Qwen, Mistral, Gemma, etc.) toward structured, multi-step expert answers in security, coding, or agentic workflows and need a compact, metadata-rich synthetic curriculum. Look elsewhere if you require genuine proprietary model outputs, human-preference-aligned labels, or real-world exploit code datasets — this corpus is synthetic and framed defensive-only. Also plan for human preference tuning and careful safety review: the cyber content is defensively oriented but requires governance and evaluation to avoid unintended dual-use behavior.

Where It Fits

Use this dataset as a middle-ground between small open instruction datasets (fast to train but shallow) and inaccessible gated frontier traces (deep but unavailable). It accelerates capability transfer when paired with a smaller amounts of human preference data or real-world corpora for calibration.

Information

Categories