AI Train · 2023

LLaMA-Factory

Fine-tunes 100+ LLMs and VLMs from one config file or a no-code web UI, unifying LoRA, QLoRA, full tuning, DPO, PPO, KTO and ORPO behind a single interface. Bundles GaLore, Unsloth, FlashAttention-2 and 2- to 8-bit quantization so that runs fit on a single 24 GB GPU.

Introduction

Open-source fine-tuning has a fragmentation problem: every new method (LoRA, DPO, GaLore, Unsloth) ships as its own repo with its own scripts, and stitching them together for one model is where most projects stall. LLaMA-Factory's bet is that the recipe, not the code, should be what you edit — the same YAML config or web form drives pre-training, SFT, reward modeling, and every major preference-optimization algorithm across 100+ model families.
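
To make the "edit the recipe, not the code" idea concrete, here is a minimal sketch in Python. The key names (`stage`, `finetuning_type`, `dataset`, and so on) are modeled on the style of LLaMA-Factory's example configs but should be treated as assumptions rather than a verified schema; the model path and dataset names are placeholders, and `derive_recipe` and `to_yaml` are hypothetical helpers written for this illustration only.

```python
from copy import deepcopy

# Hypothetical base recipe for a LoRA SFT run. Key names are assumptions
# modeled on the project's example configs; values are placeholders.
base_recipe = {
    "model_name_or_path": "path/to/base-model",
    "stage": "sft",
    "do_train": True,
    "finetuning_type": "lora",
    "dataset": "my_sft_dataset",
    "template": "default",
    "output_dir": "outputs/sft-lora",
}


def derive_recipe(base, **overrides):
    """Return a copy of `base` with selected fields replaced."""
    recipe = deepcopy(base)
    recipe.update(overrides)
    return recipe


def to_yaml(recipe):
    """Render a flat dict as simple `key: value` lines (not a full YAML writer)."""
    lines = []
    for key, value in recipe.items():
        if isinstance(value, bool):
            value = "true" if value else "false"
        lines.append(f"{key}: {value}")
    return "\n".join(lines) + "\n"


# Switching from SFT to preference optimization touches fields, not code.
dpo_recipe = derive_recipe(
    base_recipe,
    stage="dpo",
    dataset="my_preference_dataset",
    output_dir="outputs/dpo-lora",
)

changed = sorted(k for k in base_recipe if base_recipe[k] != dpo_recipe[k])
print("changed fields:", changed)
print(to_yaml(dpo_recipe))
```

Only three fields differ between the two recipes; the model, adapter type and template carry over unchanged, which is the point of a single config surface.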

What Sets It Apart

One config surface spans the full pipeline (pre-train through PPO/DPO/KTO/ORPO/SimPO), so switching training paradigms is a field change, not a rewrite.
LlamaBoard, the built-in web UI, lets non-coders launch and monitor runs — rare among training frameworks that assume CLI fluency.
Aggressive efficiency stack (GaLore, BAdam, Unsloth, Liger Kernel, FlashAttention-2, 2- to 8-bit quantization) targets single-GPU reality: long-sequence tuning at roughly 50% of the memory a FlashAttention-2 setup needs on a 24 GB card.
Breadth is the moat — LLaMA, Qwen, Mistral, Gemma, DeepSeek, GLM, Phi and dozens more stay current as upstream models ship.
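
A rough back-of-the-envelope calculation shows why quantization matters on a 24 GB card. The sketch below counts weight storage only; activations, optimizer state, LoRA adapters, the KV cache and quantization metadata are all ignored, so real usage is higher. The 7-billion-parameter model size is an assumption picked for illustration, and `weight_memory_gib` is a hypothetical helper, not part of any library.

```python
def weight_memory_gib(num_params, bits_per_weight):
    """Memory needed to store the weights alone, in GiB."""
    total_bytes = num_params * bits_per_weight / 8
    return total_bytes / 2**30


# Assumed model size for illustration: 7 billion parameters.
PARAMS_7B = 7_000_000_000

for bits in (16, 8, 4, 2):
    gib = weight_memory_gib(PARAMS_7B, bits)
    print(f"{bits:>2}-bit weights: {gib:5.2f} GiB")
```

At 16 bits the weights alone take about 13 GiB, more than half the card before any training state is allocated; at 4 bits they drop to roughly 3.3 GiB, leaving room for adapters and activations.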

Who It's For

Great fit if you want to try several fine-tuning strategies on consumer or single-node hardware without gluing together five toolkits, or if you need teammates without ML-infra skills to run experiments via a UI. Look elsewhere if you need a custom training loop with fine-grained control over optimizer internals, or if you operate at massive multi-node scale, where a purpose-built distributed stack pays off more than a unified wrapper.

Information

Website: github.com
Organizations: Beihang University, Peking University
Authors: hiyouga
Published date: 2023/05/28

More Items

AI Train · 2025

PRIME-RL

Prime Intellect

An asynchronous, high-throughput framework for large-scale reinforcement learning and agentic training that scales to 1T+ MoE models and 1000+ GPUs, with native verifiers integration, end-to-end SFT/RL/evals, and Slurm/Kubernetes deployment; requires NVIDIA GPUs.

RL agent-skills mLOps ai-train pytorch +3

AI Agent · 2026

SkillOpt

Yang Yifan, Gong Ziyang +8 · Microsoft

Trains reusable natural-language 'skills' for frozen LLM agents by optimizing the skill document in text space, using trajectory-driven edits and validation-gated updates to produce deployable best_skill.md artifacts. Multi-backend, with zero added inference-time cost at deployment; designed for iterative, validation-led skill improvement.

agent-skills ai-agent ai-train llm python +6

AI Train · 2023

NVIDIA PhysicsNeMo

NVIDIA

Modular PyTorch-based framework for building, training, and deploying physics-informed ML models (neural operators, PINNs, GNNs, diffusion). Provides GPU-optimized training, domain-specific datapipes for meshes and point clouds, distributed scaling, and a model zoo.

nvidia physics pytorch ai-framework ai-train +6