LLMs-from-scratch

An open-source GitHub repository by Sebastian Raschka that contains the official code for the book "Build A Large Language Model (From Scratch)". It provides step-by-step PyTorch implementations to build, pretrain, and finetune a GPT-like LLM for educational purposes, along with exercises, bonus material, and companion video content.

Introduction

LLMs-from-scratch is the official code repository that accompanies the book "Build A Large Language Model (From Scratch)" by Sebastian Raschka. Its primary goal is educational: to teach how modern GPT-like large language models (LLMs) work by building small-but-functional implementations from the ground up using PyTorch. The repository walks readers through the entire pipeline — from basic data handling and tokenization to attention mechanisms, implementing a GPT model, pretraining on unlabeled data, and multiple finetuning strategies.
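
To give a flavor of what the chapter code builds toward, below is a minimal sketch of causal self-attention, the mechanism at the core of a GPT-style model. The class name and dimensions are illustrative, not taken from the repository:

    import torch
    import torch.nn as nn

    class CausalSelfAttention(nn.Module):
        # Minimal single-head causal self-attention (illustrative sketch)
        def __init__(self, d_in, d_out, context_length):
            super().__init__()
            self.W_q = nn.Linear(d_in, d_out, bias=False)
            self.W_k = nn.Linear(d_in, d_out, bias=False)
            self.W_v = nn.Linear(d_in, d_out, bias=False)
            # True above the diagonal marks future positions a token must not see
            mask = torch.triu(torch.ones(context_length, context_length), diagonal=1)
            self.register_buffer("mask", mask.bool())

        def forward(self, x):  # x: (batch, seq_len, d_in)
            t = x.shape[1]
            q, k, v = self.W_q(x), self.W_k(x), self.W_v(x)
            scores = q @ k.transpose(1, 2) / k.shape[-1] ** 0.5
            scores = scores.masked_fill(self.mask[:t, :t], float("-inf"))
            return torch.softmax(scores, dim=-1) @ v  # (batch, seq_len, d_out)

    attn = CausalSelfAttention(d_in=32, d_out=32, context_length=128)
    out = attn(torch.randn(2, 10, 32))  # out.shape == (2, 10, 32)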

Key contents and features
  • Chapter-aligned code: Each book chapter has a dedicated folder with a main notebook and supplementary examples (e.g., ch02 for data and tokenizers, ch03 for attention, ch04 for GPT implementation, ch05 for pretraining, ch06–ch07 for various finetuning workflows).
  • Educational focus: Code is written for clarity and learning; many notebooks include exercises and solutions (Appendix C) to reinforce concepts.
  • Practical recipes: Examples include building BPE tokenizers from scratch, multi-head attention implementations, KV caching (a sketch follows this list), FLOPs analysis, memory-efficient weight loading, and performance tips for training LLMs in PyTorch.
  • Pretraining and finetuning: The repo contains scripts and notebooks for pretraining small models and finetuning them for classification, instruction following, and preference-based alignment, including DPO examples and LoRA-based parameter-efficient finetuning (a LoRA sketch also follows this list).
  • Bonus and extension material: Numerous optional notebooks and folders provide additional experiments and conversions (e.g., Llama/Qwen/Gemma/Olmo-from-scratch examples, dataset utilities, and user-interface code for interacting with trained models).
  • Companion resources: A 17-hour+ video course is provided as a code-along companion, and the author published a follow-up book/project "Build A Reasoning Model (From Scratch)" that focuses on reasoning, distillation, and reinforcement learning methods for improving model reasoning.
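
To illustrate the KV-cache recipe mentioned above: during autoregressive decoding, the keys and values of already-processed tokens are cached so only the new token's projections need to be computed. The sketch below shows the idea; the function name and cache layout are assumptions, not the repository's API.

    import torch

    def decode_step_with_kv_cache(x_new, W_q, W_k, W_v, cache):
        # x_new: (batch, 1, d_in) embedding of the newly generated token
        # W_q / W_k / W_v: (d_in, d_out) projection matrices
        # cache: dict holding "k" and "v" tensors of shape (batch, t_prev, d_out)
        q = x_new @ W_q
        k_new, v_new = x_new @ W_k, x_new @ W_v
        # Append this step's key/value instead of recomputing the whole prefix
        k = torch.cat([cache["k"], k_new], dim=1) if "k" in cache else k_new
        v = torch.cat([cache["v"], v_new], dim=1) if "v" in cache else v_new
        cache["k"], cache["v"] = k, v
        # The single query attends to every cached position; no causal mask needed
        scores = q @ k.transpose(1, 2) / k.shape[-1] ** 0.5
        return torch.softmax(scores, dim=-1) @ v, cache  # (batch, 1, d_out)

With the cache, each generation step's attention cost grows linearly with the sequence length instead of re-running the full quadratic computation over the prefix.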
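
The LoRA idea referenced above freezes the pretrained weights and trains only a small low-rank correction. A minimal sketch, with an illustrative class name and hyperparameter defaults:

    import torch
    import torch.nn as nn

    class LoRALinear(nn.Module):
        # Wraps a frozen linear layer with a trainable low-rank update
        def __init__(self, linear, rank=8, alpha=16):
            super().__init__()
            self.linear = linear
            for p in self.linear.parameters():
                p.requires_grad_(False)  # freeze the pretrained weights
            d_out, d_in = linear.weight.shape
            self.A = nn.Parameter(torch.randn(d_in, rank) * 0.01)
            self.B = nn.Parameter(torch.zeros(rank, d_out))  # zero init: no change at start
            self.scaling = alpha / rank

        def forward(self, x):
            # Frozen base projection plus the scaled low-rank correction x @ A @ B
            return self.linear(x) + self.scaling * (x @ self.A @ self.B)

    layer = LoRALinear(nn.Linear(768, 768))
    print(sum(p.numel() for p in layer.parameters() if p.requires_grad))  # 12288

Only A and B receive gradients, so the trainable parameter count drops from d_in * d_out to rank * (d_in + d_out).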

Intended audience and use cases
  • Learners who want an in-depth, hands-on understanding of how LLMs are built and trained.
  • Researchers or engineers who want minimal, from-scratch PyTorch reference implementations of key LLM components.
  • Educators using the book and course for teaching concepts in model architecture, tokenization, pretraining, and finetuning.

Practical notes
  • Hardware: Examples are designed to run on conventional laptops for small models and use a GPU automatically when one is available (a device-selection sketch follows this list); some bonus material covers multi-GPU and DDP setups.
  • Get started: Clone the repository, follow the setup/README to configure the environment, then open the chapter notebooks and work through the step-by-step implementations.
  • Citation & provenance: The repository is explicitly tied to the Manning-published book (2024) and provides citation information in the README.

Why it matters

By offering clear, well-documented from-scratch implementations, LLMs-from-scratch lowers the barrier to understanding complex transformer-based models. Rather than relying on opaque libraries, readers can inspect and modify core components, which aids learning, experimentation, and teaching.

Information

  • Website: github.com
  • Authors: Sebastian Raschka
  • Published date: 2023/07/23