LogoAIAny
Icon for item

nanoGPT

nanoGPT is the simplest, fastest repository for training/finetuning medium-sized GPTs. It is a rewrite of minGPT that prioritizes practicality over education. Still under active development, but currently the file train.py reproduces GPT-2 (124M) on OpenWebText, running on a single 8XA100 40GB node in about 4 days of training.

Introduction

Oops! Something went wrong

[next-mdx-remote-client] error compiling MDX: Unexpected character `3` (U+0033) before name, expected a character that can start a name, such as a letter, `$`, or `_` More information: https://mdxjs.com/docs/troubleshooting-mdx

Information

  • Websitegithub.com
  • AuthorsAndrej Karpathy
  • Published date2022/12/29

Categories

More Items