Overview
Grok-1 is an open-weight model release from xai-org; this repository provides JAX example code to load and run the Grok-1 language model. Its primary goal is to let researchers and engineers validate the model and run inference using the provided checkpoint and example scripts.
Model specifications
- Parameters: 314B
- Architecture: Mixture of Experts (MoE) with 8 experts
- Expert utilization: 2 experts used per token (see the routing sketch after this list)
- Layers: 64
- Attention heads: 48 (queries), 8 (keys/values)
- Embedding size: 6,144
- Tokenizer: SentencePiece with a vocabulary of 131,072 tokens
- Positional embeddings: Rotary embeddings (RoPE)
- Supports: activation sharding and 8-bit quantization
- Maximum sequence length: 8,192 tokens
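To make the 2-of-8 routing concrete, here is a small, self-contained JAX sketch of top-2 expert selection. It is purely illustrative: the dimensions are toy values and the router and expert weights are hypothetical stand-ins, not the repository's actual MoE layer.

    import jax
    import jax.numpy as jnp

    NUM_EXPERTS = 8   # Grok-1 uses 8 experts
    TOP_K = 2         # 2 experts are used per token
    D_MODEL = 16      # toy embedding size (Grok-1's is 6,144)

    def toy_moe_layer(params, x):
        """Illustrative top-2-of-8 routing, not the repository's implementation."""
        logits = x @ params["router"]                    # [tokens, NUM_EXPERTS]
        weights, experts = jax.lax.top_k(logits, TOP_K)  # both [tokens, TOP_K]
        weights = jax.nn.softmax(weights, axis=-1)       # mixing weights for the 2 picks
        # Run every expert densely and keep only the selected outputs;
        # a real implementation dispatches tokens to experts sparsely.
        expert_outs = jnp.einsum("td,edf->tef", x, params["experts"])
        picked = jnp.take_along_axis(expert_outs, experts[:, :, None], axis=1)
        return jnp.sum(weights[:, :, None] * picked, axis=1)

    key = jax.random.PRNGKey(0)
    k_router, k_experts, k_tokens = jax.random.split(key, 3)
    params = {
        "router": jax.random.normal(k_router, (D_MODEL, NUM_EXPERTS)),
        "experts": jax.random.normal(k_experts, (NUM_EXPERTS, D_MODEL, D_MODEL)),
    }
    tokens = jax.random.normal(k_tokens, (4, D_MODEL))  # 4 toy token embeddings
    print(toy_moe_layer(params, tokens).shape)          # (4, 16)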
What the repository provides
- JAX example code (run.py) that loads the Grok-1 checkpoint and samples output from the model.
- Instructions and commands for downloading the checkpoint, either via a provided magnet link or directly from the Hugging Face model repository (xai-org/grok-1); a download sketch follows this list.
- Notes on the implementation: the MoE layer prioritizes correctness and ease of validation over runtime efficiency, so it is not efficient for production inference.
- License: Apache 2.0 (applies to source files and the distributed Grok-1 weights in this release).
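As one way to perform the Hugging Face download mentioned above, the huggingface_hub Python package can fetch the ckpt-0 files into checkpoints/. This is a minimal sketch under the assumption that run.py expects the layout checkpoints/ckpt-0; defer to the exact commands given in the README.

    # Minimal sketch: pull the Grok-1 checkpoint files with huggingface_hub.
    # Assumes `pip install huggingface_hub` and several hundred GB of free disk space.
    from huggingface_hub import snapshot_download

    snapshot_download(
        repo_id="xai-org/grok-1",      # Hugging Face model repository
        repo_type="model",
        allow_patterns=["ckpt-0/*"],   # only the checkpoint files
        local_dir="checkpoints",       # yields checkpoints/ckpt-0/...
    )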
How to run (example)
- Clone the repository and install requirements:
  git clone https://github.com/xai-org/grok-1.git && cd grok-1
  pip install -r requirements.txt
- Download the checkpoint and place it under checkpoints/ckpt-0, using either option provided in the repo:
  - the torrent magnet link in the README, or
  - the Hugging Face Hub download commands (requires huggingface_hub) to fetch ckpt-0/* into checkpoints/.
- Run the example script to load the checkpoint and sample from the model:
  python run.py
Note: because Grok-1 has 314B parameters, you need a machine with enough GPU memory (and an appropriate JAX setup) to run the model at reasonable performance; a quick device check is sketched below. The repository also documents that the MoE implementation is not optimized for memory or throughput.
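Before launching run.py on a machine of that size, it is worth confirming that JAX can actually see the intended accelerators. The following uses standard JAX introspection calls and is not part of the repository's scripts:

    # Sanity-check the JAX setup before loading a 314B-parameter checkpoint.
    import jax

    print(jax.devices())             # list of available devices
    print(jax.local_device_count())  # number of accelerators visible on this host
    print(jax.default_backend())     # "gpu" or "tpu" expected; "cpu" means no accelerator was found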
Use cases and audience
This repository is primarily intended for researchers, replicability-focused engineers, and teams who want to inspect and validate the Grok-1 weights, reproduce inference behavior, or use the model for experimentation. It is not an optimized production inference stack; further engineering is needed for efficient MoE inference at scale.
Additional notes
- The README includes both a magnet link for torrent-based download and explicit Hugging Face Hub commands to retrieve the model checkpoint.
- The project is published by the GitHub organization xai-org and released under the Apache 2.0 license for both the repository and the included weights, as described in the README.
