AIAny - Grok-1

Overview

Grok-1 is an open-weight release hosted by xai-org that provides JAX example code to load and run the Grok-1 language model. The repository's primary goal is to let researchers and engineers validate the model and run inference using the provided checkpoints and example scripts.

Model specifications

Parameters: 314B
Architecture: Mixture of Experts (MoE) with 8 experts
Expert utilization: 2 experts used per token
Layers: 64
Attention heads: 48 (queries), 8 (keys/values)
Embedding size: 6,144
Tokenizer: SentencePiece with 131,072 tokens
Positional embeddings: Rotary embeddings (RoPE)
Supports: activation sharding and 8-bit quantization
Maximum sequence length: 8,192 tokens
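
To make the attention figures above concrete, the following is a rough back-of-the-envelope sketch of what 8 key/value heads (versus 48) could mean for a key/value cache at the maximum sequence length. It assumes a per-head dimension of embedding size divided by query heads and 2 bytes per cached value; neither assumption is stated in the specification list, and the helper name kv_cache_bytes is hypothetical.

```python
# Values taken from the specification list above.
LAYERS = 64
QUERY_HEADS = 48
KV_HEADS = 8
EMBEDDING_SIZE = 6144
MAX_SEQ_LEN = 8192

# Assumption: each head spans EMBEDDING_SIZE / QUERY_HEADS dimensions.
HEAD_DIM = EMBEDDING_SIZE // QUERY_HEADS


def kv_cache_bytes(kv_heads, seq_len=MAX_SEQ_LEN, bytes_per_value=2):
    """Bytes needed to cache keys and values for one sequence across all layers.

    Assumption: 2 bytes per value (a 16-bit cache); hypothetical helper.
    """
    keys_and_values = 2  # one tensor for keys, one for values
    return keys_and_values * LAYERS * kv_heads * HEAD_DIM * seq_len * bytes_per_value


GIB = 1024 ** 3
print(QUERY_HEADS // KV_HEADS)            # query heads sharing each key/value head
print(kv_cache_bytes(KV_HEADS) / GIB)     # cache size with 8 key/value heads
print(kv_cache_bytes(QUERY_HEADS) / GIB)  # hypothetical cache with 48 key/value heads
```

Under these assumptions, each key/value head serves 6 query heads, and the per-sequence cache is 6 times smaller than it would be with one key/value head per query head.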

What the repository provides

JAX example code (run.py) that loads the Grok-1 checkpoint and samples from the model.
Instructions and commands for downloading the checkpoint either via a provided magnet link or directly from the Hugging Face model repository (xai-org/grok-1).
Notes on the implementation: the MoE layer implementation prioritizes correctness and ease of validation over runtime efficiency, so it may not be optimal for production inference.
License: Apache 2.0 (applies to source files and the distributed Grok-1 weights in this release).
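
The trade-off between correctness and efficiency in an MoE layer can be illustrated with a small toy sketch. This is not the repository's implementation: the experts are hypothetical scalar functions, and the gating rule (softmax over router scores, keep the top 2 of 8, renormalize) is an assumed, generic scheme. A "dense" version evaluates every expert and zeroes out the unselected ones, which is easy to check but wasteful; a "sparse" version evaluates only the selected experts.

```python
import math

NUM_EXPERTS = 8  # from the specification list
TOP_K = 2        # experts used per token


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_gates(router_logits, k=TOP_K):
    """Return {expert_index: weight} for the k highest-scoring experts.

    Assumption: weights of the chosen experts are renormalized to sum to 1.
    """
    probs = softmax(router_logits)
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in chosen)
    return {i: probs[i] / norm for i in chosen}


def toy_expert(index, x):
    """Hypothetical stand-in for a feed-forward expert."""
    return (index + 1) * x


def moe_dense(router_logits, x):
    """Evaluate all experts, then weight them (unselected experts get 0)."""
    gates = top_k_gates(router_logits)
    return sum(gates.get(i, 0.0) * toy_expert(i, x) for i in range(NUM_EXPERTS))


def moe_sparse(router_logits, x):
    """Evaluate only the selected experts."""
    gates = top_k_gates(router_logits)
    return sum(w * toy_expert(i, x) for i, w in gates.items())


logits = [0.1, 2.0, -1.0, 0.5, 1.5, 0.0, -0.5, 0.3]
print(sorted(top_k_gates(logits)))  # indices of the two selected experts
print(math.isclose(moe_dense(logits, 3.0), moe_sparse(logits, 3.0)))
```

Both versions produce the same output; the dense one simply does four times as much expert computation when 2 of 8 experts are selected.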

How to run (example)

Clone the repository and install requirements:

git clone https://github.com/xai-org/grok-1.git && cd grok-1
pip install -r requirements.txt

Download checkpoints and place them under checkpoints/ckpt-0 (options provided in the repo):

Use the torrent magnet link in the README, or
Use the Hugging Face Hub download commands (requires huggingface_hub) to fetch ckpt-0/* into checkpoints.
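
For the Hugging Face option, the download can also be scripted from Python. The sketch below is one possible approach, not the command given in the README: it uses snapshot_download from huggingface_hub, restricted to ckpt-0/*, and the helper names checkpoint_download_args and fetch_checkpoint are hypothetical.

```python
def checkpoint_download_args(local_dir="checkpoints"):
    """Build arguments for huggingface_hub.snapshot_download, limited to ckpt-0/*."""
    return {
        "repo_id": "xai-org/grok-1",
        "repo_type": "model",
        "allow_patterns": ["ckpt-0/*"],  # skip everything outside the checkpoint
        "local_dir": local_dir,
    }


def fetch_checkpoint(local_dir="checkpoints"):
    """Download ckpt-0/* into local_dir and return the local path."""
    # Imported lazily so this module loads even without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(**checkpoint_download_args(local_dir))
```

Calling fetch_checkpoint() from the repository root would place the files under checkpoints/ckpt-0. The download is very large, so it is worth confirming free disk space first.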

Run the example script to load the checkpoint and sample from the model:

python run.py
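
Before launching the script, it can help to confirm that the checkpoint is where the steps above expect it. The following is a minimal preflight sketch; the function name checkpoint_ready is hypothetical, and it only checks that checkpoints/ckpt-0 exists and is not empty, not that the files are complete or valid.

```python
from pathlib import Path


def checkpoint_ready(root="checkpoints", name="ckpt-0"):
    """Return True if root/name is a directory containing at least one entry."""
    ckpt_dir = Path(root) / name
    return ckpt_dir.is_dir() and any(ckpt_dir.iterdir())


if not checkpoint_ready():
    print("checkpoints/ckpt-0 is missing or empty; download the checkpoint first.")
```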

Note: because Grok-1 has 314B parameters, running it requires a machine with enough GPU memory and a working JAX setup. The repository notes that its MoE implementation is not optimized for efficiency.
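
As a rough guide to what "sufficient GPU memory" means, the weights alone can be sized from the parameter count. The sketch below covers weights only and ignores activations, caches, and framework overhead; the 80 GB device size is a hypothetical example, and the helper names are not from the repository.

```python
PARAMS = 314_000_000_000  # 314B parameters, from the specification list


def weight_bytes(params, bits_per_param):
    """Bytes needed to store the weights at the given precision."""
    return params * bits_per_param // 8


def min_devices(total_bytes, device_bytes):
    """Smallest number of devices whose combined memory holds total_bytes."""
    return -(-total_bytes // device_bytes)  # ceiling division


GB = 10 ** 9
DEVICE_BYTES = 80 * GB  # hypothetical accelerator with 80 GB of memory

for bits in (16, 8):
    total = weight_bytes(PARAMS, bits)
    print(bits, total // GB, min_devices(total, DEVICE_BYTES))
```

At 16 bits per parameter the weights occupy about 628 GB; at 8 bits, about 314 GB. Real requirements will be higher once activations and runtime overhead are included.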

Use cases and audience

This repository is intended primarily for researchers, engineers focused on reproducibility, and teams who want to inspect and validate the Grok-1 weights, reproduce inference behavior, or experiment with the model. It is not an optimized production inference stack; efficient MoE inference at scale requires further engineering.

Additional notes

The README includes both a magnet link for torrent-based download and explicit Hugging Face Hub commands to retrieve the model checkpoint.
The project is published by the GitHub organization xai-org. As described in the README, the Apache 2.0 license covers both the repository and the included weights.
