stable-diffusion.cpp: Pure C/C++ Diffusion Model Inference
stable-diffusion.cpp is an open-source project that provides a lightweight, efficient implementation for running diffusion-based generative models entirely in C/C++, leveraging the ggml library for tensor operations. Inspired by the success of llama.cpp for language models, this project aims to bring similar portability and performance to image and video generation tasks using diffusion models. It supports a wide array of models, including Stable Diffusion variants (SD1.x, SD2.x, SDXL, SD3/SD3.5), advanced architectures like Flux.1-dev/schnell, Flux.2-dev, Chroma, Qwen Image, Z-Image, Ovis-Image, and video models such as Wan2.1/Wan2.2. Additionally, it handles image editing models like Flux.1-Kontext-dev and Qwen Image Edit.
Key Features and Capabilities
The project stands out for its minimalist design: aside from the bundled ggml tensor library, it requires nothing beyond a standard C/C++ toolchain, making it well suited to embedded systems, mobile devices, and other resource-constrained environments. Here's a breakdown of its core features:
- Model Support:
  - Image generation: Covers foundational Stable Diffusion models (SD1.x, SD2.x) up to modern ones such as SDXL and SDXL-Turbo, distilled variants, SD3/SD3.5, and specialized models such as Flux for high-fidelity outputs, Chroma for artistic rendering, and Qwen/Z-Image/Ovis for multilingual or efficient generation.
  - Image editing: Integrates Flux.1-Kontext-dev for contextual edits and Qwen Image Edit for precise modifications.
  - Video generation: Supports Wan2.1 and Wan2.2 for dynamic content creation.
  - Additional integrations: PhotoMaker for personalized generation, ControlNet (SD 1.5 base) for guided synthesis, and LoRA support for applying lightweight fine-tuned adapters without merging them into the base weights (examples follow this list).
- Advanced Techniques:
  - Latent Consistency Models (LCM/LCM-LoRA) for faster sampling.
  - TAESD for accelerated latent decoding, reducing memory use and time.
  - ESRGAN upscaling for enhancing generated images.
  - VAE tiling to manage high-resolution outputs with limited VRAM.
  - Flash Attention for reduced memory usage during inference.
  - Negative prompts and stable-diffusion-webui-style token weighting for refined control (several of these options are combined in the example command after this list).
- Backend and Platform Compatibility:
  - Hardware acceleration: CPU (with AVX/AVX2/AVX512), CUDA (NVIDIA GPUs), Vulkan (cross-platform), Metal (Apple silicon), OpenCL, and SYCL, selected at build time (see the build sketch under Getting Started).
  - Platforms: Full support for Linux, macOS, and Windows; Android via Termux or Local Diffusion.
  - Weight formats: PyTorch checkpoints (.ckpt/.pth), Safetensors, and GGUF for quantized models.
- Sampling Methods: Offers a variety of samplers, including Euler A, Euler, Heun, DPM2, DPM++ 2M/2M v2/2S a, and LCM, giving flexibility across quality-speed trade-offs; the sampler and step count are chosen per invocation, as in the example after this list.
- Reproducibility and Usability: Cross-platform RNG consistency (CUDA mode matches stable-diffusion-webui; CPU mode aligns with ComfyUI), and outputs embed generation parameters in PNG metadata for compatibility with tools like stable-diffusion-webui.
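To make these options concrete, here is a rough sketch of a single invocation that combines a sampler choice, a negative prompt, webui-style token weighting, a LoRA tag, VAE tiling, TAESD decoding, and ESRGAN upscaling. The model, LoRA, TAESD, and upscaler paths are placeholders, and flag names can shift between releases, so treat this as illustrative rather than canonical:

```sh
# Illustrative only: combines several of the features listed above in one run.
# Paths are placeholders; check `sd --help` for the flags in your build.
./bin/sd \
  -m models/v1-5-pruned-emaonly.safetensors \
  -p "a lovely cat, (detailed fur:1.2) <lora:film_grain:0.7>" \
  -n "blurry, low quality, extra limbs" \
  --sampling-method dpm++2m --steps 24 --cfg-scale 7.0 --seed 42 \
  -W 768 -H 768 --vae-tiling \
  --taesd models/taesd_decoder.safetensors \
  --lora-model-dir models/lora \
  --upscale-model models/RealESRGAN_x4plus_anime_6B.pth \
  -o output.png
```

Because the generation parameters are written into the PNG metadata, the resulting image can be inspected or reproduced later with tools such as stable-diffusion-webui, per the reproducibility note above.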
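Guided synthesis with ControlNet (and, similarly, PhotoMaker personalization) is driven by extra inputs on the same CLI. The following is a minimal sketch assuming an SD 1.5 base checkpoint, a ControlNet weight file, and an already-preprocessed control image; verify the exact flag names against the version you are running:

```sh
# Sketch of ControlNet-guided generation on an SD 1.5 base model.
# The control image is assumed to be preprocessed (e.g. an OpenPose map);
# model and image paths are placeholders.
./bin/sd \
  -m models/v1-5-pruned-emaonly.safetensors \
  --control-net models/control_openpose-fp16.safetensors \
  --control-image pose.png \
  -p "a dancer on a stage, dramatic lighting" \
  --steps 20 -o controlled.png
```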
Getting Started
To use stable-diffusion.cpp, download pre-built binaries from the releases page or build from source using the provided guide. Models can be sourced from Hugging Face (e.g., Stable Diffusion v1.5 as a .safetensors file). A simple command like ./bin/sd -m path/to/model.safetensors -p "a lovely cat" generates an image. For performance tuning, refer to the dedicated guide on VRAM/RAM optimization.
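When building from source, hardware backends are chosen at configure time through CMake options. The steps below are a sketch based on the project's build instructions; the backend option names (for example SD_VULKAN, SD_METAL, or the NVIDIA option, which has been spelled SD_CUBLAS or SD_CUDA depending on the release) should be checked against the README of the version you clone:

```sh
# Clone with submodules (ggml is vendored) and build with a GPU backend enabled.
git clone --recursive https://github.com/leejet/stable-diffusion.cpp
cd stable-diffusion.cpp
mkdir build && cd build
# Pick the backend option matching your hardware, e.g. Vulkan here.
cmake .. -DSD_VULKAN=ON
cmake --build . --config Release
# The CLI is then available at build/bin/sd
```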
Detailed documentation covers model-specific setups (e.g., SDXL, Flux, Qwen), LoRA integration, LCM usage, PhotoMaker personalization, and Docker deployment. Quantization via GGUF enables further efficiency gains.
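As a sketch of that quantization workflow, the CLI exposes a conversion mode that rewrites weights into a quantized GGUF file, which can then be loaded like any other model. The mode and type flags below follow the project's documentation, but the exact spelling and the set of available quantization types (e.g. q8_0, q5_0, q4_0) may differ across versions:

```sh
# Convert a full-precision checkpoint to 8-bit GGUF, then generate with it.
# Paths are placeholders.
./bin/sd -M convert -m models/v1-5-pruned-emaonly.safetensors \
         --type q8_0 -o models/v1-5-pruned-emaonly-q8_0.gguf
./bin/sd -m models/v1-5-pruned-emaonly-q8_0.gguf -p "a lovely cat" -o cat.png
```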
Ecosystem and Community
The project has inspired bindings in languages such as Golang, C#, Python, Rust, and Flutter, facilitating integration into diverse applications. UIs such as Jellybox, Stable Diffusion GUI, Local Diffusion, and LocalAI leverage it as a backend for user-friendly interfaces. With over 4,700 stars on GitHub and active contributions, stable-diffusion.cpp continues to evolve, with recent additions like Z-Image (Dec 2025) and FLUX.2-dev (Nov 2025) highlighting its rapid development.
The project builds on foundational work such as ggml, diffusers, and the original stable-diffusion codebase, keeping it aligned with conventions in the broader AI ecosystem. This makes stable-diffusion.cpp a go-to choice for developers seeking high-performance, portable diffusion inference without the overhead of Python-based frameworks.
