Most stochastic diffusion samplers inject uniform white noise throughout the generative trajectory, spending the finite noise budget on frequencies the model has already resolved. Colored Noise Sampling (CNS) reframes sampling as targeted, frequency-decoupled energy transfer: it injects noise according to a dynamic, timestep- and frequency-dependent schedule that aligns with the model's inherent spectral bias, steering samples toward the data manifold faster without any retraining.
Key Findings
- Frequency-aware, training-free sampler: CNS replaces uniform white noise with a colored-noise schedule that preferentially allocates energy to unresolved frequency bands, requiring no changes to model weights.
- Consistent fidelity gains: Across architectures (SiT, JiT, FLUX) CNS yields substantial unguided FID reductions on ImageNet-256 (e.g., SiT-XL/2: 8.26 → 6.27; JiT-B/16: 32.39 → 26.69; JiT-H/16: 11.88 → 8.31) and preserves relative improvements under Classifier-Free Guidance.
- Plug-and-play inference substitute: Designed as an inference-time stochastic solver, CNS can be dropped into existing sampling pipelines to improve sample quality without retraining or additional dataset access.
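To make the idea of a colored-noise schedule concrete, here is a minimal NumPy sketch of frequency-shaped noise. It is an illustration under stated assumptions, not the CNS schedule itself: the logistic `band_gain`, its `sharpness` parameter, the convention that `t` runs from 1 (pure noise) to 0 (data), and the unit-variance normalization are all hypothetical choices made for this example.

```python
import numpy as np

def radial_frequency(h, w):
    """Radial frequency of each FFT bin, normalized to [0, 1]."""
    fy = np.fft.fftfreq(h)[:, None]  # cycles per pixel along rows
    fx = np.fft.fftfreq(w)[None, :]  # cycles per pixel along columns
    r = np.sqrt(fx**2 + fy**2)
    return r / r.max()

def band_gain(r, t, sharpness=10.0):
    """Hypothetical schedule: a logistic high-pass whose cutoff rises as t falls.

    Assumes t=1 is pure noise and t=0 is data, so later steps push
    energy toward the higher (still unresolved) frequencies.
    """
    cutoff = 1.0 - t
    return 1.0 / (1.0 + np.exp(-sharpness * (r - cutoff)))

def colored_noise(shape, t, rng):
    """White Gaussian noise reshaped in the Fourier domain by band_gain."""
    h, w = shape
    white = rng.standard_normal((h, w))
    gain = band_gain(radial_frequency(h, w), t)
    # Rescale so the mean squared gain is 1: the expected per-pixel
    # variance then matches white noise (assumed normalization).
    gain = gain / np.sqrt(np.mean(gain**2))
    # The gain is real and symmetric in frequency, so the inverse FFT is
    # real up to rounding error; .real drops the residual imaginary part.
    return np.fft.ifft2(np.fft.fft2(white) * gain).real

rng = np.random.default_rng(0)
eps = colored_noise((64, 64), t=0.3, rng=rng)  # stands in for a white-noise draw
```

In a stochastic sampler, a draw like `eps` would take the place of the usual `standard_normal` sample at each step. Keeping the total variance fixed means only the spectral distribution of the noise changes, not the overall noise level.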
Who It's For and Trade-offs
Great fit if you want to improve sample fidelity at inference time across diffusion-based image generators without retraining: researchers tuning samplers, production teams aiming for better quality per second, and benchmark authors comparing samplers. Look elsewhere if you need strictly deterministic samplers with identical runtime profiles, or if your deployment forbids any extra per-step spectral processing: CNS may add modest implementation complexity and extra per-step computation to build and apply its frequency-dependent noise schedule. It also assumes the model exhibits the common spectral bias (low-frequency structure resolved before high-frequency detail), so gains depend on that property being present in the target architecture.
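One rough way to check the spectral-bias assumption on a given model is to look at how power is distributed over radial frequency in its intermediate predictions. The sketch below computes a radially averaged power spectrum with NumPy; the function name `radial_power_spectrum` and the `n_bins` parameter are hypothetical, and the synthetic test image only stands in for a real model output.

```python
import numpy as np

def radial_power_spectrum(image, n_bins=16):
    """Mean spectral power in n_bins rings of increasing radial frequency."""
    h, w = image.shape
    power = np.abs(np.fft.fft2(image)) ** 2
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    r = np.sqrt(fx**2 + fy**2)
    # Assign every FFT bin to a ring; the outermost edge is clipped
    # into the last ring.
    edges = np.linspace(0.0, r.max(), n_bins + 1)
    idx = np.clip(np.digitize(r.ravel(), edges) - 1, 0, n_bins - 1)
    sums = np.bincount(idx, weights=power.ravel(), minlength=n_bins)
    counts = np.bincount(idx, minlength=n_bins)
    return sums / np.maximum(counts, 1)  # guard against empty rings

# Synthetic stand-in: a smooth pattern with two cycles across the width.
cols = np.arange(64)
smooth = np.tile(np.sin(2 * np.pi * 2 * cols / 64), (64, 1))
spec = radial_power_spectrum(smooth)  # power concentrates in the lowest ring
```

Applied to the difference between an intermediate prediction and the final sample at several timesteps, a spectrum like this would show whether low-frequency error shrinks before high-frequency error, which is the behavior CNS relies on.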
Where It Fits
CNS sits between generic SDE/ODE solvers (which use uniform white noise) and model-retraining approaches that alter training objectives to shape spectral behavior. Its niche is inference-time quality improvement with minimal workflow disruption — unlike retraining, it delivers immediate benefits; unlike purely deterministic samplers, it leverages stochastic, frequency-targeted energy injection to recover fine detail more efficiently.
