Overview
Ultimate Vocal Remover GUI (UVR) is an open-source desktop application that provides a graphical interface for neural-network-based audio source separation, used primarily to remove or isolate vocals in stereo audio tracks. The project bundles pretrained separation models with the interface and its runtime environment, so end users can run advanced source separation without a complex technical setup.
Key Features
- Bundled interface and runtime: UVR distribution packages include the application GUI, Python and required dependencies so users can run the tool out-of-the-box.
- Multiple separation models: Integrates MDX-Net, Demucs (several versions), MDX23C (weights trained by contributors) and other models. Most bundled weights were trained by the UVR core developers; the exceptions include some Demucs v3/v4 4-stem models.
- Cross-platform support: Provides installers/bundles for Windows, macOS (including Apple Silicon with MPS support) and instructions for Linux installations.
- GPU acceleration: Optimized for NVIDIA CUDA GPUs (minimum GTX 1060 6GB; 8GB or more of VRAM recommended), with DirectML builds for some AMD and Intel Arc GPUs and expanded MPS support for Apple Silicon on macOS.
- Audio tooling: Uses FFmpeg for non-WAV formats and Rubber Band for time-stretch / pitch-shift operations.
- Usability: Remembers settings across runs, offers both installer and manual installation paths, and exposes error logs for debugging.
Installation & Platforms
- Windows: Official installers are provided for Windows 10 and later. A DirectML variant is available for certain AMD and Intel GPUs. Per the official guidance, UVR must be installed on the system drive (C:).
- macOS: DMG bundles are available for both arm64 (M1/M2) and Intel Macs, and MPS acceleration has been expanded to cover several models. The macOS-specific notes describe workarounds for cases where system security gates block the application.
- Linux: Manual installation instructions are supplied (Debian/Arch examples), and a virtual environment approach is recommended to avoid system Python conflicts.
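The virtual environment recommended for Linux can be sketched with Python's standard-library `venv` module, which does the same job as running `python -m venv <dir>`. This is a minimal illustration, not the project's install procedure: the helper names `create_env` and `is_virtual_env` and the directory name `uvr-env` are hypothetical, and installing UVR's dependencies into the environment is left to the official instructions.

```python
import tempfile
import venv
from pathlib import Path


def create_env(env_dir: Path, with_pip: bool = True) -> Path:
    """Create an isolated virtual environment and return its directory."""
    # venv.create is the library equivalent of `python -m venv <env_dir>`.
    venv.create(env_dir, with_pip=with_pip)
    return env_dir


def is_virtual_env(env_dir: Path) -> bool:
    """A venv is marked by a pyvenv.cfg file at its root."""
    return (env_dir / "pyvenv.cfg").is_file()


if __name__ == "__main__":
    # Hypothetical location; any writable directory works.
    target = Path(tempfile.mkdtemp()) / "uvr-env"
    # with_pip=False keeps the demo fast; a real setup would want pip.
    create_env(target, with_pip=False)
    print(target, is_virtual_env(target))
```

Packages installed inside such an environment stay out of the system Python, which is the conflict the recommendation is meant to avoid.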
Models, Performance & Requirements
- Models are computationally intensive: conversion speed and feasibility depend heavily on hardware (GPU is strongly recommended). Model loading and conversion times improve with better GPUs and sufficient VRAM.
- GPU requirements: NVIDIA GTX 1060 6GB at minimum; NVIDIA GPUs with 8GB or more of VRAM are recommended.
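The hardware guidance above can be expressed as a small check. The sketch below assumes a PyTorch backend, which this overview does not state, and the function names `pick_device`, `vram_tier` and `probe_device` are hypothetical rather than part of UVR. It prefers CUDA, then MPS, then CPU, and classifies a card's nominal VRAM against the 6GB minimum and 8GB recommendation. DirectML is left out because it needs a separate package.

```python
def pick_device(cuda_available: bool, mps_available: bool) -> str:
    """Prefer NVIDIA CUDA, then Apple MPS, then fall back to the CPU."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def vram_tier(nominal_gb: float) -> str:
    """Classify a GPU by its nominal (marketed) VRAM size in GB.

    Drivers often report slightly less than the nominal size, so pass
    the marketed figure (6 for a GTX 1060 6GB), not the measured total.
    """
    if nominal_gb >= 8:
        return "recommended"
    if nominal_gb >= 6:
        return "minimum"
    return "below minimum"


def probe_device() -> str:
    """Ask PyTorch which backend is usable; assume CPU if it is absent."""
    try:
        import torch  # assumption: the runtime ships PyTorch
    except ImportError:
        return "cpu"
    # torch.backends.mps exists only in newer PyTorch releases.
    mps = getattr(torch.backends, "mps", None)
    mps_ok = bool(mps is not None and mps.is_available())
    return pick_device(torch.cuda.is_available(), mps_ok)


if __name__ == "__main__":
    print("device:", probe_device())
    print("GTX 1060 6GB:", vram_tier(6))
```

Separating the pure decision (`pick_device`) from the probe keeps the preference order testable on machines without a GPU or without PyTorch installed.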
Limitations & Known Issues
- Some AMD GPU support is experimental and maintained in specific branches.
- Non-WAV processing requires a working FFmpeg installation.
- Time-stretch/pitch features require Rubber Band binaries.
- macOS-specific GUI quirks (historically involving Sonoma and Tkinter) have been tracked and patched in releases.
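The two external-tool requirements in this list lend themselves to a quick pre-flight check. The sketch below uses the standard-library `shutil.which` to look for the tools on `PATH`. The executable names `ffmpeg` and `rubberband` are assumptions about how the binaries are named on the system, and the helper names are hypothetical; UVR's bundled builds may locate these tools differently.

```python
import shutil
from pathlib import Path
from typing import Dict, Iterable, List, Optional

# Assumed executable names, mapped to the feature that depends on them.
TOOLS = {
    "ffmpeg": "non-WAV input and output",
    "rubberband": "time-stretch and pitch-shift",
}


def locate_tools(names: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each executable name to its full path, or None if not on PATH."""
    return {name: shutil.which(name) for name in names}


def missing_tools(found: Dict[str, Optional[str]]) -> List[str]:
    """Return the names that could not be located, sorted for stable output."""
    return sorted(name for name, path in found.items() if path is None)


def needs_ffmpeg(audio_path: str) -> bool:
    """Only WAV files are handled without FFmpeg, per the limitation above."""
    return Path(audio_path).suffix.lower() != ".wav"


if __name__ == "__main__":
    for name in missing_tools(locate_tools(TOOLS)):
        print(f"{name} not found: {TOOLS[name]} will be unavailable")
```

Running the script prints one line per missing tool and nothing when both are present.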
License & Credits
- License: MIT.
- Core developers: Anjok07 and aufr33 (project maintainers). Many community contributors and authors of underlying models are credited (ZFTurbo, Kuielab, Demucs authors, etc.).
Typical Use Cases
- Creating instrumental or karaoke tracks by removing lead vocals.
- Producing stems for remixing and sampling.
- Research and experimentation in audio source separation.
Where to find it
- Official repository (source, releases, installers): the project's GitHub page.
(Adapted from the project's README and release notes.)
