Overview
DINOv3 is a reference implementation and model release from Meta AI Research (FAIR) for a family of self-supervised vision foundation models that produce high-quality dense, patch-level features. The project centers on versatile vision backbones (ViT and ConvNeXt variants) pretrained on large datasets and adaptable to a wide range of downstream tasks with little or no fine-tuning.
Key features
- Backbones: multiple ViT sizes (including distilled variants and a very large ViT-7B) and ConvNeXt variants, with pretrained weights from different pretraining corpora (e.g., LVD-1689M for web images, SAT-493M for satellite imagery).
- Dense features: the models produce high-resolution, patch-wise embeddings suited to dense tasks such as segmentation, dense matching, and tracking (a feature-extraction sketch follows this list).
- Pretrained heads: released heads and examples for image classification, detection (COCO), segmentation (ADE20K), and depth estimation (SYNTHMIX/NYUv2), plus zero-shot setups (dino.txt).
- Integration: explicit support and usage examples for PyTorch Hub, Hugging Face Transformers/Hub, and third-party libraries (timm). The README documents pipelines for feature extraction via Transformers and model loading via torch.hub (see the Transformers sketch after this list).
- Notebooks and demos: several example notebooks (PCA visualization, foreground segmentation, dense/sparse matching, segmentation tracking, dino.txt zero-shot segmentation) with Colab links to help users get started.
- Training & evaluation: full training and evaluation scripts, multi-stage recipes for large-scale models (including pretraining, gram anchoring, high-resolution adaptation for ViT-7B), and instructions for reproducing paper results.
- Licensing & access: code and model weights released under the repository license (DINOv3 License). Some model weights require requesting access and downloading via provided URLs; the README advises using command-line tools like wget for the downloads.
Typical use cases
- Extracting high-quality patch features for dense vision tasks (segmentation, matching, tracking).
- Using pretrained backbones as drop-in feature extractors for downstream classifiers, detectors, or segmentation heads (a linear-probe sketch follows this list).
- Research and development that requires large self-supervised vision models and reproduction of published experiments.
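As one concrete pattern for the drop-in feature extractor use case, here is a minimal linear-probe sketch. It assumes the backbone's forward pass returns a pooled global embedding; `embed_dim` (768 here, typical of ViT-B) must be adjusted to the chosen variant.

```python
import torch
import torch.nn as nn

class LinearProbe(nn.Module):
    """Frozen DINOv3 backbone with a trainable linear classification head."""

    def __init__(self, backbone, embed_dim=768, num_classes=1000):
        super().__init__()
        self.backbone = backbone
        for p in self.backbone.parameters():
            p.requires_grad = False  # keep the pretrained features fixed
        self.head = nn.Linear(embed_dim, num_classes)

    def forward(self, x):
        with torch.no_grad():
            feats = self.backbone(x)  # assumed to return a pooled global embedding
        return self.head(feats)
```

Only `head.parameters()` need to go to the optimizer, so training reduces to fitting a linear classifier on frozen features.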
Practical notes
- The repo expects modern PyTorch (the README indicates PyTorch >= 2.7.1) and is tested on Linux; CUDA-enabled installations are recommended for performance.
- Hugging Face and timm support are noted in the repository, enabling convenient model loading and inference pipelines.
- Pretrained weights are organized by backbone and pretraining dataset; some large checkpoints (e.g., ViT-7B) and the classifier/detector/segmentor heads are provided as separate downloads (a loading sketch follows this list).
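To make the download flow concrete, here is a hedged sketch of loading a gated checkpoint from a local clone of the repository. All paths and names are placeholders, and the `weights=` keyword mirrors the README's described usage; verify the actual signature in the repo's hubconf.py.

```python
import torch

# After requesting access, download the checkpoint first, e.g.:
#   wget -O checkpoints/dinov3_vitb16.pth "<signed URL from the access form>"
# Then load from a local clone so no network fetch of the code is needed.
model = torch.hub.load(
    "path/to/local/dinov3",            # local clone of facebookresearch/dinov3
    "dinov3_vitb16",                   # placeholder entrypoint name
    source="local",
    weights="checkpoints/dinov3_vitb16.pth",  # assumed kwarg; verify in hubconf.py
)
model.eval()
```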
References
- Associated paper: arXiv:2508.10104 (DINOv3).
- Official project page / blog: Meta AI DINOv3 resources.
