Provides differentiable, GPU-accelerated computer-vision operators and geometric building blocks on top of PyTorch. Includes 500+ ops, augmentation pipelines, and pre-trained models for detection, matching, and segmentation; suitable for research and production vision pipelines.
Provides modular PyTorch pipelines and tools for training and running diffusion models across image, video, and audio. Ships ready-made pipelines (Stable Diffusion, img2img, inpainting, video), hardware optimizations, safety checks, and community examples; well suited to researchers and product teams.
Generates anime-style and other non-photorealistic illustrations from text prompts. The model is a 2B-parameter diffusion base preview, trained on millions of anime images (plus ~800k non-anime artworks) and released under a non-commercial license; best used in ComfyUI at ~1 MP resolution.
Performs feed-forward streaming 3D reconstruction from image sequences, combining coordinate grounding, dense geometric cues, and trajectory memory to correct long-range drift. Uses paged KV-cache attention for ~20 FPS inference at 518×378 and supports sequences of more than 10,000 frames.