Programmatically author, schedule, and monitor data workflows using Python-defined DAGs. Features modular executors, rich provider/operator ecosystem (Kubernetes, AWS, GCP), and built-in scheduling/monitoring for batch and ML pipelines.
Orchestrates and scales Python-based AI/ML workloads from laptop to thousands of GPUs — exposing task and actor primitives plus high-level libraries for training, hyperparameter tuning, serving, RL, and data processing. Designed for heterogeneous accelerators and production ML pipelines.
Stores and indexes high-dimensional embeddings for scalable ANN vector search. Distributed, Kubernetes-native architecture with multiple index types, GPU acceleration, and hybrid dense+sparse search — suitable for semantic search, RAG, and recommendation pipelines.