MLOps · 2014

Apache Airflow

Programmatically author, schedule, and monitor data workflows as Python-defined DAGs; the scheduler handles dependencies, retries, and backfills. Executors are pluggable (Local, Celery, Kubernetes), and a broad provider ecosystem covers AWS, GCP, and databases.

Introduction

Before Airflow, most data teams stitched cron jobs together with brittle shell scripts and found out about failures only when a downstream report came up empty at 3 a.m. Airflow's real move was to treat a pipeline as code: a workflow is a Python object whose dependency graph, retry policy, and run history are all inspectable, versionable, and testable like any other software.

What Sets It Apart

Pipelines are ordinary Python, so you generate tasks in loops, unit-test them, and diff them in code review instead of clicking through a GUI.
The scheduler understands the DAG, not just a clock: it resolves upstream/downstream dependencies, retries failed tasks, and backfills historical windows on demand.
Executors are swappable — run locally for dev, scale out on Celery, or burst onto Kubernetes — without rewriting the pipeline logic.
A deep provider/operator ecosystem (AWS, GCP, Azure, Snowflake, dbt, Spark, and hundreds more) means most integrations are configuration, not glue code.
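The first two points can be illustrated with a toy sketch in plain Python. This is not Airflow's API or its scheduler implementation; it only assumes the standard-library `graphlib` module (Python 3.9+), and the names `run_pipeline`, `make_extract`, and `load` are hypothetical, chosen for illustration. It shows tasks generated in a loop, executed in dependency order, with a failed task retried:

```python
from graphlib import TopologicalSorter


def run_pipeline(tasks, upstream, max_retries=2):
    """Run zero-argument callables in dependency order, retrying failures.

    tasks:    dict mapping task name -> callable
    upstream: dict mapping task name -> set of task names it depends on
    Returns a dict mapping task name -> number of attempts it took.
    """
    attempts = {}
    # static_order() yields every task after all of its upstream tasks.
    for name in TopologicalSorter(upstream).static_order():
        for attempt in range(1, max_retries + 2):
            try:
                tasks[name]()
                attempts[name] = attempt
                break
            except Exception:
                # Give up once the retry budget is exhausted.
                if attempt == max_retries + 1:
                    raise
    return attempts


log = []
load_calls = {"count": 0}


def make_extract(table):
    # Factory so each generated task remembers its own table name.
    def extract():
        log.append(f"extract_{table}")
    return extract


def load():
    # Hypothetical flaky task: fails on its first call, succeeds on the second.
    load_calls["count"] += 1
    if load_calls["count"] == 1:
        raise RuntimeError("transient failure")
    log.append("load")


# Generate one extract task per table in a loop, then fan in to a single load.
tables = ["orders", "users"]
tasks = {f"extract_{t}": make_extract(t) for t in tables}
tasks["load"] = load
upstream = {f"extract_{t}": set() for t in tables}
upstream["load"] = {f"extract_{t}" for t in tables}

attempts = run_pipeline(tasks, upstream)
```

In this sketch `load` only starts after both extract tasks have finished, and its first failure is absorbed by a retry. A real Airflow deployment adds what the sketch leaves out: scheduling by interval, persisted run history, parallel execution through an executor, and backfills.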

Who It's For

Great fit if you orchestrate scheduled, batch-oriented data and ML pipelines and want dependencies, observability, and history as first-class concerns. Look elsewhere if your work is low-latency streaming or needs event-driven, sub-second reactions: Airflow is built around scheduled batch intervals, and its scheduler latency and operational footprint are real costs for very simple jobs or near-real-time workloads.

Information

Website: github.com
Organizations: Apache Software Foundation, Airbnb
Authors: Apache Software Foundation, Maxime Beauchemin (originated at Airbnb)
Published date: 2014/10/01
