A benchmark for long-horizon computer-use agents that must orchestrate GUI, CLI, and code operations within a single trajectory, spanning 114 real-world tasks. Agents are evaluated on a real Ubuntu desktop and scored by a trajectory-aware judge that inspects deliverables, artifacts, and action traces; the top PassRate is ~41.2%.
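
To make the headline number concrete, here is a minimal sketch of how a PassRate could be aggregated from per-task verdicts. The summary does not define PassRate, so this assumes it is simply the percentage of tasks the judge marks as passed; the function name `pass_rate` and the verdict list are hypothetical. The count of 47 passes is chosen only because 47 of 114 rounds to 41.2%, not because it is a reported figure.

```python
from typing import Sequence


def pass_rate(verdicts: Sequence[bool]) -> float:
    """Return the percentage of tasks whose verdict is True.

    Assumption: PassRate is the plain fraction of passed tasks,
    which the source summary does not confirm.
    """
    if not verdicts:
        raise ValueError("no verdicts to aggregate")
    return 100.0 * sum(verdicts) / len(verdicts)


# Hypothetical verdicts for 114 tasks: 47 passed, 67 failed.
verdicts = [True] * 47 + [False] * 67
print(f"{pass_rate(verdicts):.1f}%")  # prints "41.2%"
```

Under this assumption, a top PassRate of ~41.2% would mean that fewer than half of the 114 tasks are completed successfully.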