Tianshou

Tianshou is a high-performance, modular deep reinforcement learning library built on pure PyTorch. It offers a high-level API for quickly building applications and experiments, and a procedural API for fine-grained algorithm development. Tianshou supports online/offline RL, multi-agent setups, many standard algorithms (DQN, PPO, SAC, TD3, CQL, etc.), vectorized environments, EnvPool integration, recurrent models, multi-GPU training, and logging integrations (TensorBoard, W&B). It emphasizes software quality and reproducible results.

Introduction

Overview

Tianshou is an open-source reinforcement learning (RL) library implemented in PyTorch. Designed to balance usability and flexibility, it provides two complementary API levels: a high-level interface for application development and experiment configuration, and a low-level procedural (algorithmic) API for researchers implementing new RL algorithms.

Key features
  • Modular design: clear separation between Algorithms, Policies, and training logic, enabling concise and maintainable implementations.
  • Dual APIs:
    • High-level API (ExperimentBuilder, pre-configured training loops) for fast prototyping and running experiments.
    • Procedural API for full control over data collection, replay buffers, optimizers, and update rules.
  • Wide algorithm coverage: implementations of value-based and policy-gradient methods (DQN, Double/Dueling DQN, C51, QRDQN, IQN, PPO, TRPO, A2C, DDPG, TD3, SAC), offline RL algorithms (BCQ, CQL, CRR, TD3+BC), imitation learning (GAIL), and useful techniques like PER, GAE, HER, ICM.
  • Performance and scale:
    • Vectorized environment support (sync/async) and experimental integration with the high-throughput EnvPool library.
    • Multi-GPU training support and optimized components (n-step returns, PER using numba/numpy optimizations).
  • Flexible environment and data types: supports arbitrary observation/action structures (dicts, custom classes) and recurrent networks for POMDPs; see the Batch sketch after this list.
  • Logging and reproducibility: TensorBoard and Weights & Biases (W&B) logging integrations, plus a thorough test suite that includes full training runs to help ensure reproducible behavior.
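
For example, Tianshou's Batch container transparently handles nested, dict-style observation data. A minimal sketch (the shapes here are illustrative):

```python
import numpy as np

from tianshou.data import Batch

# Batch holds arbitrarily nested arrays; dict keys become attributes.
b = Batch(
    obs={"camera": np.zeros((4, 84, 84)), "state": np.zeros((4, 3))},
    act=np.array([0, 1, 0, 1]),
)
print(b.obs.camera.shape)           # (4, 84, 84)
print(b[0])                         # indexing slices every nested array consistently
print(Batch.cat([b, b]).act.shape)  # (8,)
```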

Installation & requirements
  • Hosted on PyPI and conda-forge; requires Python >= 3.11.
  • Recommended developer install via Poetry (repository clone + poetry install), with optional extras for mujoco, envpool, atari, etc.
  • Alternatively: pip install tianshou (PyPI release) or pip install git+https://github.com/thu-ml/tianshou.git@master --upgrade.

Typical usage
  • High-level: configure an ExperimentBuilder (e.g., DQNExperimentBuilder) to declare the environment factory, training configuration, and algorithm parameters, then call .build().run() to launch the experiment (first sketch below).
  • Procedural: construct environments, networks, policies, and algorithms manually, wire them together with Collectors and ReplayBuffers, and drive training through the trainer API for full control (second sketch below).
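
A minimal high-level sketch, following the DQN example in the project README; class and parameter names track the 1.x high-level API and may differ in other releases:

```python
from tianshou.highlevel.config import SamplingConfig
from tianshou.highlevel.env import EnvFactoryRegistered, VectorEnvType
from tianshou.highlevel.experiment import DQNExperimentBuilder, ExperimentConfig
from tianshou.highlevel.params.policy_params import DQNParams

# Names follow the 1.x README; check your installed version's docs.
experiment = (
    DQNExperimentBuilder(
        EnvFactoryRegistered(task="CartPole-v1", seed=0, venv_type=VectorEnvType.DUMMY),
        ExperimentConfig(),
        SamplingConfig(num_epochs=10, step_per_epoch=10000, batch_size=64),
    )
    .with_dqn_params(DQNParams(lr=1e-3, discount_factor=0.9, target_update_freq=320))
    .with_model_factory_default(hidden_sizes=(64, 64))
    .build()
)
experiment.run()
```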
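
And a condensed procedural sketch mirroring the README's DQN walkthrough; hyperparameters are illustrative, and the trainer interface has changed across releases:

```python
import gymnasium as gym
import torch

import tianshou as ts
from tianshou.utils.net.common import Net

env = gym.make("CartPole-v1")
train_envs = ts.env.DummyVectorEnv([lambda: gym.make("CartPole-v1") for _ in range(10)])
test_envs = ts.env.DummyVectorEnv([lambda: gym.make("CartPole-v1") for _ in range(100)])

# Q-network and optimizer
net = Net(state_shape=env.observation_space.shape, action_shape=env.action_space.n,
          hidden_sizes=[128, 128])
optim = torch.optim.Adam(net.parameters(), lr=1e-3)

policy = ts.policy.DQNPolicy(model=net, optim=optim, action_space=env.action_space,
                             discount_factor=0.9, estimation_step=3, target_update_freq=320)

# Collectors tie the policy to vectorized envs and a replay buffer.
train_collector = ts.data.Collector(policy, train_envs,
                                    ts.data.VectorReplayBuffer(20000, 10),
                                    exploration_noise=True)
test_collector = ts.data.Collector(policy, test_envs)

result = ts.trainer.OffpolicyTrainer(
    policy=policy, train_collector=train_collector, test_collector=test_collector,
    max_epoch=10, step_per_epoch=10000, step_per_collect=10, update_per_step=0.1,
    episode_per_test=100, batch_size=64,
    train_fn=lambda epoch, env_step: policy.set_eps(0.1),   # exploration during training
    test_fn=lambda epoch, env_step: policy.set_eps(0.05),   # near-greedy evaluation
).run()
print(result)
```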

Who maintains it / citation

Tianshou is developed and maintained by contributors from the Tsinghua AI / THU-ML community and collaborators. If you use Tianshou in a publication, the authors request that you cite the JMLR paper: "Tianshou: A Highly Modularized Deep Reinforcement Learning Library" (JMLR, 2022).

When to use

Use Tianshou when you need:

  • A research-friendly RL codebase that is easy to extend to new algorithms;
  • A practical framework for training RL agents with performant vectorized sampling and integrations (EnvPool, MuJoCo, Atari, PyBullet);
  • An RL library with strong engineering practices and reproducibility-focused tests.

Information

  • Website: github.com
  • Authors: Jiayi Weng, Huayu Chen, Dong Yan, Kaichao You, Alexis Duburcq, Minghao Zhang, Yi Su, Hang Su, Jun Zhu, thu-ml (Tsinghua ML group)
  • Published date: 2018/04/16