Tag
Explore by tags
2024
MADSys Lab, Tsinghua University, Approaching.AI +17
KTransformers is a flexible framework for experiencing cutting-edge optimizations in LLM inference and fine-tuning, focusing on CPU-GPU heterogeneous computing. It consists of two core modules: kt-kernel for high-performance inference kernels and kt-sft for fine-tuning. The project supports various hardware and models like DeepSeek series, Kimi-K2, achieving significant resource savings and speedups, such as reducing GPU memory for a 671B model to 70GB and up to 28x acceleration.
