KTransformers is a flexible framework for experiencing cutting-edge optimizations in LLM inference and fine-tuning, focusing on CPU-GPU heterogeneous computing. It consists of two core modules: kt-kernel for high-performance inference kernels and kt-sft for fine-tuning. The project supports various hardware and models like DeepSeek series, Kimi-K2, achieving significant resource savings and speedups, such as reducing GPU memory for a 671B model to 70GB and up to 28x acceleration.
A lightweight open-source platform for running, managing, and integrating large language models locally via a simple CLI and REST API.
ONNX (Open Neural Network Exchange) is an open ecosystem that provides an open source format for AI models, including deep learning and traditional ML. It defines an extensible computation graph model, built-in operators, and standard data types, focusing on inferencing capabilities. Widely supported across frameworks and hardware, it enables interoperability and accelerates AI innovation.
Streamlit is an open-source app framework that turns Python scripts into shareable web apps in minutes. It enables data scientists and AI/ML engineers to build interactive data apps like dashboards, reports, or chat apps using pure Python, without front-end experience.