
vLLM

vLLM is a high-throughput, memory-efficient inference and serving engine for large language models (LLMs), built to deliver state-of-the-art performance on GPUs with features such as PagedAttention and continuous batching.

Introduction
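
As a minimal illustration of the serving engine described above, the sketch below shows vLLM's offline batched inference API. It assumes vLLM is installed (`pip install vllm`) and a supported GPU is available; the model name and prompts are placeholders chosen for illustration, not taken from the project's documentation. PagedAttention and continuous batching are applied internally by the engine, so the caller only submits prompts and sampling parameters.

```python
# Minimal offline inference sketch with vLLM (illustrative model and prompts).
from vllm import LLM, SamplingParams

# Load a small model; the engine manages KV-cache paging and batching itself.
llm = LLM(model="facebook/opt-125m")
sampling_params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)

prompts = [
    "The capital of France is",
    "Large language models are",
]

# generate() batches the prompts and returns one RequestOutput per prompt.
outputs = llm.generate(prompts, sampling_params)
for output in outputs:
    print(output.prompt, output.outputs[0].text)
```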


Information

  • Website: docs.vllm.ai
  • Authors: Woosuk Kwon, Zhuohan Li, Siyuan Zhuang, Ying Sheng, Lianmin Zheng, Cody Hao Yu, Joey Gonzalez, Hao Zhang, Ion Stoica
  • Published date: 2023/06/20

Categories