LogoAIAny

Tag

Explore by tags

  • All
  • 30u30
  • ASR
  • ChatGPT
  • GNN
  • IDE
  • RAG
  • ai-agent
  • ai-api
  • ai-api-management
  • ai-client
  • ai-coding
  • ai-development
  • ai-framework
  • ai-image
  • ai-inference
  • ai-leaderboard
  • ai-library
  • ai-rank
  • ai-serving
  • ai-tools
  • ai-train
  • ai-video
  • ai-workflow
  • AIGC
  • alibaba
  • amazon
  • anthropic
  • audio
  • blog
  • book
  • chatbot
  • chemistry
  • claude
  • course
  • deepmind
  • deepseek
  • engineering
  • foundation
  • foundation-model
  • gemini
  • google
  • gradient-boosting
  • grok
  • huggingface
  • LLM
  • math
  • mcp
  • mcp-client
  • mcp-server
  • meta-ai
  • microsoft
  • mlops
  • NLP
  • nvidia
  • openai
  • paper
  • physics
  • plugin
  • RL
  • science
  • translation
  • tutorial
  • vibe-coding
  • video
  • vision
  • xAI
  • xai

TensorFlow Serving (2016) · Google
An open-source, production-ready system for serving machine-learning models at scale.
Tags: ai-development, ai-library, ai-inference, ai-serving, google
vLLM (2023) · Woosuk Kwon, Zhuohan Li +7
vLLM is a high-throughput, memory-efficient inference and serving engine for large language models (LLMs), built to deliver state-of-the-art performance on GPUs with features such as PagedAttention and continuous batching.
Tags: ai-development, ai-library, ai-inference, ai-serving
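The PagedAttention idea named in the vLLM entry can be illustrated with a toy sketch: the KV cache is carved into fixed-size physical blocks, each sequence keeps a small block table mapping token positions to blocks, and finished sequences return their blocks to a shared pool. All names here are illustrative, not vLLM's actual API.

```python
# Toy sketch of PagedAttention-style KV-cache bookkeeping (not vLLM's real API).
class BlockAllocator:
    def __init__(self, num_blocks: int, block_size: int):
        self.block_size = block_size
        self.free = list(range(num_blocks))      # pool of physical block ids
        self.tables = {}                         # seq_id -> list of block ids

    def append_token(self, seq_id: str, pos: int) -> int:
        """Return the physical block holding token `pos`, allocating on demand."""
        table = self.tables.setdefault(seq_id, [])
        if pos // self.block_size >= len(table):  # crossed into a new block
            table.append(self.free.pop())
        return table[pos // self.block_size]

    def release(self, seq_id: str) -> None:
        """A finished sequence hands its blocks back to the shared pool."""
        self.free.extend(self.tables.pop(seq_id, []))


alloc = BlockAllocator(num_blocks=8, block_size=4)
for i in range(6):                    # 6 tokens span two 4-token blocks
    alloc.append_token("seq-a", i)
used_by_a = len(alloc.tables["seq-a"])
alloc.release("seq-a")                # all blocks are reusable again
```

Because memory grows one block at a time instead of being pre-reserved for a sequence's maximum length, many more concurrent sequences fit in the same GPU memory, which is what enables continuous batching.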
SGLang (2023) · Lianmin Zheng, Liangsheng Yin +10
An open-source, high-performance framework and DSL for serving large language and vision-language models with low-latency, controllable, structured generation.
Tags: ai-development, ai-library, ai-inference, ai-serving
Ollama (2023) · Jeffrey Morgan, Michael Chiang
A lightweight open-source platform for running, managing, and integrating large language models locally via a simple CLI and REST API.
Tags: ai-development, ai-library, ai-inference, ai-serving, LLM
TensorRT (2016) · NVIDIA
NVIDIA TensorRT is an SDK and tool suite that compiles and optimizes trained neural-network models for fast, low-latency inference on NVIDIA GPUs.
Tags: ai-development, ai-library, ai-inference, ai-serving, nvidia
Ray (2017) · RISELab (UC Berkeley), Anyscale Inc.
Ray is an open-source distributed compute engine that lets you scale Python and AI workloads, from data processing to model training and serving, without deep distributed-systems expertise.
Tags: ai-development, ai-framework, ai-train, ai-serving
KServe (2018) · KServe Community
A CNCF-incubating model-inference platform (formerly KFServing) that provides Kubernetes CRDs for scalable predictive and generative workloads.
Tags: ai-development, ai-inference, ai-serving
Triton (2018) · NVIDIA
An open-source, high-performance server for deploying and scaling AI/ML models on GPUs or CPUs, with support for multiple frameworks and cloud or edge targets.
Tags: ai-development, ai-inference, ai-serving, nvidia
OpenVINO (2018) · Intel
OpenVINO is an open-source toolkit from Intel that streamlines the optimization and deployment of AI inference models across a wide range of Intel® hardware.
Tags: ai-development, ai-inference, ai-serving
ONNX Runtime (2018) · Microsoft
Microsoft’s high-performance, cross-platform inference engine for ONNX and GenAI models.
Tags: ai-development, ai-inference, ai-serving
BentoML (2019) · BentoML Team
An open-source framework for building, shipping, and running containerized AI services with a single command.
Tags: ai-development, ai-inference, ai-serving
Text-Generation-Inference (2022) · Hugging Face
Hugging Face’s Rust + Python server for high-throughput, multi-GPU text generation.
Tags: ai-development, ai-inference, ai-serving