Ollama
A lightweight open-source platform for running, managing, and integrating large language models locally via a simple CLI and REST API.
Introduction
Ollama lets developers pull, run, and customize state-of-the-art open-source LLMs such as Llama 3, Qwen, and Gemma directly on macOS, Linux, and Windows machines. Its Go-based runtime provides a command-line interface (ollama run, ollama list, etc.) and an OpenAI-compatible REST API, making local models drop-in replacements for cloud endpoints. Beyond basic chat completion, Ollama supports embeddings, tool/function calling, structured JSON outputs, streaming responses, and multi-modal vision models. The project ships pre-built binaries with GPU acceleration (NVIDIA, AMD, Apple Silicon) and can also run in Docker. A growing model library and Python/JavaScript client SDKs simplify integration into RAG pipelines, VS Code extensions, and other AI-powered apps. Founded by Jeffrey Morgan and Michael Chiang (YC W21), Ollama is fully open source under the MIT license and has an active community on GitHub and Discord.
Information
- Website: ollama.ai
- Authors: Jeffrey Morgan, Michael Chiang
- Published date: 2023/08/01
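The REST API mentioned above can be exercised with a short sketch. This is a minimal example, assuming an Ollama server on its default port 11434 and a model (here `llama3`, fetched beforehand with `ollama pull llama3`) as placeholders; the helper only builds the request, and the commented lines show how to send it.

```python
import json
import urllib.request

# Ollama listens on http://localhost:11434 by default. Its native chat
# endpoint is /api/chat; an OpenAI-compatible one is served at
# /v1/chat/completions.
def build_chat_request(prompt: str, model: str = "llama3") -> urllib.request.Request:
    """Build (but do not send) a request against Ollama's native chat API."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # set True to receive newline-delimited JSON chunks
    }
    return urllib.request.Request(
        "http://localhost:11434/api/chat",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_chat_request("Why is the sky blue?")
# Sending the request requires a running Ollama server:
#   with urllib.request.urlopen(req) as resp:
#       print(json.loads(resp.read())["message"]["content"])
```

Because the endpoint is OpenAI-compatible, swapping the URL for `/v1/chat/completions` lets existing OpenAI client code point at the local server unchanged.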