<div align='center'>
<h1>
The Easiest Way to Deploy Agents, MCP Servers, RAG, Pipelines, and Any Model.
<br/>
No MLOps. No YAML.
</h1>
<img alt="Lightning" src="https://pl-bolts-doc-images.s3.us-east-2.amazonaws.com/app-2/ls_banner2.png" width="800px" style="max-width: 100%;">
</div>
LitServe lets you serve any model (vision, audio, text) and build full AI systems - agents, chatbots, MCP servers, RAG, pipelines - with full control, batching, multi-GPU, streaming, custom logic, and multi-model support, all without YAML. Unlike most serving engines that serve one model with rigid abstractions, LitServe gives you the flexibility to build complex AI systems.
Self-host, or deploy in one click to [Lightning AI](https://lightning.ai/).
<div align='center'>
✅ Build full AI systems ✅ 2× faster than FastAPI ✅ Agents, RAG, pipelines, more
✅ Custom logic + control ✅ Any PyTorch model ✅ Self-host or managed
✅ Multi-GPU autoscaling ✅ Batching + streaming ✅ BYO model or vLLM
✅ No MLOps glue code ✅ Easy setup in Python ✅ Serverless support
<div align='center'>
[Downloads](https://pepy.tech/projects/litserve)
[Discord](https://discord.gg/WajDThKAur)

[Codecov](https://codecov.io/gh/Lightning-AI/litserve)
[License](https://github.com/Lightning-AI/litserve/blob/main/LICENSE)
</div>
</div>
<div align="center">
<div style="text-align: center;">
<a href="#quick-start">Quick Start</a> •
<a href="#featured-examples">Examples</a> •
<a href="#features">Features</a> •
<a href="#performance">Performance</a> •
<a href="#host-anywhere">Hosting</a> •
<a href="https://lightning.ai/docs/litserve">Docs</a>
</div>
</div>
<div align="center">
<a href="https://lightning.ai/docs/litserve/home/get-started">
<img src="https://pl-bolts-doc-images.s3.us-east-2.amazonaws.com/app-2/get-started-badge.svg" height="36px" alt="Get started"/>
</a>
</div>
## Quick Start
Install LitServe via pip ([more options](https://lightning.ai/docs/litserve/home/install)):
```bash
pip install litserve
```

Examples:

```python
import litserve as ls

# Define the API to include any number of models, databases, etc.
class InferencePipeline(ls.LitAPI):
    def setup(self, device):
        self.model1 = lambda x: x**2
        self.model2 = lambda x: x**3

    def predict(self, request):
        x = request["input"]
        # Perform calculations using both models
        a = self.model1(x)
        b = self.model2(x)
        c = a + b
        return {"output": c}

if __name__ == "__main__":
    # 12+ features like batching, streaming, etc.
    server = ls.LitServer(InferencePipeline(max_batch_size=1), accelerator="auto")
    server.run(port=8000)
```
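Beyond `setup` and `predict`, each request flows through hooks on `LitAPI` that you can override. Here's a sketch of the same pipeline with `decode_request` and `encode_response` split out, so input parsing and output shaping live in their own steps:

```python
import litserve as ls

class PipelineWithHooks(ls.LitAPI):
    def setup(self, device):
        self.model1 = lambda x: x**2
        self.model2 = lambda x: x**3

    def decode_request(self, request):
        # Pull the model input out of the raw JSON payload.
        return request["input"]

    def predict(self, x):
        return self.model1(x) + self.model2(x)

    def encode_response(self, output):
        # Shape the model output into the JSON response.
        return {"output": output}
```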
Deploy for free to Lightning cloud (or self-host anywhere):

```bash
# Deploy for free with autoscaling, monitoring, etc.
lightning deploy server.py --cloud

# Or run locally (self-host anywhere)
lightning deploy server.py
# ...or run the script directly: python server.py
```
Test the server by simulating an HTTP request from any terminal:

```bash
curl -X POST http://127.0.0.1:8000/predict -H "Content-Type: application/json" -d '{"input": 4.0}'
```
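You can also call the endpoint from Python. A minimal client sketch, assuming the server above is running locally on port 8000:

```python
import requests

# Call the InferencePipeline server defined above.
response = requests.post(
    "http://127.0.0.1:8000/predict",
    json={"input": 4.0},
)
# With the default JSON encoding this should print {'output': 80.0},
# since 4.0**2 + 4.0**3 == 80.0.
print(response.json())
```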
**Agent example:** an endpoint that fetches a web page and asks an LLM to summarize the latest news on it:

```python
import re, requests, openai
import litserve as ls

class NewsAgent(ls.LitAPI):
    def setup(self, device):
        self.openai_client = openai.OpenAI(api_key="OPENAI_API_KEY")

    def predict(self, request):
        website_url = request.get("website_url", "https://text.npr.org/")
        website_text = re.sub(r'<[^>]+>', ' ', requests.get(website_url).text)

        # Ask the LLM to tell you about the news
        llm_response = self.openai_client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "user", "content": f"Based on this, what is the latest: {website_text}"}],
        )
        output = llm_response.choices[0].message.content.strip()
        return {"output": output}

if __name__ == "__main__":
    server = ls.LitServer(NewsAgent())
    server.run(port=8000)
```
Test it:

```bash
curl -X POST http://127.0.0.1:8000/predict -H "Content-Type: application/json" -d '{"website_url": "https://text.npr.org/"}'
```
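For LLM-backed endpoints like this, responses are usually streamed back chunk by chunk rather than returned all at once. A minimal streaming sketch, assuming LitServe's `stream=True` server flag and generator-style `predict`/`encode_response`; the stand-in model here is hypothetical:

```python
import litserve as ls

class StreamingAPI(ls.LitAPI):
    def setup(self, device):
        # Stand-in "model" that emits a few chunks; swap in a real LLM client.
        self.model = lambda prompt: (f"chunk-{i} " for i in range(5))

    def decode_request(self, request):
        return request["input"]

    def predict(self, prompt):
        # Yield partial outputs instead of returning one final response.
        yield from self.model(prompt)

    def encode_response(self, output_stream):
        for chunk in output_stream:
            yield {"output": chunk}

if __name__ == "__main__":
    server = ls.LitServer(StreamingAPI(), stream=True)
    server.run(port=8000)
```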
Here are a few key benefits of using LitServe:

- **Full control, no glue code:** write plain Python logic; no YAML or MLOps config files.
- **Load once, serve many:** models load a single time in `setup()` and are reused across requests.
- **Fast:** 2× faster than a plain FastAPI server, with batching, streaming, and multi-GPU autoscaling built in.
- **Flexible:** serve any PyTorch model, bring your own engine like vLLM, and compose multiple models per endpoint ([more](https://lightning.ai/docs/litserve)).

> ⚠️ Not a vLLM or Ollama alternative out of the box. LitServe gives you lower-level flexibility to build what they do (and more) if you need it.
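As an example of that lower-level control, here's a minimal batching sketch. It follows the `max_batch_size` usage from the pipeline example above; passing `batch_timeout` to the API constructor is an assumption here. With batching enabled, `predict` receives a list of decoded requests and must return one result per request:

```python
import litserve as ls

class BatchedPipeline(ls.LitAPI):
    def setup(self, device):
        self.model = lambda x: x**2

    def predict(self, batch):
        # `batch` is a list of decoded requests; keep outputs in request order.
        return [{"output": self.model(req["input"])} for req in batch]

if __name__ == "__main__":
    # Requests arriving within ~50 ms are grouped, up to 8 at a time (assumed knobs).
    api = BatchedPipeline(max_batch_size=8, batch_timeout=0.05)
    ls.LitServer(api, accelerator="auto").run(port=8000)
```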
## Featured Examples

Here are examples of inference pipelines for common model types and use cases.

- **Toy model:** <a href="#define-a-server">Hello world</a>
- **LLMs:** <a href="https://lightning.ai/lightning-ai/studios/deploy-llama-3-2-vision-with-litserve">Llama 3.2</a>, <a href="https://lightning.ai/lightning-ai/studios/openai-fault-tolerant-proxy-server">LLM Proxy server</a>, <a href="https://lightning.ai/lightning-ai/studios/deploy-ai-agent-with-tool-use">Agent with tool use</a>
- **RAG:** <a href="https://lightning.ai/lightning