The VLM Run API is a unified platform for production-ready multimodal AI. Use it to extract structured data from documents, images, videos, and audio — or run complex multi-step workflows with visual agents.
  • Base URL: https://api.vlm.run/v1
  • Authentication: Authorization: Bearer <VLMRUN_API_KEY>
  • Models Supported:
    • Requests: vlm-1
    • Agent Executions / Chat Completions: vlmrun-orion-1:auto, vlmrun-orion-1:fast, vlmrun-orion-1:pro
Access your API keys in our dashboard.

Structured Extraction

Use the Generate endpoints to extract structured JSON from images, documents, audio, and video.
from pathlib import Path
from vlmrun.client import VLMRun
from vlmrun.client.types import PredictionResponse

# Initialize the client
client = VLMRun(api_key="<VLMRUN_API_KEY>")

# Document -> JSON
response: PredictionResponse = client.document.generate(
    file=Path("path/to/document.pdf"),
    model="vlm-1",
    domain="document.invoice",
)
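
Once the call returns, the structured result can be persisted as plain JSON. A minimal sketch, assuming the parsed payload is a JSON-serializable dict — the `invoice` dict below is a hypothetical stand-in for the response payload, and its field names are illustrative, not the official `document.invoice` schema:

import json
from pathlib import Path

# Hypothetical stand-in for the structured payload returned by the API;
# the real field names depend on the "document.invoice" domain schema.
invoice = {"invoice_number": "INV-001", "total": 123.45}

# Persist the structured result next to the source document.
out = Path("invoice.json")
out.write_text(json.dumps(invoice, indent=2))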

Chat Completions & Agent Executions

Use the Chat Completions endpoint for interactive multi-modal conversations, or the Agent Executions endpoint for batch execution workflows.
from vlmrun.client import VLMRun

# Initialize the VLM Run client
client = VLMRun(api_key="<VLMRUN_API_KEY>")

# Create a chat completion
response = client.agent.completions.create(
    model="vlmrun-orion-1:auto",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What do you see in this image?"},
                {"type": "image_url", "image_url": {"url": "https://example.com/image.jpg"}}
            ]
        }
    ],
    max_tokens=1000
)
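
The example above passes a remote URL. For local files, OpenAI-style chat APIs commonly accept base64 data URLs in the `image_url` field; a sketch under that assumption (verify against the API reference before relying on it):

import base64
from pathlib import Path

def to_data_url(path: str, mime: str = "image/jpeg") -> str:
    """Encode a local image as a base64 data URL for an image_url content part."""
    encoded = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    return f"data:{mime};base64,{encoded}"

# Swap into the message content in place of the remote URL:
# {"type": "image_url", "image_url": {"url": to_data_url("photo.jpg")}}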