
# Chat Completions

> Handle chat completion requests.

This endpoint accepts both authenticated and public (unauthenticated) requests.
Guest users are limited to 10 chats per day per browser ID.
Authenticated free users have no daily chat cap once they sign in.

<RequestExample>
  ```python Python theme={"theme":{"light":"github-light","dark":"dark-plus"}}
  # Install the SDK first (run in a shell): pip install vlmrun

  from vlmrun.client import VLMRun

  # Initialize the VLM Run client
  client = VLMRun(api_key="<VLMRUN_API_KEY>")

  # Create a chat completion
  response = client.agent.completions.create(
      model="vlmrun-orion-1:auto",
      messages=[{"role": "user", "content": "Who are you and what can you do?"}],
      temperature=0.7,
  )
  print(response)
  ```

  ```python Python (OpenAI SDK) theme={"theme":{"light":"github-light","dark":"dark-plus"}}
  import openai

  # Initialize the OpenAI client
  client = openai.OpenAI(
      base_url="https://api.vlm.run/v1/openai",
      api_key="<VLMRUN_API_KEY>"
  )

  # Create a chat completion
  response = client.chat.completions.create(
      model="vlmrun-orion-1:auto",
      messages=[{"role": "user", "content": "Who are you and what can you do?"}],
      temperature=0.7,
  )
  print(response)
  ```

  ```typescript Node.js SDK theme={"theme":{"light":"github-light","dark":"dark-plus"}}
  import { VlmRun } from "vlmrun";

  // Initialize the VLM Run client
  const client = new VlmRun({
    baseURL: "https://api.vlm.run/v1",
    apiKey: "<VLMRUN_API_KEY>"
  });

  // Create a chat completion
  const response = await client.agent.completions.create({
    model: "vlmrun-orion-1:auto",
    messages: [{ role: "user", content: "Who are you and what can you do?" }],
    temperature: 0.7,
  });
  console.log(response);
  ```
</RequestExample>
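
The request body accepts more than a plain text prompt. As a minimal sketch based on the `AgentChatCompletionsRequest` schema in the OpenAPI section, the following builds a payload that combines a text part and an `image_url` part in one user message, requests JSON-schema output through `response_format`, and restricts tools with `toolsets`. The helper name `build_request`, the example image URL, and the invoice schema are illustrative assumptions and not part of the API. The snippet only constructs and prints the JSON; it sends nothing.

```python
import json


def build_request(prompt: str, image_url: str) -> dict:
    """Build a chat completions payload (hypothetical helper, not part of any SDK)."""
    return {
        "model": "vlmrun-orion-1:auto",
        "messages": [
            {
                "role": "user",
                # Content may be a list of typed parts instead of a plain string.
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_url, "detail": "auto"}},
                ],
            }
        ],
        # JSONSchemaResponseFormat: "schema" is the only required field.
        "response_format": {
            "type": "json_schema",
            "schema": {
                "type": "object",
                "properties": {"total": {"type": "number"}},
                "required": ["total"],
            },
        },
        # Only tools from these categories are made available.
        "toolsets": ["core", "document"],
        "stream": False,
    }


payload = build_request(
    "Extract the invoice total.",
    "https://example.com/invoice.png",  # placeholder URL (assumption)
)
print(json.dumps(payload, indent=2))
```

The resulting dictionary can be sent as the JSON body of `POST /v1/openai/chat/completions` with a bearer token. With the OpenAI Python SDK, fields outside the standard `create` signature (such as `toolsets`) can be passed through the `extra_body` argument. Whether the server treats them identically on that path is an assumption to verify.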


## OpenAPI

````yaml POST /v1/openai/chat/completions
openapi: 3.1.0
info:
  title: VLM Run Unified Server
  description: Unified server for VLM Run Agent and API
  termsOfService: https://vlm.run/terms-of-service
  contact:
    name: VLM Run Support Team
    url: https://vlm.run/
    email: support@vlm.run
  version: 2026-05-19.0
servers: []
security: []
paths:
  /v1/openai/chat/completions:
    post:
      tags:
        - chat
      summary: Chat Completions
      description: >-
        Handle chat completion requests.


        Supports both authenticated and public (unauthenticated) requests.

        Guest users are limited to 10 chats/day per browser id.

        Authenticated free users are not subject to a daily chat cap after they
        sign in.
      operationId: chat_completions_v1_openai_chat_completions_post
      parameters:
        - name: user-agent
          in: header
          required: false
          schema:
            anyOf:
              - type: string
              - type: 'null'
            title: User-Agent
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/AgentChatCompletionsRequest'
      responses:
        '201':
          description: Successful Response
          content:
            application/json:
              schema: {}
        '422':
          description: Validation Error
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/HTTPValidationError'
      security:
        - HTTPBearer: []
components:
  schemas:
    AgentChatCompletionsRequest:
      properties:
        id:
          type: string
          title: Id
          description: ID of the completion
        model:
          anyOf:
            - type: string
              enum:
                - vlmrun-orion-1
                - vlmrun-orion-1:lite
                - vlmrun-orion-1:auto
                - vlmrun-orion-1:fast
                - vlmrun-orion-1:pro
                - vlmrun-orion-2
                - vlmrun-orion-2:lite
                - vlmrun-orion-2:auto
                - vlmrun-orion-2:qwen3.6-35b-a3b
                - vlmrun-orion-2:gemma4-26b-a4b
                - vlmrun-orion-2:kimi-2.6
                - vlmrun-orion-2:gpt-5.5
                - vlmrun-orion-2:claude-opus-4.8
                - vlmrun-orion-2:fast
                - vlmrun-orion-2:pro
            - type: string
            - type: 'null'
          title: Model
          description: >-
            VLM Run Agent model to use for completion. When omitted, the model
            from the skill's vlmrun.yaml is used; if the skill does not set one,
            the agent default applies.
        messages:
          items:
            $ref: '#/components/schemas/Message'
          type: array
          title: Messages
          description: Messages to complete
        max_tokens:
          type: integer
          title: Max Tokens
          description: Maximum number of tokens to generate
          default: 32768
        'n':
          anyOf:
            - type: integer
            - type: 'null'
          title: 'N'
          description: Number of completions to generate
          default: 1
        temperature:
          type: number
          title: Temperature
          description: Temperature of the sampling distribution
          default: 0
        top_p:
          type: number
          title: Top P
          description: >-
            Cumulative probability threshold for nucleus sampling. Only the
            highest-probability vocabulary tokens whose probabilities sum to
            this value are kept.
          default: 1
        top_k:
          anyOf:
            - type: integer
            - type: 'null'
          title: Top K
          description: >-
            Number of highest-probability vocabulary tokens to keep for top-k
            filtering
        logprobs:
          anyOf:
            - type: integer
            - type: 'null'
          title: Logprobs
          description: >-
            Number of most likely tokens for which to include log probabilities,
            in addition to the chosen tokens
        stream:
          type: boolean
          title: Stream
          description: Whether to stream the response or not
          default: false
        preview:
          anyOf:
            - type: boolean
            - items:
                type: string
              type: array
            - type: 'null'
          title: Preview
          description: >-
            Preview card selection for the agent. Accepts: ``True`` to enable
            all known card types, a ``list[str]`` of card discriminator names
            (e.g. ``["preview.image", "preview.grid"]``) for a focused set, or
            ``False``/``None`` to disable structured preview output. When
            enabled, the agent streams ``<preview data='{...json...}'/>`` lines
            through ``vlmrun.log.print(...)`` for the chosen card types.
        response_format:
          anyOf:
            - $ref: '#/components/schemas/JSONSchemaResponseFormat'
            - $ref: '#/components/schemas/JSONModeResponseFormat'
            - $ref: '#/components/schemas/JSONSchemaResponseFormatStrict'
            - type: 'null'
          title: Response Format
          description: Format of the response
        service_tier:
          anyOf:
            - type: string
              enum:
                - auto
                - default
                - standard
                - flex
                - priority
            - type: 'null'
          title: Service Tier
          description: >-
            Delivery tier for the request. 'standard'/'default' uses baseline
            rates, 'flex' applies a 50% discount with higher latency, and
            'priority' applies a 1.8x premium for highest reliability. When
            omitted (or 'auto'), the server default ('standard') applies;
            'flex' and 'priority' must be requested explicitly.
        session_id:
          type: string
          title: Session Id
          description: Session UUID for persisting the chat history
        metadata:
          anyOf:
            - additionalProperties: true
              type: object
            - type: 'null'
          title: Metadata
          description: >-
            Additional metadata for the request (e.g., dataset_name,
            experiment_id, etc.)
        skills:
          anyOf:
            - items:
                $ref: '#/components/schemas/AgentSkill'
              type: array
            - type: 'null'
          title: Skills
          description: List of agent skills to enable for this request.
        toolsets:
          anyOf:
            - items:
                $ref: '#/components/schemas/AgentToolset'
              type: array
            - type: 'null'
          title: Toolsets
          description: >-
            List of tool categories to enable for this request. Available
            categories: core, code-execution, document, image, image-gen, video,
            viz, web, world-gen. When specified, only tools from these
            categories are available. When omitted, streaming requests let the
            router agent select tools automatically, and non-streaming requests
            default to the 'core' toolset.
        models:
          anyOf:
            - items:
                $ref: '#/components/schemas/AgentModel'
              type: array
            - type: 'null'
          title: Models
          description: >-
            List of model-specific tool providers to enable for this request.
            Available models: depth-anything-3, google-gemini-3-analysis,
            google-gemini-3-image, google-gemini-robotics-er, google-veo-3.1,
            meta-sam2, meta-sam3, meta-sam3d, microsoft-omniparser-v2,
            nvidia-cosmos-reason-2-8b, qwen-qwen3-vl-8b, vlm-dots-ocr. Multiple
            models can be selected — their tools are merged and deduplicated.
            Model tools are added on top of the toolset-selected tools.
      additionalProperties: true
      type: object
      required:
        - messages
      title: AgentChatCompletionsRequest
      description: Request payload for the OpenAI chat completions API for VLM Run agents.
    HTTPValidationError:
      properties:
        detail:
          items:
            $ref: '#/components/schemas/ValidationError'
          type: array
          title: Detail
      type: object
      title: HTTPValidationError
    Message:
      properties:
        role:
          type: string
          enum:
            - user
            - assistant
            - system
            - developer
            - tool
          title: Role
          default: user
        content:
          anyOf:
            - type: string
            - items:
                $ref: '#/components/schemas/MessageContent'
              type: array
            - type: 'null'
          title: Content
        name:
          anyOf:
            - type: string
            - type: 'null'
          title: Name
        tool_call_id:
          anyOf:
            - type: string
            - type: 'null'
          title: Tool Call Id
        tool_calls:
          anyOf:
            - items:
                $ref: >-
                  #/components/schemas/ChatCompletionMessageFunctionToolCallParam
              type: array
            - type: 'null'
          title: Tool Calls
      type: object
      title: Message
    JSONSchemaResponseFormat:
      properties:
        type:
          type: string
          const: json_schema
          title: Type
          default: json_schema
        schema:
          additionalProperties: true
          type: object
          title: Schema
          description: JSON schema definition
      type: object
      required:
        - schema
      title: JSONSchemaResponseFormat
      description: Response format for JSON schema mode as per Fireworks AI specification.
    JSONModeResponseFormat:
      properties:
        type:
          type: string
          const: json_object
          title: Type
          default: json_object
      type: object
      title: JSONModeResponseFormat
      description: Response format for JSON object mode as per Fireworks AI specification.
    JSONSchemaResponseFormatStrict:
      properties:
        type:
          type: string
          const: json_schema
          title: Type
          default: json_schema
        json_schema:
          additionalProperties: true
          type: object
          title: Json Schema
          description: JSON schema definition
        name:
          type: string
          title: Name
          description: The name of the JSON schema
        strict:
          type: boolean
          title: Strict
          description: Whether to use strict mode for the JSON schema
          default: true
      type: object
      required:
        - json_schema
        - name
      title: JSONSchemaResponseFormatStrict
      description: Response format for JSON schema mode as per Fireworks AI specification.
    AgentSkill:
      properties:
        type:
          type: string
          title: Type
          description: >-
            The type of the skill. Use 'skill_reference' for DB-stored skills
            referenced by id/name. Use 'inline' to provide the skill as a
            base64-encoded zip bundle.
          default: skill_reference
        skill_id:
          anyOf:
            - type: string
            - type: 'null'
          title: Skill Id
          description: >-
            The unique identifier of the skill — a UUID or a name string (e.g.,
            'pillow', 'batch-processing').
        skill_name:
          anyOf:
            - type: string
            - type: 'null'
          title: Skill Name
          description: >-
            Human-readable skill name for lookup (e.g., 'invoice-extraction').
            Alternative to skill_id. Deprecated in favour of skill_id.
        skill_version:
          anyOf:
            - type: integer
            - type: string
          title: Skill Version
          description: The version of the skill — an integer (e.g. 2) or 'latest'.
          default: latest
        version:
          anyOf:
            - type: integer
            - type: string
            - type: 'null'
          title: Version
          description: 'DEPRECATED: Use ''skill_version'' instead. The version of the skill.'
        name:
          anyOf:
            - type: string
            - type: 'null'
          title: Name
          description: >-
            Human-readable name for the inline skill (used for discovery and
            logging).
        description:
          anyOf:
            - type: string
            - type: 'null'
          title: Description
          description: Short description of what the inline skill does.
        source:
          anyOf:
            - $ref: '#/components/schemas/InlineSkillSource'
            - type: 'null'
          description: >-
            Source payload for inline skills. Contains the base64-encoded zip
            bundle with type, media_type, and data fields.
        bundle:
          anyOf:
            - type: string
            - type: 'null'
          title: Bundle
          description: >-
            DEPRECATED: Use 'source.data' instead. Base64-encoded zip bundle
            containing the skill files (inline skills only).
      type: object
      title: AgentSkill
      description: >-
        A modular capability that extends the agent's functionality.


        Agent Skills are reusable, filesystem-based resources that provide the
        agent

        with domain-specific expertise: workflows, context, and best practices.


        Each skill packages instructions, metadata, and optional resources
        (scripts,

        templates, snippets) that the agent uses automatically when relevant.


        Two modes are supported:


        1. **Referenced skills** (``type="skill_reference"``) – Provide
        ``skill_id``
           (UUID or name) and optionally ``skill_version`` (integer or ``"latest"``).

           .. code-block:: json

               {"type": "skill_reference", "skill_id": "pillow", "skill_version": "latest"}

        2. **Inline skills** (``type="inline"``) – Supply ``name``,
        ``description``,
           and a ``source`` object containing the base64-encoded zip bundle.  The zip
           must contain exactly one ``SKILL.md`` file.  No database lookup is required.

           .. code-block:: json

               {
                   "type": "inline",
                   "name": "csv-insights",
                   "description": "Summarize CSV files.",
                   "source": {
                       "type": "base64",
                       "media_type": "application/zip",
                       "data": "<base64-zip>"
                   }
               }

           Legacy format with flat ``bundle`` field is also accepted for backward
           compatibility.
    AgentToolset:
      type: string
      enum:
        - core
        - code-execution
        - document
        - image
        - image-gen
        - video
        - viz
        - web
        - world-gen
      title: AgentToolset
      description: |-
        Available toolsets for agent tool selection.

        Each toolset represents a category of related tools that can be enabled
        together for an agent execution.
    AgentModel:
      type: string
      enum:
        - google-gemini-3-image
        - google-gemini-3-analysis
        - google-gemini-robotics-er
        - google-veo-3.1
        - microsoft-omniparser-v2
        - qwen-qwen3-vl-8b
        - meta-sam2
        - meta-sam3
        - meta-sam3d
        - depth-anything-3
        - vlm-dots-ocr
        - nvidia-cosmos-reason-2-8b
      title: AgentModel
      description: |-
        Available models for agent tool selection.

        Each model represents a specialized capability backed by a specific
        model deployment.  Multiple models can be selected simultaneously —
        pass a list and the tools are merged and deduplicated.

        Usage in vlmrun.yaml::

            model: vlmrun-orion-1:auto
            toolsets:
              - core
              - image
            models:
              - nvidia-cosmos-reason-2-8b
              - meta-sam3
    ValidationError:
      properties:
        loc:
          items:
            anyOf:
              - type: string
              - type: integer
          type: array
          title: Location
        msg:
          type: string
          title: Message
        type:
          type: string
          title: Error Type
        input:
          title: Input
        ctx:
          type: object
          title: Context
      type: object
      required:
        - loc
        - msg
        - type
      title: ValidationError
    MessageContent:
      properties:
        type:
          type: string
          enum:
            - text
            - image_url
            - video_url
            - audio_url
            - file_url
            - input_file
          title: Type
        text:
          anyOf:
            - type: string
            - type: 'null'
          title: Text
        image_url:
          anyOf:
            - $ref: '#/components/schemas/MessageContentImageUrl'
            - type: string
            - type: 'null'
          title: Image Url
        video_url:
          anyOf:
            - $ref: '#/components/schemas/MessageContentVideoUrl'
            - type: 'null'
        audio_url:
          anyOf:
            - $ref: '#/components/schemas/MessageContentAudioUrl'
            - type: 'null'
        file_url:
          anyOf:
            - $ref: '#/components/schemas/MessageContentFileUrl'
            - type: 'null'
        file_id:
          anyOf:
            - type: string
            - type: 'null'
          title: File Id
      type: object
      required:
        - type
      title: MessageContent
    ChatCompletionMessageFunctionToolCallParam:
      properties:
        id:
          type: string
          title: Id
        function:
          $ref: '#/components/schemas/Function'
        type:
          type: string
          const: function
          title: Type
      type: object
      required:
        - id
        - function
        - type
      title: ChatCompletionMessageFunctionToolCallParam
      description: A call to a function tool created by the model.
    InlineSkillSource:
      properties:
        type:
          type: string
          const: base64
          title: Type
          description: >-
            Encoding type for the inline skill data. Currently only 'base64' is
            supported.
          default: base64
        media_type:
          type: string
          title: Media Type
          description: MIME type of the skill bundle. Must be 'application/zip'.
          default: application/zip
        data:
          type: string
          title: Data
          description: Base64-encoded zip bundle containing the skill files.
      type: object
      required:
        - data
      title: InlineSkillSource
      description: |-
        Source payload for an inline skill bundle.

        Follows the OpenAI inline skill format::

            {
                "type": "base64",
                "media_type": "application/zip",
                "data": "<base64-encoded-zip>"
            }
    MessageContentImageUrl:
      properties:
        url:
          type: string
          title: Url
        detail:
          type: string
          enum:
            - auto
            - low
            - high
          title: Detail
          default: auto
      type: object
      required:
        - url
      title: MessageContentImageUrl
    MessageContentVideoUrl:
      properties:
        url:
          type: string
          title: Url
      type: object
      required:
        - url
      title: MessageContentVideoUrl
    MessageContentAudioUrl:
      properties:
        url:
          type: string
          title: Url
      type: object
      required:
        - url
      title: MessageContentAudioUrl
    MessageContentFileUrl:
      properties:
        url:
          type: string
          title: Url
      type: object
      required:
        - url
      title: MessageContentFileUrl
    Function:
      properties:
        arguments:
          type: string
          title: Arguments
        name:
          type: string
          title: Name
      type: object
      required:
        - arguments
        - name
      title: Function
      description: The function that the model called.
  securitySchemes:
    HTTPBearer:
      type: http
      scheme: bearer

````
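
The `AgentSkill` schema above describes inline skills as a base64-encoded zip bundle that must contain exactly one `SKILL.md` file. A minimal sketch of building such an entry with the Python standard library follows. The helper name `make_inline_skill` is hypothetical, and because this page does not specify the expected layout of `SKILL.md`, its body here is a placeholder.

```python
import base64
import io
import zipfile


def make_inline_skill(name: str, description: str, skill_md: str) -> dict:
    """Package SKILL.md text into an inline AgentSkill entry (hypothetical helper)."""
    buffer = io.BytesIO()
    # Write a zip archive in memory containing a single SKILL.md file.
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("SKILL.md", skill_md)
    data = base64.b64encode(buffer.getvalue()).decode("ascii")
    return {
        "type": "inline",
        "name": name,
        "description": description,
        "source": {"type": "base64", "media_type": "application/zip", "data": data},
    }


skill = make_inline_skill(
    "csv-insights",
    "Summarize CSV files.",
    "# csv-insights\n\nPlaceholder instructions.\n",  # placeholder content (assumption)
)

# Round-trip check: decode the bundle and list the files it contains.
with zipfile.ZipFile(io.BytesIO(base64.b64decode(skill["source"]["data"]))) as archive:
    print(archive.namelist())  # ['SKILL.md']
```

The returned dictionary goes into the request's `skills` array, for example `"skills": [skill]`.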