AI API Structured Output Guide 2026: JSON Mode vs Function Calling vs Schema Constraints

Published June 8, 2026 · 11 min read

If you've shipped anything with an LLM behind it, you've hit this wall: you ask for JSON, and roughly nineteen times out of twenty it's perfect. The twentieth time it wraps the object in ```json fences, adds a chatty preamble, drops a required field, or returns a number as a string. Your parser throws, your pipeline stalls, and now you're writing regex to fix a model's grammar. There are better ways, and in 2026 you actually have three of them. The catch is that they're not interchangeable, and picking the wrong one is why a lot of "the model keeps breaking" tickets exist. Let's sort it out.

The Three Ways to Get Structured Data

Every major API exposes some mix of these. The names differ by provider, but the mechanics are the same:

Approach	What it guarantees	Best for
JSON mode	Valid JSON syntax only	Loose payloads, prototyping
Function / tool calling	Args matching a tool schema	Deciding which action to take
Structured outputs (strict schema)	Output matches your exact JSON Schema	Always-return-one-typed-object endpoints

The single biggest misconception: people reach for JSON mode expecting it to enforce their fields. It doesn't. JSON mode promises the response will parse as JSON. That's it. Whether it has the keys you asked for is still up to the model's mood.

JSON Mode: The Floor, Not the Ceiling

JSON mode is the simplest. You flip a flag and the model is constrained to emit syntactically valid JSON. Here's the OpenAI-compatible shape with curl:

curl https://api.kissapi.ai/v1/chat/completions \
  -H "Authorization: Bearer $KISSAPI_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.4",
    "response_format": { "type": "json_object" },
    "messages": [
      { "role": "system", "content": "Return a JSON object with keys: sentiment, confidence." },
      { "role": "user", "content": "This update broke my whole workflow." }
    ]
  }'

One thing the docs bury: with plain json_object mode, you must mention JSON in your prompt, or some providers will reject the request. And you still have to describe the shape you want in the prompt, because the mode won't do it for you. You'll get valid JSON. You might not get confidence as a float, or sentiment from your three allowed values. Validate after parsing, always.
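
A minimal sketch of that post-parse check, using only the standard library. The three sentiment labels and the 0 to 1 confidence range are assumptions for illustration, since the prompt above names the keys but not their allowed values.

```python
import json

# Assumed label set; swap in whatever values your prompt actually allows.
ALLOWED_SENTIMENTS = {"positive", "neutral", "negative"}


def parse_sentiment(raw: str) -> dict:
    """Parse a JSON-mode reply and check the shape JSON mode does not enforce."""
    data = json.loads(raw)  # raises json.JSONDecodeError (a ValueError) on bad syntax
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")

    sentiment = data.get("sentiment")
    if not isinstance(sentiment, str) or sentiment not in ALLOWED_SENTIMENTS:
        raise ValueError(f"unexpected sentiment: {sentiment!r}")

    confidence = data.get("confidence")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(confidence, bool) or not isinstance(confidence, (int, float)):
        raise ValueError(f"confidence is not a number: {confidence!r}")
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence out of range: {confidence!r}")

    return {"sentiment": sentiment, "confidence": float(confidence)}
```

Every failure surfaces as a ValueError, so the caller can catch one exception type and decide whether to retry or fall back.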

Structured Outputs: When You Want Guarantees

This is the one most teams should be using for extraction, classification, and data-shaping endpoints. You hand the API a JSON Schema with strict: true, and the decoder is constrained so the output can only follow that schema. Required fields show up. Enums stay inside their allowed set. Types are honored.

curl https://api.kissapi.ai/v1/chat/completions \
  -H "Authorization: Bearer $KISSAPI_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.4",
    "response_format": {
      "type": "json_schema",
      "json_schema": {
        "name": "ticket_triage",
        "strict": true,
        "schema": {
          "type": "object",
          "properties": {
            "priority": { "type": "string", "enum": ["low", "medium", "high", "urgent"] },
            "category": { "type": "string" },
            "summary":  { "type": "string" }
          },
          "required": ["priority", "category", "summary"],
          "additionalProperties": false
        }
      }
    },
    "messages": [
      { "role": "user", "content": "Prod is down, checkout returns 500 for all users." }
    ]
  }'

Two details that save you grief: set additionalProperties: false so the model can't sneak in extra keys, and list every property in required. Strict mode on most 2026 endpoints expects all properties to be required, so if a field is genuinely optional, model it as a union with null instead of leaving it out.
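
To make the nullable-union idea concrete, here is a hedged sketch. The assignee field is hypothetical and not part of the triage schema above, and strict_schema_problems is an illustrative helper that only inspects the top level of a schema, not nested objects.

```python
def strict_schema_problems(schema: dict) -> list[str]:
    """List the ways a top-level object schema breaks the two rules above."""
    problems = []
    props = schema.get("properties", {})
    missing = sorted(set(props) - set(schema.get("required", [])))
    if missing:
        problems.append(f"not in required: {', '.join(missing)}")
    if schema.get("additionalProperties") is not False:
        problems.append("additionalProperties must be false")
    return problems


# 'assignee' is a hypothetical optional field: still listed in required,
# but allowed to be null when the ticket has no owner yet.
TRIAGE_SCHEMA = {
    "type": "object",
    "properties": {
        "priority": {"type": "string", "enum": ["low", "medium", "high", "urgent"]},
        "category": {"type": "string"},
        "summary": {"type": "string"},
        "assignee": {"type": ["string", "null"]},
    },
    "required": ["priority", "category", "summary", "assignee"],
    "additionalProperties": False,
}

print(strict_schema_problems(TRIAGE_SCHEMA))  # prints []
```

Running a check like this in CI is one way to catch a schema that a strict endpoint would reject before it reaches production.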

The Python version with validation

Even with strict mode, I validate. Responses get cut off when a model hits its token cap, connections drop mid-stream, and defensive parsing costs almost nothing. Here's the pattern I actually ship:

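import json
import os
from typing import Literal

from openai import OpenAI
from pydantic import BaseModel, ValidationError

client = OpenAI(
    api_key=os.environ["KISSAPI_KEY"],
    base_url="https://api.kissapi.ai/v1",
)

class Triage(BaseModel):
    # Literal mirrors the schema's enum, so the validator rejects other values too
    priority: Literal["low", "medium", "high", "urgent"]
    category: str
    summary: str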

SCHEMA = {
    "name": "ticket_triage",
    "strict": True,
    "schema": {
        "type": "object",
        "properties": {
            "priority": {"type": "string", "enum": ["low", "medium", "high", "urgent"]},
            "category": {"type": "string"},
            "summary":  {"type": "string"},
        },
        "required": ["priority", "category", "summary"],
        "additionalProperties": False,
    },
}

def triage(ticket: str) -> Triage:
    resp = client.chat.completions.create(
        model="gpt-5.4",
        max_tokens=400,
        response_format={"type": "json_schema", "json_schema": SCHEMA},
        messages=[{"role": "user", "content": ticket}],
    )
    raw = resp.choices[0].message.content
    try:
        # `raw or ""` turns a missing content field (None) into a JSONDecodeError
        return Triage(**json.loads(raw or ""))
    except (json.JSONDecodeError, ValidationError) as e:
        # truncation or schema drift: log raw and retry or fall back
        raise RuntimeError(f"bad structured output: {e}\nraw={raw!r}") from e

Pydantic here is belt-and-suspenders. Strict mode takes care of the shape whenever the response completes; the validator catches the truncation case where you blew past max_tokens and the JSON came back cut in half. That failure looks like valid intent but invalid syntax, and it's the bug people waste the most time on.
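
Truncation can also be caught before parsing. The sketch below assumes an OpenAI-compatible response in which choices[0].finish_reason is "length" when the token cap was reached; other providers may report this differently. TruncatedOutput and load_structured are illustrative names, not part of any SDK.

```python
import json


class TruncatedOutput(Exception):
    """The model stopped at the token cap before closing the JSON."""


def load_structured(content, finish_reason):
    """Return the parsed object, or raise a specific error for truncation."""
    if finish_reason == "length":
        # Parsing would only fail with a confusing syntax error, so stop here.
        raise TruncatedOutput("hit max_tokens; retry with a higher cap")
    if not content:
        raise ValueError("empty content (possibly a refusal or a tool call)")
    return json.loads(content)


# Typical call site, given a chat completion `resp`:
#   choice = resp.choices[0]
#   obj = load_structured(choice.message.content, choice.finish_reason)
```

A dedicated exception lets the retry logic raise max_tokens on truncation instead of treating it like a malformed reply.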

Function Calling: When the Model Decides

Function calling is not really about JSON formatting, even though it produces JSON. It's about letting the model choose whether to call a tool and which one. Use it when your agent might call search_orders, issue_refund, or just reply in plain text depending on the input.

import os
from openai import OpenAI

client = OpenAI(api_key=os.environ["KISSAPI_KEY"], base_url="https://api.kissapi.ai/v1")

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

resp = client.chat.completions.create(
    model="claude-sonnet-4-6",
    tools=tools,
    tool_choice="auto",
    messages=[{"role": "user", "content": "Do I need an umbrella in Shanghai today?"}],
)

msg = resp.choices[0].message
if msg.tool_calls:
    for call in msg.tool_calls:
        print(call.function.name, call.function.arguments)  # JSON string
else:
    print(msg.content)

If you find yourself defining a single tool just to force a JSON shape, and tool_choice is always that one tool, stop. That's structured outputs wearing a costume. Switch to response_format with a schema. It's clearer, and you skip the empty-content edge cases that tool calls introduce.
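
When the model genuinely does choose between tools, the receiving side is a small dispatch table. This is a sketch under assumptions: the handler bodies are stubs, and the parameter names (customer_id, order_id, amount) are hypothetical, chosen only to illustrate the search_orders and issue_refund tools mentioned earlier.

```python
import json


def search_orders(customer_id: str) -> str:
    # Hypothetical stub; a real handler would query your order store.
    return f"orders for {customer_id}"


def issue_refund(order_id: str, amount: float) -> str:
    # Hypothetical stub; a real handler would call your payments system.
    return f"refunded {amount:.2f} on {order_id}"


HANDLERS = {"search_orders": search_orders, "issue_refund": issue_refund}


def dispatch(name: str, arguments: str) -> str:
    """Run the handler for one tool call; arguments is the raw JSON string."""
    handler = HANDLERS.get(name)
    if handler is None:
        raise KeyError(f"model asked for unknown tool: {name}")
    args = json.loads(arguments)
    if not isinstance(args, dict):
        raise ValueError("tool arguments must be a JSON object")
    return handler(**args)


# Typical call site, given one entry `call` from message.tool_calls:
#   result = dispatch(call.function.name, call.function.arguments)
```

Looking the name up in a table, rather than calling whatever the model names, keeps an unexpected tool name from reaching arbitrary code.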

Node.js: Forcing a Specific Tool

Sometimes you genuinely want function calling but need to force one tool every time, for example to always extract entities into a known shape:

import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.KISSAPI_KEY,
  baseURL: "https://api.kissapi.ai/v1",
});

const resp = await client.chat.completions.create({
  model: "gpt-5.4",
  tools: [{
    type: "function",
    function: {
      name: "extract_entities",
      parameters: {
        type: "object",
        properties: {
          people: { type: "array", items: { type: "string" } },
          orgs:   { type: "array", items: { type: "string" } },
        },
        required: ["people", "orgs"],
      },
    },
  }],
  tool_choice: { type: "function", function: { name: "extract_entities" } },
  messages: [{ role: "user", content: "Tim Cook met Satya Nadella at the Apple campus." }],
});

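// Guard against a reply with no tool call before parsing its arguments
const call = resp.choices[0].message.tool_calls?.[0];
if (!call) throw new Error("model returned no tool call");
const args = JSON.parse(call.function.arguments);
console.log(args.people, args.orgs);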

Which One Should You Actually Pick?

Quick rule: always returns one typed object? Use structured outputs. Model picks an action? Use function calling. Just prototyping and don't care about exact keys? JSON mode is fine. When in doubt, default to structured outputs with strict mode and a Pydantic/Zod validator behind it.

Cross-Model Reality Check

Here's the practical wrinkle in 2026: strict schema support is uneven across models. The frontier OpenAI and Google models enforce strict JSON Schema cleanly. Anthropic does structured work primarily through tool use. Several open models support JSON mode but not strict schema, so they'll happily return well-formed-but-wrong objects.
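
One way to cope with that unevenness on the client side is to pick the strongest response_format the target model is known to honor and validate regardless. A minimal sketch: build_response_format and the supports_strict flag are hypothetical, and how you determine the flag for a given model is left to your own capability list.

```python
def build_response_format(schema: dict, name: str, supports_strict: bool) -> dict:
    """Pick the strongest response_format the target model is known to honor."""
    if supports_strict:
        return {
            "type": "json_schema",
            "json_schema": {"name": name, "strict": True, "schema": schema},
        }
    # Fallback: syntax-only guarantee. The prompt must then describe the shape,
    # and the caller must validate the parsed object against the schema.
    return {"type": "json_object"}
```

On the fallback path the schema is no longer enforced at decode time, so the post-parse validator becomes the only line of defense.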

If you route across providers, that inconsistency turns into 2am pages. The fix is to keep one request shape and let a gateway normalize the differences. We built KissAPI to expose a single OpenAI-compatible endpoint across Claude, GPT, Gemini, and open models, so your response_format and tools payloads stay identical no matter which model answers. You change the model string, not your parsing code.

Before you commit to a model for a structured-output workload, it's worth pricing the request shape too, since strict schemas and tool definitions add input tokens on every call. The token counter and API cost calculator make that math quick.

One Endpoint, Every Model, Same JSON Schema

Create a free account at api.kissapi.ai/register and ship structured outputs across Claude, GPT, and Gemini without rewriting your parsing layer.


Frequently Asked Questions

What is the difference between JSON mode and structured outputs?

JSON mode only guarantees the response is syntactically valid JSON, not that it matches your schema. Structured outputs (strict schema mode) constrain decoding to your exact JSON Schema, so required fields, types, and enums are enforced at generation time. If you need specific keys and types, use strict structured outputs rather than plain JSON mode.

Should I use function calling or structured outputs for getting JSON?

Use function calling when the model needs to decide whether and which tool to invoke. Use structured outputs (response_format with a json_schema) when you always want a single typed object back. For a pure extraction or classification endpoint that returns one object every time, structured outputs are simpler and more reliable than wrapping everything in a tool call.

Why does my LLM still return invalid JSON sometimes?

The usual culprits are not enabling strict schema mode, hitting the max token limit mid-object so the JSON is truncated, or asking for markdown-fenced output that wraps the JSON in backticks. Enable strict structured outputs, set a generous max_tokens, and validate the parsed object against your schema before trusting it.