Codex CLI Custom API Key Setup Guide (2026): Use Any OpenAI-Compatible Endpoint
Codex CLI is getting popular for one simple reason: it feels fast. You stay in the terminal, point it at a repo, and ask for real work. No browser tab circus. No IDE bloat. But the default setup pushes you toward OpenAI-first auth, and that’s not always what you want.
Maybe you want one billing layer for every coding tool. Maybe you need a provider that works better in your region. Maybe you just don’t want Claude Code, OpenCode, Aider, and Codex all wired to different keys and dashboards. Fair enough.
This guide shows how to run Codex CLI against a custom OpenAI-compatible endpoint in 2026, what to put in ~/.codex/config.toml, when plain environment variables are enough, and the three mistakes that waste the most time.
| Setup style | Best for | Main knobs |
|---|---|---|
| Env vars only | Fastest first test | OPENAI_API_KEY, OPENAI_BASE_URL |
| config.toml provider | Daily usage | model_provider, base_url, env_key, wire_api |
| Profiles | Switching between providers | --profile plus provider blocks |
What Codex CLI Actually Supports
Recent Codex docs make this clearer than older blog posts do: Codex can talk to providers that support the Responses API or the older Chat Completions API. Responses is the better target. Chat still works with some providers, but Codex has already marked that path as legacy.
That means your provider needs to do more than return text. For real Codex usage, it should handle the tool-calling flow cleanly, stream reliably, and accept model IDs exactly the way the endpoint expects.
Short version: if your provider is OpenAI-compatible and supports the Responses API well, Codex CLI setup is easy. If it only sort of looks compatible, you’ll get weird failures that look like auth bugs but aren’t.
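If you're not sure which wire API your endpoint actually speaks, you can probe it before touching Codex. Here's a minimal Python sketch (the base URL, API key, and model ID are placeholders; swap in your own) that builds the two paths Codex can target and POSTs a tiny payload to each. A 404 on one path and a 200 on the other tells you which `wire_api` value to configure later.

```python
import json
import urllib.error
import urllib.request

def endpoint_urls(base_url):
    """Return the two URLs Codex can target, given a base URL like
    https://your-endpoint.example/v1 (trailing slash tolerated)."""
    base = base_url.rstrip("/")
    return {
        "responses": f"{base}/responses",
        "chat": f"{base}/chat/completions",
    }

def probe(url, api_key, payload):
    """POST a minimal payload and return the HTTP status code.
    A 404 usually means that wire API is not implemented at this path."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    try:
        with urllib.request.urlopen(req, timeout=15) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code
    except urllib.error.URLError as err:
        return f"unreachable: {err.reason}"

# Example usage (requires network access to your real endpoint):
#   urls = endpoint_urls("https://your-endpoint.example/v1")
#   probe(urls["responses"], "your-api-key",
#         {"model": "gpt-5.4", "input": "ping"})
#   probe(urls["chat"], "your-api-key",
#         {"model": "gpt-5.4",
#          "messages": [{"role": "user", "content": "ping"}]})
```

A 401 on both paths is also informative: the paths exist, but your key is wrong.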
Option 1: Fastest Setup with Environment Variables
If your endpoint is close to OpenAI’s default API shape, start here. Export a custom base URL and API key, then launch Codex normally.
```bash
export OPENAI_API_KEY="your-api-key"
export OPENAI_BASE_URL="https://your-endpoint.example/v1"
codex
```
That’s the quickest path. It’s also the one most people should try first.
If you want to verify the endpoint before involving Codex, test it with curl. Do this first. It saves a lot of blind debugging.
```bash
curl https://your-endpoint.example/v1/responses \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.4",
    "input": "Say hello from a custom endpoint."
  }'
```
If that returns a sane response, your API key, base URL, and model naming are probably fine.
Option 2: Proper Multi-Provider Setup with config.toml
If you switch between providers, or you want cleaner profile-based control, use ~/.codex/config.toml. This is the setup I’d recommend for anyone using Codex seriously.
Codex supports a model_provider key and a model_providers map. Each provider can define its own base URL, auth env var, retry behavior, and wire protocol.
Here’s a clean example for a custom endpoint:
```toml
model = "gpt-5.4"
model_provider = "customapi"

[model_providers.customapi]
name = "Custom API"
base_url = "https://your-endpoint.example/v1"
env_key = "CUSTOM_API_KEY"
wire_api = "responses"
```
Then export the key:
```bash
export CUSTOM_API_KEY="your-api-key"
codex
```
This is nicer than overloading OPENAI_API_KEY if you also use direct OpenAI elsewhere. It keeps your shell less confusing.
When to use wire_api = "responses" vs "chat"
- Use `responses` when your provider supports the modern OpenAI Responses API.
- Use `chat` only if the provider explicitly expects Chat Completions and doesn't support Responses cleanly.
My advice: don’t default to chat just because some random setup guide says so. If your provider supports Responses, use it.
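If you do land on a provider that only speaks Chat Completions, the provider block looks the same except for the last line. A sketch, with a hypothetical provider name and URL:

```toml
[model_providers.chatonly]
name = "Chat-only Provider"
base_url = "https://chat-provider.example/v1"
env_key = "CHAT_PROVIDER_API_KEY"
wire_api = "chat"
```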
Testing the Endpoint Outside Codex
Before you blame Codex, test the same endpoint in a tiny script. One in Python, one in Node.js. If these work and Codex doesn’t, the issue is usually model naming or provider config, not your API key.
Python
```python
from openai import OpenAI

client = OpenAI(
    api_key="your-api-key",
    base_url="https://your-endpoint.example/v1"
)

resp = client.responses.create(
    model="gpt-5.4",
    input="Summarize why response streaming matters for coding agents."
)
print(resp.output_text)
```
Node.js
```javascript
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.CUSTOM_API_KEY,
  baseURL: "https://your-endpoint.example/v1"
});

const resp = await client.responses.create({
  model: "gpt-5.4",
  input: "Explain what makes a good coding model endpoint."
});
console.log(resp.output_text);
```
If both scripts succeed, you’ve confirmed the endpoint works independently of Codex CLI.
The 3 Setup Mistakes That Cause Most Failures
1. Wrong model ID
This is the big one. Some providers expose plain IDs like gpt-5.4. Others expect prefixed names like openai/gpt-5.4 or anthropic/claude-sonnet-4.6. Codex won’t guess for you. Use the exact model string your endpoint expects.
2. Using the wrong wire API
If the provider expects Responses and you configure chat, things break in annoying ways. Sometimes you’ll get a direct error. Sometimes Codex will start, then fail during tool use or streaming. Same problem in reverse if the provider only supports Chat Completions.
3. Assuming every OpenAI-compatible API is Codex-compatible
Not true. Basic chat compatibility is easy. Stable coding-agent compatibility is harder. Codex needs solid streaming, tool support, and predictable model behavior under long prompts. A flaky endpoint might look fine in curl and still feel awful in real use.
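Mistake #1 is also the easiest to rule out. Most OpenAI-compatible gateways expose a model listing at GET /v1/models (not all do; treat that as an assumption). This Python sketch, with a placeholder URL and key, pulls the exact ID strings the endpoint accepts so you can copy one verbatim into config.toml:

```python
import json
import urllib.request

def model_ids(payload):
    """Pull the id strings out of an OpenAI-style model list payload."""
    return [m["id"] for m in payload.get("data", [])]

def list_models(base_url, api_key):
    """Fetch the provider's model list from GET {base_url}/models.
    Assumes the provider implements the OpenAI-style /v1/models route."""
    url = base_url.rstrip("/") + "/models"
    req = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {api_key}"}
    )
    with urllib.request.urlopen(req, timeout=15) as resp:
        return model_ids(json.load(resp))

# Example usage (requires network access to your real endpoint):
#   for mid in list_models("https://your-endpoint.example/v1", "your-api-key"):
#       print(mid)  # use one of these strings verbatim as `model`
```

If the ID you configured isn't in that list, fix the model string before debugging anything else.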
A Better Daily Setup: Use Profiles
If you bounce between providers, create profiles. That way you can switch stacks without editing env vars every time.
```toml
model = "gpt-5.4"
model_provider = "openai"

[profiles.customapi]
model = "gpt-5.4"
model_provider = "customapi"

[model_providers.customapi]
name = "Custom API"
base_url = "https://your-endpoint.example/v1"
env_key = "CUSTOM_API_KEY"
wire_api = "responses"
```
Then run:
```bash
codex --profile customapi
```
That’s cleaner, especially if you want one profile for frontier OpenAI models, one for a cheaper endpoint, and one for local OSS models.
Should You Use a Custom Endpoint for Codex CLI?
Usually, yes.
If you already use an OpenAI-compatible gateway for app traffic, there’s no reason Codex CLI needs to live on a separate island. A good gateway gives you unified billing, easier regional access, and faster switching between models. KissAPI is one option here if you want a single endpoint for GPT, Claude, Gemini, and other models without rewriting your toolchain.
That said, I wouldn’t use a weak aggregator just because it’s cheap. For coding agents, reliability matters more than shaving a few cents off a session. One bad diff costs more than a slightly pricier request.
Want One Endpoint for Codex, Aider, Cursor, and More?
Start free with KissAPI and connect your coding tools through one OpenAI-compatible endpoint.
Final Checklist
- Confirm your endpoint supports the Responses API if possible
- Test the base URL and key with curl before launching Codex
- Use the exact model ID your provider expects
- Use `config.toml` if you want multiple providers or cleaner profiles
- Don't judge compatibility by "hello world" alone: try a real code-edit task
That’s the whole game. Most Codex CLI setup problems are not mysterious. They come from the wrong model string, the wrong wire protocol, or a provider that claims compatibility but only covers the easy part.