Codex CLI Custom API Key Setup Guide (2026): Use Any OpenAI-Compatible Endpoint
Codex CLI is getting popular for one simple reason: it feels fast. You stay in the terminal, point it at a repo, and ask for real work. No browser tab circus. No IDE bloat. But the default setup pushes you toward OpenAI-first auth, and that’s not always what you want.
Maybe you want one billing layer for every coding tool. Maybe you need a provider that works better in your region. Maybe you just don’t want Claude Code, OpenCode, Aider, and Codex all wired to different keys and dashboards. Fair enough.
This guide shows how to run Codex CLI against a custom OpenAI-compatible endpoint in 2026, what to put in ~/.codex/config.toml, when plain environment variables are enough, and the three mistakes that waste the most time.
| Setup style | Best for | Main knobs |
|---|---|---|
| Env vars only | Fastest first test | OPENAI_API_KEY, OPENAI_BASE_URL |
| config.toml provider | Daily usage | model_provider, base_url, env_key, wire_api |
| Profiles | Switching between providers | --profile plus provider blocks |
What Codex CLI Actually Supports
Recent Codex docs make this clearer than older blog posts do: Codex can talk to providers that support the Responses API or the older Chat Completions API. Responses is the better target. Chat still works with some providers, but Codex has already marked that path as legacy.
That means your provider needs to do more than return text. For real Codex usage, it should handle the tool-calling flow cleanly, stream reliably, and accept model IDs exactly the way the endpoint expects.
Short version: if your provider is OpenAI-compatible and supports the Responses API well, Codex CLI setup is easy. If it only sort of looks compatible, you’ll get weird failures that look like auth bugs but aren’t.
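If you're not sure which wire API your endpoint actually speaks, you can probe it before touching Codex. Here's a minimal Python sketch (the base URL, API key, and model ID are placeholders; swap in your own) that builds the two paths Codex can target and POSTs a tiny payload to each. A 404 on one path and a 200 on the other tells you which `wire_api` value to configure later.

```python
import json
import urllib.error
import urllib.request

def endpoint_urls(base_url):
    """Return the two URLs Codex can target, given a base URL like
    https://your-endpoint.example/v1 (trailing slash tolerated)."""
    base = base_url.rstrip("/")
    return {
        "responses": f"{base}/responses",
        "chat": f"{base}/chat/completions",
    }

def probe(url, api_key, payload):
    """POST a minimal payload and return the HTTP status code.
    A 404 usually means that wire API is not implemented at this path."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    try:
        with urllib.request.urlopen(req, timeout=15) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code
    except urllib.error.URLError as err:
        return f"unreachable: {err.reason}"

# Example usage (requires network access to your real endpoint):
#   urls = endpoint_urls("https://your-endpoint.example/v1")
#   probe(urls["responses"], "your-api-key",
#         {"model": "gpt-5.4", "input": "ping"})
#   probe(urls["chat"], "your-api-key",
#         {"model": "gpt-5.4",
#          "messages": [{"role": "user", "content": "ping"}]})
```

A 401 on both paths is also informative: the paths exist, but your key is wrong.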
Option 1: Fastest Setup with Environment Variables
If your endpoint is close to OpenAI’s default API shape, start here. Export a custom base URL and API key, then launch Codex normally.
```bash
export OPENAI_API_KEY="your-api-key"
export OPENAI_BASE_URL="https://your-endpoint.example/v1"
codex
```
That’s the quickest path. It’s also the one most people should try first.
If you want to verify the endpoint before involving Codex, test it with curl. Do this first. It saves a lot of blind debugging.
```bash
curl https://your-endpoint.example/v1/responses \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.4",
    "input": "Say hello from a custom endpoint."
  }'
```
If that returns a sane response, your API key, base URL, and model naming are probably fine.
Option 2: Proper Multi-Provider Setup with config.toml
If you switch between providers, or you want cleaner profile-based control, use ~/.codex/config.toml. This is the setup I’d recommend for anyone using Codex seriously.
Codex supports a model_provider key and a model_providers map. Each provider can define its own base URL, auth env var, retry behavior, and wire protocol.
Here’s a clean example for a custom endpoint:
```toml
model = "gpt-5.4"
model_provider = "customapi"

[model_providers.customapi]
name = "Custom API"
base_url = "https://your-endpoint.example/v1"
env_key = "CUSTOM_API_KEY"
wire_api = "responses"
```
Then export the key:
```bash
export CUSTOM_API_KEY="your-api-key"
codex
```
This is nicer than overloading OPENAI_API_KEY if you also use direct OpenAI elsewhere. It keeps your shell less confusing.
When to use wire_api = "responses" vs "chat"
- Use `responses` when your provider supports the modern OpenAI Responses API.
- Use `chat` only if the provider explicitly expects Chat Completions and doesn't support Responses cleanly.
My advice: don’t default to chat just because some random setup guide says so. If your provider supports Responses, use it.
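If you do land on a provider that only speaks Chat Completions, the provider block looks the same except for the last line. A sketch, with a hypothetical provider name and URL:

```toml
[model_providers.chatonly]
name = "Chat-only Provider"
base_url = "https://chat-provider.example/v1"
env_key = "CHAT_PROVIDER_API_KEY"
wire_api = "chat"
```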
Testing the Endpoint Outside Codex
Before you blame Codex, test the same endpoint in a tiny script. One in Python, one in Node.js. If these work and Codex doesn’t, the issue is usually model naming or provider config, not your API key.
Python
```python
from openai import OpenAI

client = OpenAI(
    api_key="your-api-key",
    base_url="https://your-endpoint.example/v1"
)

resp = client.responses.create(
    model="gpt-5.4",
    input="Summarize why response streaming matters for coding agents."
)
print(resp.output_text)
```
Node.js
```javascript
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.CUSTOM_API_KEY,
  baseURL: "https://your-endpoint.example/v1"
});

const resp = await client.responses.create({
  model: "gpt-5.4",
  input: "Explain what makes a good coding model endpoint."
});
console.log(resp.output_text);
```
If both scripts succeed, you’ve confirmed the endpoint works independently of Codex CLI.
The 3 Setup Mistakes That Cause Most Failures
1. Wrong model ID
This is the big one. Some providers expose plain IDs like gpt-5.4. Others expect prefixed names like openai/gpt-5.4 or anthropic/claude-sonnet-4.6. Codex won’t guess for you. Use the exact model string your endpoint expects.
2. Using the wrong wire API
If the provider expects Responses and you configure chat, things break in annoying ways. Sometimes you’ll get a direct error. Sometimes Codex will start, then fail during tool use or streaming. Same problem in reverse if the provider only supports Chat Completions.
3. Assuming every OpenAI-compatible API is Codex-compatible
Not true. Basic chat compatibility is easy. Stable coding-agent compatibility is harder. Codex needs solid streaming, tool support, and predictable model behavior under long prompts. A flaky endpoint might look fine in curl and still feel awful in real use.
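Mistake #1 is also the easiest to rule out. Most OpenAI-compatible gateways expose a model listing at GET /v1/models (not all do; treat that as an assumption). This Python sketch, with a placeholder URL and key, pulls the exact ID strings the endpoint accepts so you can copy one verbatim into config.toml:

```python
import json
import urllib.request

def model_ids(payload):
    """Pull the id strings out of an OpenAI-style model list payload."""
    return [m["id"] for m in payload.get("data", [])]

def list_models(base_url, api_key):
    """Fetch the provider's model list from GET {base_url}/models.
    Assumes the provider implements the OpenAI-style /v1/models route."""
    url = base_url.rstrip("/") + "/models"
    req = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {api_key}"}
    )
    with urllib.request.urlopen(req, timeout=15) as resp:
        return model_ids(json.load(resp))

# Example usage (requires network access to your real endpoint):
#   for mid in list_models("https://your-endpoint.example/v1", "your-api-key"):
#       print(mid)  # use one of these strings verbatim as `model`
```

If the ID you configured isn't in that list, fix the model string before debugging anything else.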
A Better Daily Setup: Use Profiles
If you bounce between providers, create profiles. That way you can switch stacks without editing env vars every time.
```toml
model = "gpt-5.4"
model_provider = "openai"

[profiles.customapi]
model = "gpt-5.4"
model_provider = "customapi"

[model_providers.customapi]
name = "Custom API"
base_url = "https://your-endpoint.example/v1"
env_key = "CUSTOM_API_KEY"
wire_api = "responses"
```
Then run:
```bash
codex --profile customapi
```
That’s cleaner, especially if you want one profile for frontier OpenAI models, one for a cheaper endpoint, and one for local OSS models.
Should You Use a Custom Endpoint for Codex CLI?
Usually, yes.
If you already use an OpenAI-compatible gateway for app traffic, there’s no reason Codex CLI needs to live on a separate island. A good gateway gives you unified billing, easier regional access, and faster switching between models. KissAPI is one option here if you want a single endpoint for GPT, Claude, Gemini, and other models without rewriting your toolchain.
That said, I wouldn’t use a weak aggregator just because it’s cheap. For coding agents, reliability matters more than shaving a few cents off a session. One bad diff costs more than a slightly pricier request.
Want One Endpoint for Codex, Aider, Cursor, and More?
Start free with KissAPI and connect your coding tools through one OpenAI-compatible endpoint.
Final Checklist
- Confirm your endpoint supports the Responses API if possible
- Test the base URL and key with curl before launching Codex
- Use the exact model ID your provider expects
- Use `config.toml` if you want multiple providers or cleaner profiles
- Don't judge compatibility by "hello world" alone: try a real code-edit task
That’s the whole game. Most Codex CLI setup problems are not mysterious. They come from the wrong model string, the wrong wire protocol, or a provider that claims compatibility but only covers the easy part.