Claude Code Desktop Custom API Endpoint Setup Guide (2026)
Claude Code Desktop is moving from a niche terminal tool into the daily workflow of serious developers. The desktop app makes it easier to keep long coding sessions alive, review changes, and jump between projects without babysitting a shell. But there is one setup detail worth getting right early: your API endpoint.
If you only use the default Anthropic endpoint, life is simple until it isn't. You hit rate limits, your regional billing fails, or one model gets expensive for routine edits. A custom endpoint gives you a cleaner escape hatch. You can route Claude Code Desktop through a gateway, use different models for different jobs, and keep your coding agent budget from wandering off a cliff.
This guide assumes you already know what Claude Code does and want a practical setup that survives real work.
When a custom endpoint makes sense
Don't add extra infrastructure just because it sounds clever. For a solo developer doing a few prompts per day, the official endpoint is fine. A custom endpoint starts paying for itself when you care about at least one of these:
- Cost control. Coding agents burn tokens fast because they read files, plan edits, run tests, and retry.
- Model flexibility. You may want Opus for architecture, Sonnet for edits, and a cheaper model for summarizing logs.
- Regional access. Some teams can't easily bill or access every official provider directly.
- Fallbacks. If one model is rate limited, your whole workflow shouldn't stop.
- Auditability. A gateway lets you see per-key spend, failed requests, and model usage in one place.
This is where an OpenAI- or Anthropic-compatible gateway such as KissAPI fits naturally. You keep the client workflow mostly the same, but point it at a base URL you control.
The mental model: base URL, API key, model name
Most setup bugs come from mixing three things together:
| Setting | What it means | Common mistake |
|---|---|---|
| Base URL | The API server Claude Code calls | Using /v1 twice or missing it entirely |
| API key | Your credential for that endpoint | Using an Anthropic key against a gateway URL |
| Model name | The model ID understood by the endpoint | Sending a friendly display name instead of the exact ID |
Write those three values down before touching the app. If your endpoint is Anthropic-compatible, you normally configure an Anthropic-style base URL and key. If it is OpenAI-compatible, you may need a compatibility layer or tool setting that supports OpenAI-style chat completions. The boring detail matters.
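As a minimal sketch of that habit, the three values can live in one small object that flags the common mistakes from the table before any request is sent. The class name `EndpointSettings`, the `problems` method, and the "a space means a display name" heuristic are illustrative assumptions, not part of Claude Code Desktop or any gateway.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class EndpointSettings:
    """The three values to write down before touching the app (hypothetical helper)."""
    base_url: str  # API server the client calls
    api_key: str   # credential issued by the service that owns base_url
    model: str     # exact model ID, not a friendly display name

    def problems(self) -> list[str]:
        """Return readable problems; an empty list means nothing obvious is wrong."""
        found = []
        parsed = urlparse(self.base_url)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            found.append("base_url is not an absolute http(s) URL")
        if "/v1/v1" in parsed.path:
            found.append("base_url contains /v1 twice")
        if not self.api_key.strip():
            found.append("api_key is empty")
        # Assumption: real model IDs contain no spaces, display names usually do.
        if not self.model or " " in self.model:
            found.append("model looks like a display name, not an ID")
        return found


settings = EndpointSettings("https://api.kissapi.ai/v1", "your-api-key", "claude-sonnet-4-6")
print(settings.problems() or "no obvious problems")
```

A check like this only catches shape errors. Whether the key actually matches the endpoint still has to be confirmed with a real request.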
Step 1: Test the endpoint before opening Claude Code Desktop
Do not debug inside the desktop app first. Use curl. It tells you directly whether the endpoint responds, the key is accepted, and the model name exists.
```shell
export API_KEY="your-api-key"
export BASE_URL="https://api.kissapi.ai/v1"

curl "$BASE_URL/models" \
  -H "Authorization: Bearer $API_KEY"
```
If your gateway exposes an OpenAI-compatible chat endpoint, test a tiny completion:
```shell
curl "$BASE_URL/chat/completions" \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-6",
    "messages": [
      {"role": "user", "content": "Reply with exactly: endpoint ok"}
    ],
    "max_tokens": 20
  }'
```
For Anthropic-style endpoints, the equivalent looks like this:
```shell
# Reuses BASE_URL from above, which already ends in /v1
curl "$BASE_URL/messages" \
  -H "x-api-key: $API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-6",
    "max_tokens": 20,
    "messages": [
      {"role": "user", "content": "Reply with exactly: endpoint ok"}
    ]
  }'
```
If this fails, fix it here. A desktop UI will only hide the useful error under a nicer wrapper.
Step 2: Configure Claude Code Desktop
The exact UI changes over time, but the working pattern is the same. Open settings, find the API or provider section, and set:
- API key: the key from your provider or gateway dashboard
- Base URL: your endpoint, for example https://api.kissapi.ai or https://api.kissapi.ai/v1, depending on the client field
- Default model: a real model ID such as claude-sonnet-4-6
Watch the /v1 suffix. Some apps ask for the root URL and append /v1/messages internally. Others ask for the full API base. If your first request shows a path like /v1/v1/messages, you found the problem.
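To make the doubled-suffix failure concrete, here is a small sketch of a join that produces the version segment exactly once, whichever form of base URL it is given. The function `join_endpoint` is hypothetical. It illustrates the URL shapes to expect and does not describe how Claude Code Desktop builds its requests.

```python
def join_endpoint(base_url: str, path: str, version: str = "v1") -> str:
    """Join base_url and path so the version segment appears exactly once."""
    base = base_url.rstrip("/")
    segments = [s for s in path.split("/") if s]
    # Drop a leading version segment from the path; it is re-added via the base.
    if segments and segments[0] == version:
        segments = segments[1:]
    # Add the version segment only if the base does not already end with it.
    if not base.endswith("/" + version):
        base = base + "/" + version
    return "/".join([base] + segments)


# Both forms of base URL resolve to the same request URL.
print(join_endpoint("https://api.kissapi.ai", "/v1/messages"))
print(join_endpoint("https://api.kissapi.ai/v1", "/v1/messages"))
```

If the URL in your error log does not match what a join like this would produce, the base URL field probably expects the other form.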
Step 3: Use the right model for the job
My strong opinion: don't run every coding-agent task on the most expensive model. It feels safe, but it is lazy routing. Use a small routing policy:
| Task | Recommended model tier | Why |
|---|---|---|
| Repo scan, file summaries, log cleanup | Cheap / fast model | Low reasoning load, lots of tokens |
| Bug fix in known files | Sonnet-class model | Best daily balance |
| Cross-file refactor or migration | Opus-class model | Planning errors are expensive |
| PR review | Sonnet or Opus depending on risk | Security-sensitive code deserves better review |
If your gateway supports aliases, create names like coding-fast, coding-default, and coding-deep. Then your tool configuration stays stable while you change the backend routing later.
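The routing table and the alias idea can be sketched as two small lookups. The alias names come from the paragraph above. The task keys, the `pick_model` function, and the placeholder backend IDs are assumptions for illustration, so substitute the exact IDs your endpoint returns from /models.

```python
# Alias -> backend model ID. Only this table changes when routing changes.
# The two "your-..." values are placeholders, not real model IDs.
ALIASES = {
    "coding-fast": "your-cheap-model-id",
    "coding-default": "claude-sonnet-4-6",
    "coding-deep": "your-opus-class-model-id",
}

# Task -> alias, following the routing table above (task keys are hypothetical).
TASK_TO_ALIAS = {
    "repo_scan": "coding-fast",
    "file_summary": "coding-fast",
    "log_cleanup": "coding-fast",
    "bug_fix": "coding-default",
    "refactor": "coding-deep",
    "migration": "coding-deep",
}


def pick_model(task: str, high_risk: bool = False) -> str:
    """Return the backend model ID for a task; unknown tasks use the default tier."""
    if task == "pr_review":
        # PR review depends on risk: security-sensitive code gets the deeper tier.
        alias = "coding-deep" if high_risk else "coding-default"
    else:
        alias = TASK_TO_ALIAS.get(task, "coding-default")
    return ALIASES[alias]


print(pick_model("bug_fix"))
print(pick_model("pr_review", high_risk=True))
```

Keeping the task-to-alias map separate from the alias-to-model map is the point: tools reference aliases, and a backend swap touches one dictionary.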
Python and Node.js sanity tests
These tests are useful when a teammate says “Claude Code is broken” but the real issue is their key, proxy, or model name.
```python
import os

from openai import OpenAI

# Reads the key from the API_KEY environment variable exported in Step 1.
client = OpenAI(
    api_key=os.environ["API_KEY"],
    base_url="https://api.kissapi.ai/v1",
)

resp = client.chat.completions.create(
    model="claude-sonnet-4-6",
    messages=[{"role": "user", "content": "Return only: python ok"}],
    max_tokens=20,
)
print(resp.choices[0].message.content)
```

```javascript
// Save as an ES module (for example sanity.mjs): top-level await requires one.
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.API_KEY,
  baseURL: "https://api.kissapi.ai/v1",
});

const resp = await client.chat.completions.create({
  model: "claude-sonnet-4-6",
  messages: [{ role: "user", content: "Return only: node ok" }],
  max_tokens: 20,
});
console.log(resp.choices[0].message.content);
```
Common errors and fixes
401 Unauthorized
Your key doesn't match the endpoint. This often happens when someone pastes an official Anthropic key into a third-party gateway config. Generate a key from the same service that owns the base URL.
404 Not Found
Usually a URL shape problem. Check whether the app appended /v1/messages, /messages, or /chat/completions. Then adjust the base URL.
Model not found
The model ID is wrong or not enabled on your account. Call /models and copy the exact ID. Don't guess.
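One way to stop guessing is to compare the configured name against the parsed /models response and surface near misses. This sketch assumes the OpenAI-style list shape, a top-level `data` array of objects with an `id` field. Your gateway's response may differ, and `check_model_id` is a hypothetical helper.

```python
import difflib


def check_model_id(models_payload: dict, wanted: str) -> tuple[bool, list[str]]:
    """Return (found, suggestions) for a model name against a parsed /models response."""
    # Assumed shape: {"data": [{"id": "..."}, ...]}
    ids = [m["id"] for m in models_payload.get("data", []) if "id" in m]
    if wanted in ids:
        return True, []
    # Offer up to three close matches so typos are easy to spot.
    return False, difflib.get_close_matches(wanted, ids, n=3, cutoff=0.6)


# Example payload, hand-written for illustration rather than captured from a real endpoint.
payload = {"data": [{"id": "claude-sonnet-4-6"}, {"id": "coding-fast"}]}
print(check_model_id(payload, "claude-sonnet-4.6"))
```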
429 Rate limit
Lower concurrency first. Coding agents can accidentally open several parallel tool calls. Add retries with exponential backoff, but don't blindly retry expensive prompts. If the same request fails three times, switch model or pause.
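The retry policy described above can be sketched as a bounded backoff loop that gives up on a model after three rate-limited attempts and moves to the next one. The `RateLimited` exception and the `send` callable are stand-ins for whatever your HTTP client raises and calls. The delays are illustrative, and production code would normally add jitter.

```python
import time


class RateLimited(Exception):
    """Stand-in for the error your client raises on HTTP 429 (hypothetical)."""


def call_with_backoff(send, models, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Try each model in order; retry 429s with exponential backoff, then switch."""
    for model in models:
        for attempt in range(max_attempts):
            try:
                return send(model)
            except RateLimited:
                # Back off between attempts, but not after the final one:
                # at that point the loop switches to the next model instead.
                if attempt < max_attempts - 1:
                    sleep(base_delay * 2 ** attempt)
    raise RuntimeError("all models rate limited; pause before retrying")
```

Passing `sleep` as a parameter keeps the function testable without real waiting. Only rate-limit errors are retried here, so an expensive prompt that fails for another reason surfaces immediately instead of being resent.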
Budget guardrails I would actually use
Set a daily cap for the key used by Claude Code Desktop. Not monthly. Daily. Coding-agent mistakes are bursty: one bad loop can spend a week's budget before you notice. Also split keys by environment. Your desktop app, CI reviewer, and production app should not share the same API key.
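If your provider or gateway has no per-key daily cap, a client-side guard is a rough substitute. The sketch below assumes you can estimate each request's cost in dollars before sending it. `DailyBudget` and `try_spend` are hypothetical names, and the counter lives in memory only, so it resets when the process restarts.

```python
from datetime import date
from typing import Optional


class DailyBudget:
    """Track spend per calendar day and refuse requests that would exceed the cap."""

    def __init__(self, cap_usd: float):
        self.cap_usd = cap_usd
        self._day: Optional[date] = None
        self._spent = 0.0

    def try_spend(self, cost_usd: float, today: Optional[date] = None) -> bool:
        """Record the cost and return True, or return False if it would break the cap."""
        today = today or date.today()
        if today != self._day:
            # New day: reset the counter so one bad loop cannot eat the week.
            self._day, self._spent = today, 0.0
        if self._spent + cost_usd > self.cap_usd:
            return False
        self._spent += cost_usd
        return True


budget = DailyBudget(cap_usd=5.0)
print(budget.try_spend(3.0))  # fits under the cap
print(budget.try_spend(2.5))  # would exceed the cap, so it is refused
```

One instance per key mirrors the advice to split keys by environment: the desktop app, the CI reviewer, and production each get their own budget.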
For teams, add a short system instruction that tells the agent to summarize large files before editing and to avoid sending unrelated directories. It sounds small, but context discipline is the difference between a $40 month and a $400 month.
Use Claude Code with one flexible API key
KissAPI gives you Claude, GPT, Gemini, and other models through a single gateway, with pay-as-you-go credits and OpenAI-compatible tooling.
Final checklist
- Test /models with curl.
- Run one tiny completion outside Claude Code Desktop.
- Configure the app with the matching base URL and key.
- Start with a Sonnet-class model, not the most expensive model.
- Add daily budget caps before doing long refactors.
The custom endpoint setup isn't glamorous. That's the point. Once it's boring, you can focus on the work: cleaner diffs, faster reviews, and fewer surprise bills.