Claude Code Desktop Custom API Endpoint Setup Guide (2026)

Published May 15, 2026 · 9 min read

Claude Code Desktop is moving from a niche terminal tool into the daily workflow of serious developers. The desktop app makes it easier to keep long coding sessions alive, review changes, and jump between projects without babysitting a shell. But there is one setup detail worth getting right early: your API endpoint.

If you only use the default Anthropic endpoint, life is simple until it isn't. You hit rate limits, your regional billing fails, or one model gets expensive for routine edits. A custom endpoint gives you a cleaner escape hatch. You can route Claude Code Desktop through a gateway, use different models for different jobs, and keep your coding agent budget from wandering off a cliff.

This guide assumes you already know what Claude Code does and want a practical setup that survives real work.

When a custom endpoint makes sense

Don't add extra infrastructure just because it sounds clever. For a solo developer doing a few prompts per day, the official endpoint is fine. A custom endpoint starts paying for itself when you care about at least one of these:

Cost control. Coding agents burn tokens fast because they read files, plan edits, run tests, and retry.
Model flexibility. You may want Opus for architecture, Sonnet for edits, and a cheaper model for summarizing logs.
Regional access. Some teams can't easily bill or access every official provider directly.
Fallbacks. If one model is rate limited, your whole workflow shouldn't stop.
Auditability. A gateway lets you see per-key spend, failed requests, and model usage in one place.

This is where an OpenAI- or Anthropic-compatible gateway such as KissAPI fits naturally. You keep the client workflow mostly the same, but point it at a base URL you control.

The mental model: base URL, API key, model name

Most setup bugs come from mixing three things together:

Setting	What it means	Common mistake
Base URL	The API server Claude Code calls	Using `/v1` twice or missing it entirely
API key	Your credential for that endpoint	Using an Anthropic key against a gateway URL
Model name	The model ID understood by the endpoint	Sending a friendly display name instead of the exact ID

Write those three values down before touching the app. If your endpoint is Anthropic-compatible, you normally configure an Anthropic-style base URL and key. If it is OpenAI-compatible, you may need a compatibility layer or tool setting that supports OpenAI-style chat completions. The boring detail matters.
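
One way to "write those three values down" is a small config object that flags the common mistakes from the table before anything is sent. This is only a sketch: the class name `EndpointConfig` and its checks are illustrative assumptions, not part of Claude Code Desktop or any gateway SDK, and the display-name check is a rough heuristic.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class EndpointConfig:
    """Hypothetical holder for the three settings: base URL, API key, model ID."""

    base_url: str
    api_key: str
    model: str

    def problems(self) -> list:
        """Return human-readable problems; an empty list means nothing obvious is wrong."""
        found = []
        parsed = urlparse(self.base_url)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            found.append("base_url must be an absolute http(s) URL")
        if "/v1/v1" in parsed.path:
            found.append("base_url contains /v1 twice")
        if not self.api_key.strip():
            found.append("api_key is empty")
        # Heuristic only: exact model IDs rarely contain spaces, display names usually do.
        if " " in self.model:
            found.append("model looks like a display name, not an exact ID")
        return found


if __name__ == "__main__":
    cfg = EndpointConfig("https://api.kissapi.ai/v1", "your-api-key", "Claude Sonnet 4.6")
    for problem in cfg.problems():
        print("config problem:", problem)
```

The check cannot tell whether a key belongs to the endpoint; only a real request can confirm that.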

Step 1: Test the endpoint before opening Claude Code Desktop

Do not debug inside the desktop app first. Use curl. It gives you a clean answer to three questions: does the endpoint respond, is the key accepted, and does the model name exist?

export API_KEY="your-api-key"
export BASE_URL="https://api.kissapi.ai/v1"

curl "$BASE_URL/models" \
  -H "Authorization: Bearer $API_KEY"

If your gateway exposes an OpenAI-compatible chat endpoint, test a tiny completion:

curl "$BASE_URL/chat/completions" \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-6",
    "messages": [
      {"role": "user", "content": "Reply with exactly: endpoint ok"}
    ],
    "max_tokens": 20
  }'

For Anthropic-style endpoints, the equivalent looks like this:

curl "$BASE_URL/messages" \
  -H "x-api-key: $API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-6",
    "max_tokens": 20,
    "messages": [
      {"role": "user", "content": "Reply with exactly: endpoint ok"}
    ]
  }'

If this fails, fix it here. A desktop UI will only hide the useful error under a nicer wrapper.
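
The two curl probes differ only in path and auth headers, which a small helper can make explicit. The sketch below builds the same requests with the Python standard library; the function name `build_probe` and the `style` parameter are assumptions for illustration, and the header names and version string are copied from the curl examples above.

```python
import json
import urllib.request


def build_probe(base_url: str, api_key: str, model: str, style: str) -> urllib.request.Request:
    """Build (but do not send) a tiny test request in OpenAI or Anthropic style."""
    body = {
        "model": model,
        "max_tokens": 20,
        "messages": [{"role": "user", "content": "Reply with exactly: endpoint ok"}],
    }
    headers = {"Content-Type": "application/json"}
    if style == "openai":
        path = "/chat/completions"
        headers["Authorization"] = f"Bearer {api_key}"
    elif style == "anthropic":
        path = "/messages"
        headers["x-api-key"] = api_key
        headers["anthropic-version"] = "2023-06-01"
    else:
        raise ValueError(f"unknown style: {style!r}")
    return urllib.request.Request(
        base_url.rstrip("/") + path,
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )


if __name__ == "__main__":
    req = build_probe("https://api.kissapi.ai/v1", "your-api-key", "claude-sonnet-4-6", "anthropic")
    print(req.get_method(), req.full_url)
    # To actually send it: urllib.request.urlopen(req, timeout=30)
```

Printing the URL before sending makes a wrong path visible without spending a request.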

Step 2: Configure Claude Code Desktop

The exact UI changes over time, but the working pattern is the same. Open settings, find the API or provider section, and set:

API key: the key from your provider or gateway dashboard
Base URL: your endpoint, for example https://api.kissapi.ai or https://api.kissapi.ai/v1 depending on the client field
Default model: a real model ID such as claude-sonnet-4-6

Watch the /v1 suffix. Some apps ask for the root URL and append /v1/messages internally. Others ask for the full API base. If your first request shows a path like /v1/v1/messages, you found the problem.
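
The doubled-suffix bug is easy to reproduce and to guard against. The following sketch assumes a hypothetical helper, `messages_url`, that models the two client behaviors described above; it is not how Claude Code Desktop itself builds URLs.

```python
def messages_url(base_url: str, client_appends_v1: bool) -> str:
    """Build the final Anthropic-style messages URL without doubling /v1."""
    root = base_url.rstrip("/")
    has_v1 = root.endswith("/v1")
    if client_appends_v1:
        # This kind of client expects the root URL and adds /v1/messages itself.
        if has_v1:
            root = root[: -len("/v1")]
        return root + "/v1/messages"
    # This kind of client expects the full API base and adds only /messages.
    if not has_v1:
        root += "/v1"
    return root + "/messages"


if __name__ == "__main__":
    # Naive concatenation shows the bug: the path contains /v1/v1/messages.
    print("https://api.kissapi.ai/v1" + "/v1/messages")
    # The helper yields the same URL whichever form the user typed.
    print(messages_url("https://api.kissapi.ai/v1", client_appends_v1=True))
    print(messages_url("https://api.kissapi.ai", client_appends_v1=False))
```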

Step 3: Use the right model for the job

My strong opinion: don't run every coding-agent task on the most expensive model. It feels safe, but it is lazy routing. Use a small routing policy:

Task	Recommended model tier	Why
Repo scan, file summaries, log cleanup	Cheap / fast model	Low reasoning load, lots of tokens
Bug fix in known files	Sonnet-class model	Best daily balance
Cross-file refactor or migration	Opus-class model	Planning errors are expensive
PR review	Sonnet or Opus depending on risk	Security-sensitive code deserves better review

If your gateway supports aliases, create names like coding-fast, coding-default, and coding-deep. Then your tool configuration stays stable while you change the backend routing later.
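
If the gateway does not offer aliases, the same policy can live in a few lines of client-side code. In this sketch the task names are invented labels for the rows of the table, and `example-cheap-model` and `example-opus-model` are placeholder IDs, not real model names; replace them with IDs copied from your own /models output.

```python
# Task labels are illustrative; they mirror the rows of the routing table.
TASK_TO_ALIAS = {
    "repo_scan": "coding-fast",
    "file_summary": "coding-fast",
    "log_cleanup": "coding-fast",
    "bug_fix": "coding-default",
    "refactor": "coding-deep",
    "migration": "coding-deep",
}

# Only this mapping changes when the backend routing changes.
ALIAS_TO_MODEL = {
    "coding-fast": "example-cheap-model",  # placeholder ID (assumption)
    "coding-default": "claude-sonnet-4-6",
    "coding-deep": "example-opus-model",  # placeholder ID (assumption)
}


def pick_model(task: str, high_risk: bool = False) -> str:
    """Map a task label to a model ID through a stable alias."""
    if task == "pr_review":
        # PR review uses the deeper tier only for risky, security-sensitive code.
        alias = "coding-deep" if high_risk else "coding-default"
    else:
        # Unknown tasks fall back to the daily-driver tier.
        alias = TASK_TO_ALIAS.get(task, "coding-default")
    return ALIAS_TO_MODEL[alias]


if __name__ == "__main__":
    print(pick_model("log_cleanup"))
    print(pick_model("pr_review", high_risk=True))
```

Keeping tasks and models in separate tables means a pricing change touches one dictionary, not every call site.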

Python and Node.js sanity tests

These tests are useful when a teammate says “Claude Code is broken” but the real issue is their key, proxy, or model name.

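# Python
import os

from openai import OpenAI

# Read the key from the environment instead of hardcoding it in the script.
client = OpenAI(
    api_key=os.environ["API_KEY"],
    base_url="https://api.kissapi.ai/v1",
)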

resp = client.chat.completions.create(
    model="claude-sonnet-4-6",
    messages=[{"role": "user", "content": "Return only: python ok"}],
    max_tokens=20,
)
print(resp.choices[0].message.content)

// Node.js: run as an ES module (for example a .mjs file) so top-level await works
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.API_KEY,
  baseURL: "https://api.kissapi.ai/v1",
});

const resp = await client.chat.completions.create({
  model: "claude-sonnet-4-6",
  messages: [{ role: "user", content: "Return only: node ok" }],
  max_tokens: 20,
});

console.log(resp.choices[0].message.content);

Common errors and fixes

401 Unauthorized

Your key doesn't match the endpoint. This often happens when someone pastes an official Anthropic key into a third-party gateway config. Generate a key from the same service that owns the base URL.

404 Not Found

Usually a URL shape problem. Check whether the app appended /v1/messages, /messages, or /chat/completions. Then adjust the base URL.

Model not found

The model ID is wrong or not enabled on your account. Call /models and copy the exact ID. Don't guess.
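
A lookup helper can turn "don't guess" into a check. This sketch assumes the /models response follows the OpenAI-style shape `{"data": [{"id": ...}]}`; your gateway may differ, so treat the parsing as an assumption. The function name `resolve_model` is illustrative.

```python
import difflib


def resolve_model(requested: str, models_response: dict) -> str:
    """Return the requested ID if the endpoint lists it, else raise with suggestions."""
    # Assumes an OpenAI-style payload: {"data": [{"id": "..."}, ...]}.
    available = [m["id"] for m in models_response.get("data", [])]
    if requested in available:
        return requested
    # Turn a display name like "Claude Sonnet 4.6" into something ID-shaped before matching.
    normalized = requested.lower().replace(" ", "-")
    close = difflib.get_close_matches(normalized, available, n=3, cutoff=0.6)
    hint = f" Did you mean: {', '.join(close)}?" if close else ""
    raise LookupError(f"model {requested!r} is not listed by the endpoint.{hint}")


if __name__ == "__main__":
    sample = {"data": [{"id": "claude-sonnet-4-6"}, {"id": "example-other-model"}]}
    print(resolve_model("claude-sonnet-4-6", sample))
    try:
        resolve_model("Claude Sonnet 4.6", sample)
    except LookupError as exc:
        print(exc)
```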

429 Rate limit

Lower concurrency first. Coding agents can accidentally open several parallel tool calls. Add retries with exponential backoff, but don't blindly retry expensive prompts. If the same request fails three times, switch model or pause.
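
The retry policy above can be sketched as a small wrapper: exponential backoff with jitter, at most three attempts per model, then a switch to the next model. The `RateLimited` exception and the `send` callback are assumptions; in real code `send` would wrap your HTTP client and raise `RateLimited` on a 429.

```python
import random
import time


class RateLimited(Exception):
    """Raised by the caller-supplied send function when the endpoint returns 429."""


def call_with_backoff(send, models, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Try each model in order, retrying rate limits with exponential backoff."""
    for model in models:
        for attempt in range(max_attempts):
            try:
                return send(model)
            except RateLimited:
                if attempt == max_attempts - 1:
                    break  # three failures: stop retrying and switch model
                # Waits of roughly 1s, 2s, 4s; jitter keeps parallel agents out of lockstep.
                sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.25))
    raise RateLimited(f"all models rate limited: {list(models)}")


if __name__ == "__main__":
    def fake_send(model):
        # Stand-in for a real request: the first model is always rate limited.
        if model == "coding-default":
            raise RateLimited()
        return f"answered by {model}"

    print(call_with_backoff(fake_send, ["coding-default", "coding-fast"], base_delay=0.01))
```

Only rate-limit errors are retried here; other failures propagate immediately, which avoids blindly re-sending an expensive prompt that failed for a different reason.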

Budget guardrails I would actually use

Set a daily cap for the key used by Claude Code Desktop. Not monthly. Daily. Coding-agent mistakes are bursty: one bad loop can spend a week's budget before you notice. Also split keys by environment. Your desktop app, CI reviewer, and production app should not share the same API key.
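
Real enforcement belongs in the gateway or provider dashboard, but the logic of a daily cap is simple enough to sketch client-side. The class below is hypothetical: it assumes you can estimate each request's cost in dollars and that you create one instance per key (desktop, CI, production).

```python
from datetime import date
from typing import Optional


class DailyBudget:
    """Hypothetical per-key spend guard that resets when the calendar day changes."""

    def __init__(self, cap_usd: float):
        self.cap_usd = cap_usd
        self._day: Optional[date] = None
        self._spent = 0.0

    def charge(self, cost_usd: float, today: Optional[date] = None) -> None:
        """Record a cost, or raise if it would push today's spend over the cap."""
        today = today or date.today()
        if today != self._day:
            # New day: start counting from zero again.
            self._day, self._spent = today, 0.0
        if self._spent + cost_usd > self.cap_usd:
            raise RuntimeError(
                f"daily cap of ${self.cap_usd:.2f} reached (spent ${self._spent:.2f})"
            )
        self._spent += cost_usd


if __name__ == "__main__":
    desktop = DailyBudget(cap_usd=5.0)  # the cap value is an arbitrary example
    desktop.charge(3.0)
    try:
        desktop.charge(2.5)
    except RuntimeError as exc:
        print("blocked:", exc)
```

A daily reset means a runaway loop costs at most one day's cap instead of the whole month.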

For teams, add a short system instruction that tells the agent to summarize large files before editing and to avoid sending unrelated directories. It sounds small, but context discipline is the difference between a $40 month and a $400 month.

Final checklist

Test /models with curl.
Run one tiny completion outside Claude Code Desktop.
Configure the app with the matching base URL and key.
Start with a Sonnet-class model, not the most expensive model.
Add daily budget caps before doing long refactors.

The custom endpoint setup isn't glamorous. That's the point. Once it's boring, you can focus on the work: cleaner diffs, faster reviews, and fewer surprise bills.