BYOK Guide: Use Your Own API Key with Cursor, Cline & Claude Code (2026)

Every AI coding tool wants you on a subscription. Cursor Pro is $20/month. Windsurf is $15. Claude Code's Max plan runs $100-200. And they all throttle you when you hit their "fast request" limits.

There's a better way. Bring Your Own Key — BYOK — means you plug your own API key into these tools and pay only for what you use. No monthly fees. No throttling. No surprise rate limits at 3 AM when you're in the zone.

This guide covers the exact setup for the three most popular BYOK-compatible coding tools in 2026: Cursor, Cline, and Claude Code. I'll include the real costs, the gotchas nobody mentions, and the configuration that actually works.

Why BYOK Saves You Money (The Math)

Let's do the math on a typical developer month. Say you make 150 coding requests per day, averaging 2K input tokens and 1K output tokens each. That's about 9M input tokens and 4.5M output tokens per month.

With Claude Sonnet 4.6 pricing ($3/$15 per million tokens):

ApproachMonthly CostRate Limits
Cursor Pro subscription$20/mo500 fast requests, then slow
Claude Code Max (5x)$100/moGenerous but capped
BYOK (direct Anthropic)~$94.50API rate limits only
BYOK (API gateway)~$47-70Higher limits, failover

Wait — BYOK with direct Anthropic is more expensive than the subscription? Sometimes, yes. Cursor Pro at $20/month is subsidized. They're losing money on heavy users to build market share. But here's the thing: those 500 fast requests run out in 3-4 days if you're coding seriously. After that, you're waiting 30-60 seconds per response. That's not a tool — that's a patience test.

BYOK through an API gateway hits the sweet spot. You get the full speed of direct API access, often at lower per-token rates, and you're never throttled by an arbitrary request counter.

What You Need Before Starting

For any BYOK setup, you need two things:

  1. An API key from a provider (Anthropic, OpenAI, or an API gateway)
  2. The base URL for that provider's API endpoint

If you're using Anthropic directly, your base URL is https://api.anthropic.com. If you're using an OpenAI-compatible gateway, it'll be something like https://api.yourgateway.com/v1.

The gateway approach has a practical advantage: one API key gives you access to Claude, GPT-5, GPT-5.4, DeepSeek V4, Qwen 3.5, and whatever drops next week. You don't need separate accounts with every provider.

Setup 1: Cursor with Your Own API Key

Cursor is the most popular AI coding IDE right now, and its BYOK support is solid. Here's the setup:

Step 1: Open Cursor Settings

Press Cmd+Shift+J (Mac) or Ctrl+Shift+J (Windows/Linux) to open Settings. Navigate to the "Models" tab.

Step 2: Add Your API Key

Scroll down to "OpenAI API Key" and paste your key. If you're using a custom endpoint (not OpenAI directly), toggle "Override OpenAI Base URL" and enter your gateway URL:

# For an OpenAI-compatible gateway:
Base URL: https://api.kissapi.ai/v1
API Key:  sk-your-api-key-here

Step 3: Select Your Model

Under "Model Names," add the models you want to use. For Claude through an OpenAI-compatible gateway:

claude-sonnet-4-6
claude-opus-4-6
gpt-5.4
deepseek-v4

Step 4: Disable Cursor's Built-in Models

This is the step people miss. If you don't disable Cursor's default models, it'll still route some requests through their servers (and count against your subscription limits). Toggle off the built-in models you don't want.

Gotcha: Cursor's "Tab" autocomplete and some background features still use their own models regardless of your API key settings. BYOK in Cursor covers Chat and Composer — not everything.

Setup 2: Cline with Your Own API Key

Cline (formerly Claude Dev) is the open-source VS Code extension that gives you full agentic coding with your own API key. It's the purest BYOK experience — there's no subscription tier at all. You always bring your own key.

Step 1: Install Cline

Search "Cline" in the VS Code extensions marketplace and install it. It'll appear in your sidebar.

Step 2: Configure the API Provider

Click the Cline icon in the sidebar, then the gear icon. You'll see provider options:

For maximum flexibility, choose "OpenAI Compatible":

# Cline OpenAI-Compatible Settings:
Base URL:  https://api.kissapi.ai/v1
API Key:   sk-your-api-key-here
Model ID:  claude-sonnet-4-6

Step 3: Set a Spending Limit

This is the feature that makes Cline special for BYOK users. You can set a per-task spending limit — say, $2 per task. If Cline's agentic loop burns through that amount, it stops and asks before continuing. No runaway API bills.

# In Cline settings:
Auto-approve limit: $2.00 per task
Max requests per task: 50

Pro tip: Start with Claude Sonnet 4.6 for Cline tasks. Opus is tempting, but Cline's agentic loop makes many API calls per task. At $15/$75 per million tokens (Opus), a complex refactoring task can cost $5-10. Sonnet handles 90% of coding tasks just as well at 1/5 the price.

Setup 3: Claude Code with a Custom API Endpoint

Claude Code is Anthropic's terminal-based coding agent. It's powerful — arguably the best agentic coder available — but it defaults to Anthropic's API. Here's how to point it at your own endpoint.

Option A: Environment Variables

The simplest approach. Add these to your shell profile (~/.zshrc or ~/.bashrc):

# For Anthropic-format endpoints:
export ANTHROPIC_BASE_URL=https://api.kissapi.ai
export ANTHROPIC_API_KEY=sk-your-api-key-here

# Reload your shell:
source ~/.zshrc

Now run claude and it'll use your custom endpoint.

Option B: OpenAI-Compatible Mode

If your gateway uses the OpenAI format instead of Anthropic's native format:

export OPENAI_BASE_URL=https://api.kissapi.ai/v1
export OPENAI_API_KEY=sk-your-api-key-here

Then launch Claude Code with:

claude --model claude-sonnet-4-6

Option C: The Settings File

For a permanent setup, create or edit ~/.claude/settings.json:

{
  "apiBaseUrl": "https://api.kissapi.ai",
  "model": "claude-sonnet-4-6"
}

Gotcha: Claude Code's /cost command tracks spending per session. Keep an eye on it during long agentic runs. A single "refactor this entire module" task can burn through 500K+ tokens if the codebase is large.

The Real Cost Breakdown: BYOK vs. Subscriptions

I tracked my actual API usage across all three tools for a month. Here's what it looked like:

ToolRequests/DayTokens/MonthBYOK CostSubscription Cost
Cursor (Chat + Composer)~80~6M in / 3M out$63$20 (throttled)
Cline (agentic tasks)~20~4M in / 2M out$42N/A (BYOK only)
Claude Code (terminal)~30~5M in / 2M out$45$100 (Max plan)

Total BYOK cost: about $150/month. Total subscription cost for equivalent usage: $120+ (and you'd still hit throttling on Cursor). The BYOK cost is higher in raw dollars, but you get unthrottled access, model flexibility, and no vendor lock-in.

If you're a lighter user — say 50 requests per day total — BYOK drops to $30-50/month. That's where it clearly wins over subscriptions.

Three Ways to Cut Your BYOK Costs Further

1. Route by Complexity

Not every request needs Sonnet. Use a model router pattern: send simple completions to Haiku ($0.80/$4 per million), medium tasks to Sonnet ($3/$15), and only the hard stuff to Opus ($15/$75). This alone can cut costs 40-60%.

2. Use Prompt Caching

If you're working in the same codebase all day, your system prompt and file context barely change between requests. Prompt caching (supported by Anthropic and most gateways) can reduce input token costs by up to 90% on repeated context.

3. Set Hard Limits

Every tool mentioned here supports spending limits. Use them. Set a daily budget of $5-10 and you'll never wake up to a surprise bill. Cline has this built in. For Cursor and Claude Code, track usage through your API dashboard.

Get Your BYOK API Key

KissAPI gives you one API key for Claude, GPT-5, DeepSeek, Qwen, and 200+ models. OpenAI-compatible format. Works with Cursor, Cline, and Claude Code out of the box.

Start Free →

Common Issues and Fixes

"Model not found" error in Cursor

Make sure you've added the exact model ID to Cursor's model list. It's claude-sonnet-4-6, not claude-sonnet-4.6 or claude-4-sonnet. Check your gateway's model list for the exact IDs.

Cline shows $0.00 cost per request

This happens when using OpenAI-compatible mode — Cline can't always parse the token counts from non-Anthropic responses. The requests are still working and costing tokens. Check your gateway dashboard for actual usage.

Claude Code ignores ANTHROPIC_BASE_URL

Make sure you've reloaded your shell after setting the environment variable. Run echo $ANTHROPIC_BASE_URL to verify it's set. If you're using a version manager like nvm, the variable might not propagate to all shell sessions.

Slow responses compared to subscription

This usually means your gateway is routing through a congested region. Try a gateway with multiple provider endpoints and automatic failover. Response times should be under 500ms for the first token.

Which Tool Should You BYOK?

Quick decision framework:

Nothing stops you from using all three. I use Cursor for quick edits, Cline for medium refactoring, and Claude Code for large-scale changes — all pointed at the same API key.