Reference

Claude API Error Codes Reference: Complete Troubleshooting Guide

Every Claude API error code, what causes it, how to handle it, and what your retry logic should look like. The reference every enterprise developer building on Claude needs bookmarked.

The Claude API returns standard HTTP status codes. Some errors are transient and should be retried with backoff. Others indicate a permanent problem in your request that will fail on every retry. Mixing these up is one of the most common, and costly, mistakes in Claude API integration.

This reference covers every error code you'll encounter in production, with the cause, the correct handling strategy, and working Python code for production-grade retry logic. See also the Claude API Enterprise Guide and our breakdown of Claude rate limiting and scaling strategies.

🔑 The Golden Rule of API Error Handling
4xx errors (except 429) are your fault. Fix the request before retrying. 5xx errors are Anthropic's fault. Retry with exponential backoff. 429 is a rate limit: back off and retry. Never retry a 400 or 401; you'll just hit the same wall.
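As a sketch, the rule above collapses into a small dispatch function (the function and strategy names here are illustrative, not part of the SDK):

```python
def retry_strategy(status_code: int) -> str:
    """Map an HTTP status code to a retry strategy per the golden rule."""
    if status_code == 429:
        return "retry-with-backoff"   # rate limit: respect retry-after
    if status_code == 529:
        return "retry-long-backoff"   # platform overloaded, back off longer
    if status_code >= 500:
        return "retry-with-backoff"   # Anthropic's side, typically transient
    if status_code >= 400:
        return "do-not-retry"         # your request is wrong; fix it first
    return "success"
```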

Authentication & Authorisation Errors

401 authentication_error Do Not Retry
Cause

Missing, invalid, or revoked API key. The x-api-key header is absent, malformed, or the key has been deleted from your Anthropic console.

Fix

Verify the API key exists in your Anthropic console. Confirm it's being passed in the correct header. Ensure no whitespace or newline characters are included in the key string. Rotate if compromised.
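A defensive key loader can catch the whitespace and missing-key cases at startup rather than as a 401 in production. A minimal sketch; the `sk-ant-` prefix check is an assumption based on the current key format:

```python
import os

def load_api_key(env=os.environ) -> str:
    """Load and sanity-check the Anthropic API key, failing fast at startup."""
    raw = env.get("ANTHROPIC_API_KEY")
    if raw is None:
        raise RuntimeError("ANTHROPIC_API_KEY is not set")
    key = raw.strip()  # copy-paste often smuggles in a trailing newline
    if not key.startswith("sk-ant-"):  # assumed current key prefix
        raise RuntimeError("ANTHROPIC_API_KEY does not look like an Anthropic key")
    return key
```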

403 permission_error Do Not Retry
Cause

The API key is valid, but doesn't have permission to access the requested resource. Common when using a workspace-scoped key that lacks access to a specific model, or when accessing beta features not enabled for your account tier.

Fix

Check the permissions assigned to your API key in the Anthropic console. For beta features, ensure you've opted in via the correct beta header and that your account tier supports the feature. Contact Anthropic support if permissions appear correct but the error persists.
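Beta opt-ins travel in the anthropic-beta request header, and multiple flags are comma-separated in a single value. A small helper can merge them (a sketch; the flag string in the test is a placeholder, not a real beta name):

```python
def with_beta(headers: dict, beta_flag: str) -> dict:
    """Return a copy of the request headers with a beta opt-in appended."""
    merged = dict(headers)
    existing = merged.get("anthropic-beta")
    # Multiple beta flags share one header value, separated by commas
    merged["anthropic-beta"] = f"{existing},{beta_flag}" if existing else beta_flag
    return merged
```

The Python SDK accepts per-request headers via the extra_headers argument to client.messages.create(...).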

Request & Validation Errors

400 invalid_request_error Do Not Retry
Cause

The request is malformed. Common causes: invalid JSON, missing required fields (model, messages, max_tokens), incorrect message role sequence, invalid tool definition schema, or a parameter value outside the allowed range.

Fix

Read the error message: it almost always tells you exactly which field is wrong. Validate your JSON. Check that messages alternate user/assistant correctly. Confirm max_tokens doesn't exceed the model's limit. Review tool definitions against the tool use schema.
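A cheap pre-flight check catches the most common 400 causes before the request leaves your process. A sketch; the required-field list and role rules mirror the causes listed above:

```python
def preflight(payload: dict) -> list:
    """Return a list of problems that would trigger a 400 invalid_request_error."""
    problems = []
    for field in ("model", "messages", "max_tokens"):
        if field not in payload:
            problems.append(f"missing required field: {field}")
    messages = payload.get("messages") or []
    if messages and messages[0].get("role") != "user":
        problems.append("first message must have role 'user'")
    for i in range(1, len(messages)):
        if messages[i].get("role") == messages[i - 1].get("role"):
            problems.append(f"messages[{i}]: roles must alternate user/assistant")
    return problems
```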

404 not_found_error Do Not Retry
Cause

The requested resource doesn't exist. Most often caused by an invalid model identifier string (e.g., passing claude-sonnet instead of claude-sonnet-4-6). Also occurs with incorrect API endpoint paths.

Fix

Verify the exact model string from Anthropic's model documentation. Model IDs include the full version string. Store model names as constants and validate them at startup, not at runtime.

# Correct model identifiers (March 2026)
MODELS = {
    "opus": "claude-opus-4-6",
    "sonnet": "claude-sonnet-4-6",
    "haiku": "claude-haiku-4-5-20251001"
}

# Validate at startup: fail fast on a missing key or bad model string
import os
import anthropic

client = anthropic.Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])

Rate Limit Errors

429 rate_limit_error Always Retry
Cause

You've exceeded your account's rate limit: either requests per minute (RPM), tokens per minute (TPM), or tokens per day (TPD). The response headers include retry-after indicating when you can try again. Rate limits vary by model and account tier.

Fix

Implement exponential backoff with jitter. Respect the retry-after header. For sustained high-volume workloads, use the Batch API (50% cheaper, no rate limits on throughput). Contact Anthropic to increase your rate limit tier if you consistently hit limits.

import anthropic
import time
import random

def call_claude_with_retry(client, max_retries=5, **kwargs):
    """Production retry logic for Claude API calls."""
    for attempt in range(max_retries):
        try:
            return client.messages.create(**kwargs)
        except anthropic.RateLimitError as e:
            if attempt == max_retries - 1:
                raise
            # Respect the retry-after response header if present
            retry_after = None
            if getattr(e, "response", None) is not None:
                retry_after = e.response.headers.get("retry-after")
            if retry_after:
                wait_time = float(retry_after)
            else:
                # Exponential backoff with jitter
                wait_time = (2 ** attempt) + random.uniform(0, 1)
            print(f"Rate limited. Waiting {wait_time:.1f}s (attempt {attempt+1}/{max_retries})")
            time.sleep(wait_time)
        except anthropic.APIStatusError as e:
            if e.status_code >= 500:
                # Server errors: retry with backoff
                if attempt == max_retries - 1:
                    raise
                wait_time = (2 ** attempt) + random.uniform(0, 1)
                time.sleep(wait_time)
            else:
                # Client errors (400, 401, 403, 404): don't retry
                raise

Server Errors

500 api_error Always Retry
Cause

An internal error on Anthropic's side. Not caused by your request. Typically transient; it often resolves within seconds. Can occur during infrastructure events or rare model processing failures.

Fix

Retry with exponential backoff. After 3 retries, log the error and alert if the pattern persists. Check the Anthropic status page (status.anthropic.com) if you see sustained 500s. Never surface raw 500 errors to end users.

529 overloaded_error Retry with Long Backoff
Cause

Anthropic's API is temporarily overloaded. Not a rate limit on your account; it's a capacity constraint on the platform side. Occurs during peak demand periods or after a major feature launch when usage spikes.

Fix

Use longer backoff intervals than for standard 5xx errors. Consider switching to a less-loaded model tier (e.g., Haiku during Sonnet overload periods) if your use case allows. Queue requests and process them over a longer window rather than hammering the API.
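One way to sketch this combination of long backoff plus model fallback (the fallback order, backoff constants, and function names are illustrative; call is your own request function, expected to raise an exception carrying a status_code attribute):

```python
import random
import time

def call_with_fallback(call, primary="claude-sonnet-4-6",
                       fallback="claude-haiku-4-5-20251001",
                       max_retries=4, sleep=time.sleep):
    """Retry 529s with long backoff, switching to a lighter model halfway through."""
    for attempt in range(max_retries):
        model = primary if attempt < max_retries // 2 else fallback
        try:
            return call(model)
        except Exception as e:
            if getattr(e, "status_code", None) != 529 or attempt == max_retries - 1:
                raise
            # Longer base than the standard 5xx backoff: 10s, 20s, 40s... plus jitter
            sleep(min(10 * (2 ** attempt) + random.uniform(0, 3), 120))
```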

Content & Context Errors

400 / context_length_exceeded Context Length Exceeded Do Not Retry
Cause

The total tokens in your request (system prompt + messages + tools + max_tokens reserved for output) exceed the model's context window. Each model has its own context limit; Opus, Sonnet, and Haiku all currently offer a 200K token context window.

Fix

Implement context window management: truncate or summarise earlier messages, chunk large documents, use RAG to retrieve only relevant passages rather than passing full documents, or use prompt caching for the static portions of your context.
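The simplest of these, history truncation, can be sketched with a character budget. Roughly 4 characters per token is a crude heuristic; in production, use a real token counter (the API offers a token counting endpoint). All names below are illustrative:

```python
def truncate_history(messages, max_chars=400_000, keep_recent=4):
    """Drop the oldest turns until the transcript fits a rough character budget."""
    def size(msgs):
        return sum(len(str(m.get("content", ""))) for m in msgs)
    kept = list(messages)
    while len(kept) > keep_recent and size(kept) > max_chars:
        kept = kept[2:]  # drop the oldest user/assistant pair, preserving alternation
    return kept
```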

stop_reason: "max_tokens" Truncated Output Handle Conditionally
Cause

Not technically an error: Claude's response was cut off because it reached the max_tokens limit. The response is valid but incomplete. Check response.stop_reason on every API call.

Fix

If truncation is unacceptable for your use case, increase max_tokens and/or implement continuation logic: detect stop_reason == "max_tokens", then send a follow-up request asking Claude to continue from where it left off.
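A sketch of that continuation loop, assuming call is your own wrapper that performs the request for a message list and returns an object with .text and .stop_reason. The trailing assistant message acts as a prefill the model continues from:

```python
def complete_with_continuation(call, messages, max_rounds=3):
    """Stitch a full answer together across max_tokens truncations."""
    msgs = list(messages)
    parts = []
    for _ in range(max_rounds):
        resp = call(msgs)
        parts.append(resp.text)
        if resp.stop_reason != "max_tokens":
            break
        # Re-send everything generated so far as an assistant prefill
        msgs = list(messages) + [{"role": "assistant", "content": "".join(parts)}]
    return "".join(parts)
```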

Building Production Claude API Applications?

Our architects have designed Claude API integrations handling millions of requests per month. We handle error handling, retry logic, rate limit management, and cost optimisation.

Book a Free Architecture Review →

Error Monitoring in Production

Logging individual errors isn't enough. Production Claude API deployments need error rate monitoring with alerting thresholds. A sudden spike in 5xx errors may indicate an Anthropic outage. A rising baseline of 400 errors may indicate a schema change breaking your request format. A 401 spike may indicate API key rotation that didn't propagate correctly.

Track these metrics as time-series: 429 error rate (rate limit pressure), 5xx error rate (infrastructure health), 400 error rate by error type (request quality), and average response latency with p95 and p99 percentiles. Export them to your existing observability stack (Datadog, Grafana, CloudWatch) using the same patterns as your other API integrations.

import anthropic
import logging
import time
from dataclasses import dataclass
from typing import Optional

logger = logging.getLogger(__name__)

@dataclass
class APICallResult:
    success: bool
    response: Optional[anthropic.types.Message]
    error_type: Optional[str]
    error_code: Optional[int]
    total_latency_ms: float

def tracked_claude_call(client, **kwargs) -> APICallResult:
    """Wrapper that records metrics for every Claude API call."""
    start = time.monotonic()
    try:
        response = call_claude_with_retry(client, **kwargs)
        latency_ms = (time.monotonic() - start) * 1000
        # Emit success + latency to your metrics system here
        return APICallResult(True, response, None, None, latency_ms)
    except anthropic.APIStatusError as e:
        latency_ms = (time.monotonic() - start) * 1000
        logger.error(f"Claude API error: {e.status_code} {e.message}")
        # Emit failure metrics here; alert if the error rate exceeds your threshold
        return APICallResult(False, None, type(e).__name__, e.status_code, latency_ms)

Quick Reference: Error Code Cheatsheet

Print this and put it next to your screen during integration work.

⚡ Retry Decision Matrix
Always retry: 429 (rate limit), 500 (server error), 529 (overloaded)
Never retry: 400 (bad request), 401 (auth failed), 403 (forbidden), 404 (not found)
Handle conditionally: Truncated responses (stop_reason: max_tokens), network timeouts

Claude Implementation Team

Claude Certified Architects specialising in enterprise AI deployment. About us →

Ship Claude API Integrations That Don't Break in Production

Most Claude API integrations fail in production because error handling, rate limit management, and monitoring weren't built in from day one. We design production-grade integrations that hold up at enterprise scale.