CanalAPI
Developer Guide

Rate Limits

Understand and handle CanalAPI rate limits.

CanalAPI applies rate limits to protect platform stability and to fairly allocate capacity across users.

How limits are applied

Limits may be evaluated per API key, per account, or per model. The exact limits for your account are shown in the console. Check there for current values, since limits can change.

Typical dimensions include:

  • Requests per minute (RPM) — number of requests in a sliding window.
  • Tokens per minute (TPM) — total input + output tokens in a sliding window.
  • Concurrent requests — number of in-flight requests.
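
To make the sliding-window dimensions concrete, here is a minimal client-side sketch that tracks usage over the trailing 60 seconds. It is an approximation for local throttling only: how CanalAPI evaluates its windows server-side is not specified here, and the class name, method names, and the 60-second default are assumptions for illustration.

```python
import time
from collections import deque


class SlidingWindowCounter:
    """Tracks usage (requests or tokens) over the trailing `window_seconds`.

    Hypothetical client-side helper; not part of any CanalAPI SDK.
    """

    def __init__(self, limit, window_seconds=60.0, clock=time.monotonic):
        self.limit = limit
        self.window_seconds = window_seconds
        self._clock = clock
        self._events = deque()  # (timestamp, amount) pairs, oldest first
        self._total = 0

    def _evict(self, now):
        # Drop events that have aged out of the window.
        while self._events and now - self._events[0][0] >= self.window_seconds:
            _, amount = self._events.popleft()
            self._total -= amount

    def try_acquire(self, amount=1):
        """Record `amount` units if they fit in the window; return True on success."""
        now = self._clock()
        self._evict(now)
        if self._total + amount > self.limit:
            return False
        self._events.append((now, amount))
        self._total += amount
        return True
```

One counter with `amount=1` per call approximates RPM; a second counter fed with estimated token counts approximates TPM. Use the limits shown in the console as the `limit` values.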

Detecting a rate-limit response

When a limit is exceeded, CanalAPI returns:

  • HTTP 429 Too Many Requests
  • An error.code such as rate_limit_exceeded
  • A Retry-After header (seconds) when the wait time can be determined

Example response:

HTTP/1.1 429 Too Many Requests
Content-Type: application/json
Retry-After: 5

{
  "error": {
    "message": "Rate limit exceeded.",
    "type": "rate_limit_error",
    "code": "rate_limit_exceeded"
  }
}
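
A client can recognize this response from the status code, the Retry-After header, and the error body. The sketch below assumes the response shape shown above; the `RateLimitInfo` type and `parse_rate_limit` function are hypothetical names, not part of a CanalAPI SDK. It takes the raw status, headers, and body so it can be used with any HTTP library.

```python
import json
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class RateLimitInfo:
    code: str
    retry_after: Optional[float]  # seconds, or None if the header is absent


def parse_rate_limit(
    status: int, headers: Mapping[str, str], body: str
) -> Optional[RateLimitInfo]:
    """Return RateLimitInfo for a 429 response, else None."""
    if status != 429:
        return None

    # Header names are case-insensitive, so normalize before lookup.
    lowered = {k.lower(): v for k, v in headers.items()}
    retry_after = None
    raw = lowered.get("retry-after")
    if raw is not None:
        try:
            retry_after = max(0.0, float(raw))
        except ValueError:
            retry_after = None  # unparseable value: caller falls back to backoff

    # The body may be missing or malformed; treat the code as best-effort.
    code = "unknown"
    try:
        code = json.loads(body)["error"]["code"]
    except (ValueError, KeyError, TypeError):
        pass
    return RateLimitInfo(code=code, retry_after=retry_after)
```

Returning `None` for `retry_after` when the header is absent or unparseable lets the caller decide how long to wait.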

Handling a rate-limit response

  1. Honor Retry-After when present. Sleep for at least that many seconds before the next attempt.
  2. Use exponential backoff with jitter when Retry-After is not present. For example: min(2^attempt, 30) * (0.75 + random() * 0.5) seconds.
  3. Cap retries (commonly 5). Beyond that, surface the error to the caller.
  4. Watch for client-side bursts. If you fan out N concurrent requests when a user clicks once, throttle on the client.
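
Steps 1 to 3 can be combined into one retry helper. This is a minimal sketch, not a reference implementation: `RateLimited` is a hypothetical exception that your own request function would raise on HTTP 429, and `attempt` is assumed to start at 0 with `random()` uniform in [0, 1).

```python
import random
import time


class RateLimited(Exception):
    """Hypothetical exception raised by the caller's request function on HTTP 429."""

    def __init__(self, retry_after=None):
        super().__init__("rate limited")
        self.retry_after = retry_after  # seconds from Retry-After, or None


def backoff_delay(attempt, rng=random.random):
    # min(2^attempt, 30) seconds, scaled by a jitter factor in [0.75, 1.25).
    return min(2 ** attempt, 30) * (0.75 + rng() * 0.5)


def call_with_retries(send, max_retries=5, sleep=time.sleep, rng=random.random):
    """Call send() until it succeeds or max_retries retries have been used."""
    for attempt in range(max_retries + 1):
        try:
            return send()
        except RateLimited as exc:
            if attempt == max_retries:
                raise  # retries exhausted: surface the error to the caller
            if exc.retry_after is not None:
                delay = exc.retry_after  # honor the server's hint
            else:
                delay = backoff_delay(attempt, rng)
            sleep(delay)
```

The `sleep` and `rng` parameters exist so the helper can be tested without real delays. The jitter spreads retries from many clients so they do not all retry at the same moment.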

Capacity tips

  • Use streaming for long completions to lower wall-clock time per request, so retries cost less.
  • Batch independent prompts where possible.
  • For ingest-heavy workloads, prefer smaller, faster models when accuracy permits.

Quotas

Soft and hard quotas are separate from instantaneous rate limits. They are configured on the Billing page and prevent runaway spend.

  • A soft quota triggers an alert.
  • A hard quota, once reached, blocks new requests for that key or account.

Set both to give yourself a chance to react before being cut off.
