Rate limiting
Per-org request budgets enforced at the proxy edge.
The proxy enforces organization-level requests-per-minute (RPM) limits using a Redis sliding window (ADR-0015). Limits apply after authentication and agent verification succeed and before request normalization hands off to a provider adapter.
Rate limiting is a cost control, not a security boundary. Auth and tenant isolation remain fail-closed; the limiter intentionally fails open when Redis is down.
How it works
Implementation lives in packages/ratelimit — services must use the Limiter interface, not ad-hoc Redis calls. Key format: ratelimit:{org_id}:rpm:{unix_minute} with a TTL on first increment.
Configuration
| Parameter | Type | Description |
|---|---|---|
IBEX_RATE_LIMIT_DEFAULT_RPM | integer | Default requests per minute for all orgs without an override. Default: 60 |
IBEX_RATE_LIMIT_ORG_OVERRIDES | string | Comma-separated org_uuid=rpm pairs. Example: 550e8400-e29b-41d4-a716-446655440000=120 |
REDIS_URLRequired | string (URL) | Redis connection used by the limiter and /ready probe. |
Example overrides:
IBEX_RATE_LIMIT_DEFAULT_RPM=60
IBEX_RATE_LIMIT_ORG_OVERRIDES=00000000-0000-0000-0000-000000000001=120,00000000-0000-0000-0000-000000000002=30Org IDs in overrides must be valid UUIDs. The limiter resolves the org from the authenticated token, never from the request body.
Response headers
When rate limiting is active and Redis is healthy, protected responses include:
| Header | Meaning |
|---|---|
X-RateLimit-Limit | Configured RPM for this org |
X-RateLimit-Remaining | Requests left in the current window |
X-RateLimit-Reset | Unix timestamp when the window resets |
On 429, the proxy also sets Retry-After (seconds) so clients can back off without guessing.
Rate-limited error envelope
{
"error": {
"code": "RATE_LIMITED",
"message": "Rate limit exceeded",
"request_id": "0192a3b4-c5d6-7890-abcd-ef1234567890",
"timestamp": "2026-06-14T12:00:00Z"
}
}The request_id matches X-Request-ID on the response and is propagated to auth gRPC metadata for log correlation (ADR-0017).
Routes subject to limiting
All protected /v1/* routes share the same org-level bucket in Phase 1:
GET /v1/internal/auth-probeGET /v1/orgs/{org_id}/auth-probePOST /v1/chat/completions
There is no per-agent or per-IP tier yet. Hierarchical limits (agent → org → global) are planned for Phase 2+.
Verify the limiter
Start test stack
make compose-test-up — Postgres on 5433; Redis from compose.
Run security tests
go test -tags=integration -run TestSecurity_SEC4 ./services/proxy/...
Optional smoke burst
make dev-smoke may WARN if 429 is not observed in 65 rapid probe requests — acceptable on fast hardware.
CI runs the full security-integration job including Redis fail-open cases. See Testing strategy and current state.
Operational guidance
- Staging: Keep
IBEX_RATE_LIMIT_DEFAULT_RPMlow enough to catch runaway integration tests. - Production: Set per-org overrides for high-traffic tenants; never embed org IDs in metric labels (cardinality explosion).
- Security: Pair rate limits with auth — unauthenticated traffic should not reach the limiter on protected routes.
Related
- Configuration — full env var reference
- Authentication — limiter runs after auth succeeds
- Tenant isolation — Redis key namespacing by org_id
- ADR-0015 — design rationale and race window
Was this page helpful?
Last updated on