ibexharness
DocsBlogReleasesRoadmap
GitHub
ibexharness

Documentation

OverviewConfigurationAuthenticationRate limitingRequest routingProvider adapters
Proxy›Rate limiting
Proxy

Rate limiting

Per-org request budgets enforced at the proxy edge.

The proxy enforces organization-level requests-per-minute (RPM) limits using a Redis sliding window (ADR-0015). Limits apply after authentication and agent verification succeed and before request normalization hands off to a provider adapter.

Rate limiting is a cost control, not a security boundary. Auth and tenant isolation remain fail-closed; the limiter intentionally fails open when Redis is down.

Fail-open on Redis outage

When Redis is unreachable, rate limiting is skipped and requests proceed. Monitor /ready and Redis metrics in production. Phase 2 may add conservative in-memory fallbacks.

How it works

Mermaid diagram: flowchart TD
+------------------------------+                                         
|                              |                                         
|    Authenticated request     |                                         
|                              |                                         
+------------------------------+                                         
                |                                                        
                |                                                        
                |                                                        
                |                                                        
                v                                                        
<------------------------------>                                         
|                              |                                         
|       Redis reachable?       |-----------------------+                 
|                              |                       |                 
<------------------------------>                      Yes                
                |                                      |                 
               No                                      |                 
                |                                      |                 
                |                                      |                 
                v                                      v                 
+------------------------------+     +----------------------------------+
|                              |     |                                  |
|  Allow request — fail-open   |     | INCR ratelimit:org_id:rpm:minute |
|                              |     |                                  |
+------------------------------+     +----------------------------------+
                                                       |                 
                                                       |                 
                                                       |                 
                                                       |                 
                                                       |                 
<------------------------------>                       |                 
|                              |                       |                 
|         count ≤ RPM?         |<----------------------+                 
|                              |                       |                 
<------------------------------>                      No                 
                |                                      |                 
               Yes                                     |                 
                |                                      |                 
                |                                      |                 
                v                                      v                 
+------------------------------+     +----------------------------------+
|                              |     |                                  |
| Allow + set response headers |     |  429 RATE_LIMITED + Retry-After  |
|                              |     |                                  |
+------------------------------+     +----------------------------------+

Implementation lives in packages/ratelimit — services must use the Limiter interface, not ad-hoc Redis calls. Key format: ratelimit:{org_id}:rpm:{unix_minute} with a TTL on first increment.

Configuration

ParameterTypeDescription
IBEX_RATE_LIMIT_DEFAULT_RPMinteger
Default requests per minute for all orgs without an override.
Default: 60
IBEX_RATE_LIMIT_ORG_OVERRIDESstring
Comma-separated org_uuid=rpm pairs. Example: 550e8400-e29b-41d4-a716-446655440000=120
REDIS_URLRequiredstring (URL)
Redis connection used by the limiter and /ready probe.

Example overrides:

bash
IBEX_RATE_LIMIT_DEFAULT_RPM=60
IBEX_RATE_LIMIT_ORG_OVERRIDES=00000000-0000-0000-0000-000000000001=120,00000000-0000-0000-0000-000000000002=30

Org IDs in overrides must be valid UUIDs. The limiter resolves the org from the authenticated token, never from the request body.

Response headers

When rate limiting is active and Redis is healthy, protected responses include:

HeaderMeaning
X-RateLimit-LimitConfigured RPM for this org
X-RateLimit-RemainingRequests left in the current window
X-RateLimit-ResetUnix timestamp when the window resets

On 429, the proxy also sets Retry-After (seconds) so clients can back off without guessing.

Rate-limited error envelope

JSON
{
  "error": {
    "code": "RATE_LIMITED",
    "message": "Rate limit exceeded",
    "request_id": "0192a3b4-c5d6-7890-abcd-ef1234567890",
    "timestamp": "2026-06-14T12:00:00Z"
  }
}

The request_id matches X-Request-ID on the response and is propagated to auth gRPC metadata for log correlation (ADR-0017).

Routes subject to limiting

All protected /v1/* routes share the same org-level bucket in Phase 1:

  • GET /v1/internal/auth-probe
  • GET /v1/orgs/{org_id}/auth-probe
  • POST /v1/chat/completions

There is no per-agent or per-IP tier yet. Hierarchical limits (agent → org → global) are planned for Phase 2+.

Verify the limiter

1

Start test stack

make compose-test-up — Postgres on 5433; Redis from compose.

2

Run security tests

go test -tags=integration -run TestSecurity_SEC4 ./services/proxy/...

3

Optional smoke burst

make dev-smoke may WARN if 429 is not observed in 65 rapid probe requests — acceptable on fast hardware.

CI runs the full security-integration job including Redis fail-open cases. See Testing strategy and current state.

Operational guidance

Monitor fail-open

Alert on Redis /ready failures and elevated request volume during outages. Fail-open preserves availability but removes cost protection temporarily.

  • Staging: Keep IBEX_RATE_LIMIT_DEFAULT_RPM low enough to catch runaway integration tests.
  • Production: Set per-org overrides for high-traffic tenants; never embed org IDs in metric labels (cardinality explosion).
  • Security: Pair rate limits with auth — unauthenticated traffic should not reach the limiter on protected routes.

Related

  • Configuration — full env var reference
  • Authentication — limiter runs after auth succeeds
  • Tenant isolation — Redis key namespacing by org_id
  • ADR-0015 — design rationale and race window

Was this page helpful?

Edit on GitHub

Last updated on

PreviousAuthenticationNextRequest routing

On this page

  • How it works
  • Configuration
  • Response headers
  • Rate-limited error envelope
  • Routes subject to limiting
  • Verify the limiter
  • Operational guidance
  • Related
0%