Overview

The proxy is the public HTTP edge for IBEX Harness. Every protected request passes through authentication, agent identity verification, rate limiting, and request normalization before any provider handoff. In Phase 1 the critical path stops at validation — chat routes return 501 PROVIDER_NOT_CONFIGURED until Phase 2 registers a provider adapter.

Role in the platform

The proxy is stateless by design: it holds no Postgres connection in Phase 1. Identity lookups go through the auth service over gRPC; rate-limit counters live in Redis. That separation keeps the critical path horizontally scalable and avoids multi-master write complexity.

See Architecture overview for the full system diagram and Request lifecycle for the end-to-end flow.

+-------------+       +-------------+                        +-----------------+     +----------+
|             |       |             |                        |                 |     |          |
| Agent / SDK |-HTTPS>| Proxy :8080 |------ValidateToken---->| Auth gRPC :9091 |---->| Postgres |
|             |       |             |                        |                 |     |          |
+-------------+       +-------------+                        +-----------------+     +----------+
                             :                                                                   
                             :                                                                   
                             :                                                                   
                             :                                                                   
                             :                                                                   
                             :                               +-----------------+                 
                             :                               |                 |                 
                             +-----------INCR-rpm----------->|      Redis      |                 
                             :                               |                 |                 
                             :                               +-----------------+                 
                             :                                                                   
                             :                                                                   
                             :                                                                   
                             :                                                                   
                             :                                                                   
                             :                               +-----------------+                 
                             :                               |                 |                 
                             +............Phase.2...........>|   LLM provider  |                 
                                                             |                 |                 
                                                             +-----------------+

Endpoint surface

Route	Auth	Purpose
`GET /health`	No	Liveness — minimal JSON per ADR-0022
`GET /ready`	No	Readiness — probes `auth_grpc` and `redis` when configured
`GET /metrics`	No	Prometheus text exposition
`GET /v1/internal/auth-probe`	PAT + agent	Returns `{org_id, permissions}` from validated token
`GET /v1/orgs/{org_id}/auth-probe`	PAT + agent	Same probe; path `org_id` must match token org
`POST /v1/chat/completions`	PAT + agent + `ProxyChatCompletion`	OpenAI-compatible chat; 501 until provider configured

Organization scope for chat comes from the validated token, not the URL path. Cross-tenant path probes return 403 — see Tenant isolation.
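
The scoping rule above can be sketched in a few lines. This is an illustration only: `TokenClaims`, `authorize_org_path`, and the `ORG_MISMATCH` code are hypothetical names, not taken from the IBEX codebase. Only the 403 status and the rule that the token's org is authoritative come from this page.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class TokenClaims:
    """Hypothetical shape of a validated token: org plus granted permissions."""
    org_id: str
    permissions: Tuple[str, ...]


def authorize_org_path(claims: TokenClaims, path_org_id: Optional[str]) -> Tuple[int, str]:
    """Return (status, effective org id) or (403, error code).

    Routes without an org in the path (such as chat) pass None and are
    scoped purely by the token. Routes with a path org must match it.
    """
    if path_org_id is not None and path_org_id != claims.org_id:
        return 403, "ORG_MISMATCH"  # assumed error code, for illustration
    return 200, claims.org_id
```

A chat request would call `authorize_org_path(claims, None)`, while the org-scoped probe would pass the `{org_id}` path segment.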

Middleware pipeline

Global middleware wraps every route: metrics → request context → response headers → logging → mux.
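
To make the wrapping order concrete, here is a minimal sketch of how such a chain composes. The helpers `tracing` and `chain` are hypothetical stand-ins that only record when each layer runs; they are not the proxy's actual middleware.

```python
from typing import Callable, Dict, List

Request = Dict[str, object]
Response = Dict[str, object]
Handler = Callable[[Request], Response]
Middleware = Callable[[Handler], Handler]


def tracing(name: str, trace: List[str]) -> Middleware:
    """Build a stand-in middleware that records entry and exit."""
    def middleware(next_handler: Handler) -> Handler:
        def handler(request: Request) -> Response:
            trace.append(f"enter {name}")
            response = next_handler(request)
            trace.append(f"exit {name}")
            return response
        return handler
    return middleware


def chain(middlewares: List[Middleware], final: Handler) -> Handler:
    """Wrap `final` so the first middleware in the list is outermost."""
    handler = final
    for middleware in reversed(middlewares):
        handler = middleware(handler)
    return handler


trace: List[str] = []
order = ["metrics", "request context", "response headers", "logging"]


def mux(request: Request) -> Response:
    """Stand-in router at the innermost position."""
    trace.append("mux")
    return {"status": 200}


app = chain([tracing(name, trace) for name in order], mux)
response = app({"path": "/health"})
```

Because metrics is outermost, it is the first layer to see the request and the last to see the response, so it can time everything beneath it.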

Protected routes add route-specific chains. For chat completions:

Body limit and Content-Type

Rejects oversize payloads and non-JSON POST bodies before auth runs (ADR-0013).

Token validation

Calls the auth service's ValidateToken RPC over gRPC with a 50 ms production budget (ADR-0011).

Agent verification

Requires the X-IBEX-Agent-ID header; confirms that the agent is active and belongs to the token's org (ADR-0016).

Rate limit

Enforces an org-level requests-per-minute (RPM) limit with counters in Redis; fails open when Redis is unavailable (ADR-0015).

Normalize and validate

Parses OpenAI chat JSON; semantic errors return field_errors in the envelope (ADR-0012).
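
As a rough illustration of the last step, the sketch below collects per-field errors for a chat body instead of stopping at the first problem. The function name `validate_chat_body`, the error-entry shape, and the role whitelist are assumptions. The actual schema and envelope are defined in ADR-0012 and the Request routing page, and this sketch covers only validation, not normalization.

```python
import json
from typing import Any, Dict, List, Tuple

ALLOWED_ROLES = {"system", "user", "assistant", "tool"}  # assumed subset of roles


def validate_chat_body(raw: bytes) -> Tuple[Dict[str, Any], List[Dict[str, str]]]:
    """Parse a chat completion body and return (body, field_errors)."""
    try:
        body = json.loads(raw)
    except (json.JSONDecodeError, UnicodeDecodeError):
        return {}, [{"field": "body", "message": "invalid JSON"}]
    if not isinstance(body, dict):
        return {}, [{"field": "body", "message": "must be a JSON object"}]

    errors: List[Dict[str, str]] = []

    model = body.get("model")
    if not isinstance(model, str) or not model.strip():
        errors.append({"field": "model", "message": "must be a non-empty string"})

    messages = body.get("messages")
    if not isinstance(messages, list) or not messages:
        errors.append({"field": "messages", "message": "must be a non-empty array"})
    else:
        for index, message in enumerate(messages):
            if not isinstance(message, dict):
                errors.append({"field": f"messages[{index}]", "message": "must be an object"})
                continue
            role = message.get("role")
            if not isinstance(role, str) or role not in ALLOWED_ROLES:
                errors.append({"field": f"messages[{index}].role", "message": "unknown role"})

    return body, errors
```

Returning every error at once lets a client fix a malformed request in one round trip.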

Auth-probe routes skip body limit and Content-Type checks but still run auth, agent verify, and rate limit.
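
The difference between the chat chain and the probe chain can be sketched as two ordered lists of steps that short-circuit on the first rejection. Everything named here is hypothetical: `MAX_BODY_BYTES` is an assumed limit, the 413 and 415 statuses are the conventional HTTP choices rather than values confirmed by this page, and the later steps are placeholders that only record that they ran.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

MAX_BODY_BYTES = 1_048_576  # assumed limit; the real value is not stated here


@dataclass
class Request:
    headers: Dict[str, str] = field(default_factory=dict)
    body: bytes = b""
    seen: List[str] = field(default_factory=list)  # steps that ran, for the demo


Rejection = Tuple[int, str]
Step = Callable[[Request], Optional[Rejection]]  # None means "continue"


def body_limit_and_content_type(request: Request) -> Optional[Rejection]:
    """Reject oversize or non-JSON bodies before any auth call is made."""
    if len(request.body) > MAX_BODY_BYTES:
        return 413, "PAYLOAD_TOO_LARGE"
    # Header lookup is case-sensitive here for brevity; real servers normalize names.
    media_type = request.headers.get("Content-Type", "").split(";")[0].strip().lower()
    if media_type != "application/json":
        return 415, "UNSUPPORTED_MEDIA_TYPE"
    return None


def passing_step(name: str) -> Step:
    """Placeholder for a later step; it records its name and lets the request on."""
    def step(request: Request) -> Optional[Rejection]:
        request.seen.append(name)
        return None
    return step


CHAT_CHAIN: List[Step] = [
    body_limit_and_content_type,
    passing_step("token validation"),
    passing_step("agent verification"),
    passing_step("rate limit"),
    passing_step("normalize and validate"),
]
# Probes carry no body, so they keep only auth, agent verify, and rate limit.
PROBE_CHAIN: List[Step] = CHAT_CHAIN[1:4]


def run_chain(chain: List[Step], request: Request) -> Optional[Rejection]:
    """Run steps in order and stop at the first rejection."""
    for step in chain:
        rejection = step(request)
        if rejection is not None:
            return rejection
    return None
```

Placing the cheap body checks first means a malformed request never costs a gRPC round trip to the auth service.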

Failure modes

Dependency	Behavior	HTTP signal
Auth gRPC down / timeout	Fail closed on token validation	`503 SERVICE_DEGRADED`
Auth gRPC down on agent verify	Fail closed	`503 AUTH_UNAVAILABLE`
Redis down	Rate limit skipped (fail-open)	Request proceeds; monitor `/ready`
No provider registered	Expected in Phase 1	`501 PROVIDER_NOT_CONFIGURED`
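
The contrast between fail-closed and fail-open in the table can be sketched as follows. This is an assumption-laden illustration: `DependencyDown` stands in for whatever gRPC or Redis client errors the proxy actually handles, the fixed one-minute window is one possible reading of the INCR-based counter, and the `rpm:{org}:{window}` key layout is hypothetical.

```python
import time
from typing import Callable, Dict, NoReturn, Optional, Tuple


class DependencyDown(Exception):
    """Stand-in for a gRPC or Redis connection error or timeout."""


def validate_or_fail_closed(validate: Callable[[str], str], token: str) -> Tuple[int, str]:
    """Token validation: a dependency failure rejects the request."""
    try:
        return 200, validate(token)
    except DependencyDown:
        return 503, "SERVICE_DEGRADED"


def allow_or_fail_open(incr: Callable[[str], int], org_id: str, limit_rpm: int,
                       now: Optional[float] = None) -> bool:
    """Fixed-window RPM check: a counter failure lets the request through."""
    now = time.time() if now is None else now
    key = f"rpm:{org_id}:{int(now // 60)}"  # hypothetical key layout
    try:
        return incr(key) <= limit_rpm
    except DependencyDown:
        return True  # fail-open: availability wins over strict limiting


class FakeCounter:
    """In-memory stand-in for Redis INCR, for demonstration only."""

    def __init__(self) -> None:
        self.counts: Dict[str, int] = {}

    def __call__(self, key: str) -> int:
        self.counts[key] = self.counts.get(key, 0) + 1
        return self.counts[key]


def unavailable(_arg: str) -> NoReturn:
    """Simulate an unreachable dependency."""
    raise DependencyDown("dependency unreachable")
```

With these stand-ins, `allow_or_fail_open(unavailable, "org_a", 2)` returns `True`, while `validate_or_fail_closed(unavailable, "pat")` returns `(503, "SERVICE_DEGRADED")`. Because a fail-open limiter gives no signal on the request itself, the table points operators to `/ready` instead.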

Full threat-model context: Security overview and Authentication.

Response headers

Every response includes correlation headers (names configurable via env):

X-Request-ID — UUID v7 when generated; valid inbound v4/v7 UUIDs are honoured (ADR-0017)
X-Trace-ID — OpenTelemetry trace correlation
X-Response-Time — server-side duration
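
The X-Request-ID rule can be sketched as below. This is an illustration under stated assumptions, not the proxy's code: `new_uuid7` and `resolve_request_id` are hypothetical names, and the v7 value is assembled by hand (48-bit Unix millisecond timestamp followed by random bits) because the standard `uuid` module only gained a v7 generator in recent Python releases.

```python
import os
import time
import uuid
from typing import Optional


def new_uuid7() -> str:
    """Build a UUID v7: 48-bit Unix ms timestamp, then version, variant, random bits."""
    unix_ms = time.time_ns() // 1_000_000
    value = (unix_ms & 0xFFFFFFFFFFFF) << 80          # timestamp in the top 48 bits
    value |= int.from_bytes(os.urandom(10), "big")    # 80 random bits below it
    value = (value & ~(0xF << 76)) | (0x7 << 76)      # set version nibble to 7
    value = (value & ~(0x3 << 62)) | (0x2 << 62)      # set RFC variant bits to 10
    return str(uuid.UUID(int=value))


def resolve_request_id(inbound: Optional[str]) -> str:
    """Honour a valid inbound v4 or v7 UUID, otherwise generate a v7."""
    if inbound:
        try:
            parsed = uuid.UUID(inbound)
        except ValueError:
            parsed = None
        if (parsed is not None
                and parsed.variant == uuid.RFC_4122
                and parsed.version in (4, 7)):
            return str(parsed)  # canonical lowercase, hyphenated form
    return new_uuid7()
```

Anything that is not a well-formed v4 or v7 UUID is replaced rather than echoed, which keeps arbitrary client strings out of logs and traces.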

Protected routes with rate limiting enabled also emit X-RateLimit-Limit, X-RateLimit-Remaining, and X-RateLimit-Reset. 429 responses add Retry-After.
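
A minimal sketch of how these headers might be derived from a fixed one-minute window follows. The function name `rate_limit_headers` is hypothetical, and the semantics of X-RateLimit-Reset (here, seconds until the window rolls over) are an assumption; the Rate limiting page is the authority on the actual values.

```python
from typing import Dict


def rate_limit_headers(limit: int, used: int, now: float,
                       window_seconds: int = 60) -> Dict[str, str]:
    """Compute rate-limit headers for a request counted in a fixed window.

    `used` is the counter value after this request was counted.
    """
    remaining = max(limit - used, 0)
    reset_in = window_seconds - int(now % window_seconds)
    headers = {
        "X-RateLimit-Limit": str(limit),
        "X-RateLimit-Remaining": str(remaining),
        "X-RateLimit-Reset": str(reset_in),  # assumed: seconds until the window resets
    }
    if used > limit:
        # Over budget: this request gets a 429, so tell the client when to retry.
        headers["Retry-After"] = str(reset_in)
    return headers
```

For example, 40 of 100 requests used at 15 seconds into the window gives a remaining count of 60 and a reset of 45 seconds, with no Retry-After.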

Verify locally

Boot dependencies

make compose-dev-up && make db-migrate && make db-seed

Start auth then proxy

The auth service must be listening on gRPC port 9091 before the proxy starts. See Configuration.

Health check

curl -i http://localhost:8080/health should return HTTP 200 with a minimal JSON body (-i prints the status line, which plain curl -s omits).

Smoke test

make dev-smoke exercises probes, auth failures, and the chat stub.

See also

Authentication — headers, error codes, probe examples
Configuration — env vars and readiness dependencies
Rate limiting — RPM budgets and Redis keys
Request routing — normalization rules and chat schema
Provider adapters — Phase 2 forwarding contract
