Overview
How the IBEX Harness proxy sits between clients and LLM providers.
The proxy is the public HTTP edge for IBEX Harness. Every protected request passes through authentication, agent identity verification, rate limiting, and request normalization before any provider handoff. In Phase 1 the critical path stops at validation — chat routes return 501 PROVIDER_NOT_CONFIGURED until Phase 2 registers a provider adapter.
Role in the platform
The proxy is stateless by design: it holds no Postgres connection in Phase 1. Identity lookups go through the auth service over gRPC; rate-limit counters live in Redis. That separation keeps the critical path horizontally scalable and avoids multi-master write complexity.
See Architecture overview for the full system diagram and Request lifecycle for the end-to-end flow.
Endpoint surface
| Route | Auth | Purpose |
|---|---|---|
| GET /health | No | Liveness: minimal JSON per ADR-0022 |
| GET /ready | No | Readiness: probes auth_grpc and redis when configured |
| GET /metrics | No | Prometheus text exposition |
| GET /v1/internal/auth-probe | PAT + agent | Returns {org_id, permissions} from validated token |
| GET /v1/orgs/{org_id}/auth-probe | PAT + agent | Same probe; path org_id must match token org |
| POST /v1/chat/completions | PAT + agent + ProxyChatCompletion | OpenAI-compatible chat; 501 until provider configured |
Organization scope for chat comes from the validated token, not the URL path. Cross-tenant path probes return 403 — see Tenant isolation.
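The scope rule can be illustrated with a small check. This is a minimal sketch, not the proxy's code: `TokenClaims`, `check_org_scope`, and the `chat:write` permission string are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class TokenClaims:
    """Hypothetical shape of a validated token; only org_id matters here."""
    org_id: str
    permissions: Tuple[str, ...] = ()


def check_org_scope(claims: TokenClaims, path_org_id: Optional[str]) -> int:
    """Return the HTTP status the scope check alone would produce.

    path_org_id is None for routes without an org segment (such as chat
    completions), where the token alone decides the organization.
    """
    if path_org_id is not None and path_org_id != claims.org_id:
        return 403  # cross-tenant path probe
    return 200


# "chat:write" is an illustrative permission, not a documented one.
claims = TokenClaims(org_id="org_a", permissions=("chat:write",))
own_org = check_org_scope(claims, "org_a")
other_org = check_org_scope(claims, "org_b")
no_path_org = check_org_scope(claims, None)
```

The point of the sketch is that the path value is only ever compared against the token; it never selects the tenant.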
Middleware pipeline
Global middleware wraps every route: metrics → request context → response headers → logging → mux.
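One way to picture that ordering is as nested wrappers, with the first-listed middleware outermost. The sketch below records the order in which each layer sees a request; `tracing`, `chain`, and the dict-based request are illustrative assumptions, not the proxy's actual types.

```python
from typing import Callable, Dict, List

Handler = Callable[[Dict], Dict]
Middleware = Callable[[Handler], Handler]


def tracing(name: str, trace: List[str]) -> Middleware:
    """Build a middleware that only records that it ran, then delegates."""
    def wrap(next_handler: Handler) -> Handler:
        def handler(request: Dict) -> Dict:
            trace.append(name)
            return next_handler(request)
        return handler
    return wrap


def chain(middlewares: List[Middleware], final: Handler) -> Handler:
    """Wrap `final` so that middlewares[0] is the outermost layer."""
    handler = final
    for mw in reversed(middlewares):
        handler = mw(handler)
    return handler


trace: List[str] = []


def mux(request: Dict) -> Dict:
    trace.append("mux")
    return {"status": 200}


names = ["metrics", "request context", "response headers", "logging"]
app = chain([tracing(n, trace) for n in names], mux)
response = app({"path": "/health"})
```

After the call, `trace` lists the layers in the same order as the pipeline description, ending with `mux`.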
Protected routes add route-specific chains. For chat completions, the stages run in this order:
- Body limit and Content-Type: rejects oversize payloads and non-JSON POST bodies before auth runs (ADR-0013).
- Token validation: calls auth ValidateToken over gRPC with a 50ms production budget (ADR-0011).
- Agent verification: requires X-IBEX-Agent-ID and confirms that the agent is active and belongs to the token org (ADR-0016).
- Rate limit: enforces org-level RPM in Redis; fails open when Redis is unavailable (ADR-0015).
- Normalize and validate: parses OpenAI chat JSON; semantic errors return field_errors in the envelope (ADR-0012).
Auth-probe routes skip body limit and Content-Type checks but still run auth, agent verify, and rate limit.
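The two chains differ only in which stages they include, and a request stops at the first stage that rejects it. The following sketch models that with stage names alone; `run_chain` and the `failing` parameter are hypothetical helpers for illustration and say nothing about how the real stages are implemented.

```python
from typing import List, Optional, Tuple

CHAT_CHAIN = [
    "body limit and Content-Type",
    "token validation",
    "agent verification",
    "rate limit",
    "normalize and validate",
]

# Auth-probe routes keep only auth, agent verify, and rate limit.
PROBE_CHAIN = CHAT_CHAIN[1:4]


def run_chain(stages: List[str], failing: Optional[str] = None) -> Tuple[List[str], bool]:
    """Run stages in order and stop at the first one that fails.

    Returns the stages that actually ran and whether the chain completed.
    """
    ran: List[str] = []
    for name in stages:
        ran.append(name)
        if name == failing:
            return ran, False
    return ran, True


ran_chat, chat_ok = run_chain(CHAT_CHAIN, failing="token validation")
ran_probe, probe_ok = run_chain(PROBE_CHAIN)
```

In the chat case the body checks have already run by the time token validation fails, which is why malformed payloads are rejected without a call to the auth service.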
Failure modes
| Dependency | Behavior | HTTP signal |
|---|---|---|
| Auth gRPC down / timeout | Fail closed on token validation | 503 SERVICE_DEGRADED |
| Auth gRPC down on agent verify | Fail closed | 503 AUTH_UNAVAILABLE |
| Redis down | Rate limit skipped (fail-open) | Request proceeds; monitor /ready |
| No provider registered | Expected in Phase 1 | 501 PROVIDER_NOT_CONFIGURED |
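The table can be read as a decision sequence for a chat request. The sketch below is a simplified model under stated assumptions: `chat_outcome` and its flags are hypothetical, and the `RATE_LIMITED` code string is a placeholder because the source only specifies the 429 status, not its error code.

```python
from typing import Tuple


def chat_outcome(
    auth_up_for_token: bool = True,
    auth_up_for_agent: bool = True,
    redis_up: bool = True,
    over_limit: bool = False,
    provider_registered: bool = False,
) -> Tuple[int, str]:
    """Map dependency state to the HTTP signal from the failure-mode table."""
    if not auth_up_for_token:
        return 503, "SERVICE_DEGRADED"      # fail closed on token validation
    if not auth_up_for_agent:
        return 503, "AUTH_UNAVAILABLE"      # fail closed on agent verify
    if redis_up and over_limit:
        return 429, "RATE_LIMITED"          # placeholder code name (assumption)
    # Redis down: the rate-limit check is skipped and the request proceeds.
    if not provider_registered:
        return 501, "PROVIDER_NOT_CONFIGURED"
    return 200, "OK"


phase1_default = chat_outcome()
redis_down = chat_outcome(redis_up=False, over_limit=True)
auth_down = chat_outcome(auth_up_for_token=False)
```

Note the asymmetry: an auth outage blocks the request, while a Redis outage lets even an over-limit caller through, so readiness monitoring is the only signal that limits are not being enforced.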
Full threat-model context: Security overview and Authentication.
Response headers
Every response includes correlation headers (names configurable via env):
- X-Request-ID: UUID v7 when generated; valid inbound v4/v7 UUIDs are honoured (ADR-0017)
- X-Trace-ID: OpenTelemetry trace correlation
- X-Response-Time: server-side duration
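The request ID rule (honour a valid inbound v4 or v7 UUID, otherwise generate a v7) might look like the sketch below. `resolve_request_id` and `new_uuid7` are hypothetical names; the v7 value is assembled by hand from the RFC 9562 bit layout because `uuid.uuid7` is not available in older Python versions. Returning the canonical lowercase form of an inbound ID is also an assumption.

```python
import os
import time
import uuid
from typing import Optional


def new_uuid7() -> uuid.UUID:
    """Build a UUID v7: 48-bit ms timestamp, version, 12 + 62 random bits."""
    ts_ms = (time.time_ns() // 1_000_000) & ((1 << 48) - 1)
    rand = int.from_bytes(os.urandom(10), "big")  # 80 random bits
    rand_a = rand >> 68                           # top 12 bits
    rand_b = rand & ((1 << 62) - 1)               # low 62 bits
    value = (ts_ms << 80) | (0x7 << 76) | (rand_a << 64) | (0b10 << 62) | rand_b
    return uuid.UUID(int=value)


def resolve_request_id(inbound: Optional[str]) -> str:
    """Keep a valid inbound v4/v7 UUID; otherwise mint a fresh v7."""
    if inbound:
        try:
            parsed = uuid.UUID(inbound)
        except ValueError:
            parsed = None
        if parsed is not None and parsed.version in (4, 7):
            return str(parsed)
    return str(new_uuid7())


kept = resolve_request_id("3f2b8c1e-4d5a-4e6f-8a7b-9c0d1e2f3a4b")  # v4, honoured
replaced = resolve_request_id("c232ab00-9414-11ec-b3c8-9e6bdeced846")  # v1, replaced
```

Python's `uuid.UUID` parser also accepts braces and `urn:uuid:` prefixes, so a stricter implementation might additionally compare the input against its canonical form.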
Protected routes with rate limiting enabled also emit X-RateLimit-Limit, X-RateLimit-Remaining, and X-RateLimit-Reset. 429 responses add Retry-After.
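A client can use these headers to decide how long to back off. The sketch below assumes Retry-After carries a number of seconds (HTTP also allows a date form, which this does not handle) and makes no assumption about the units of X-RateLimit-Reset. `retry_delay_seconds`, `remaining_budget`, and the one-second fallback are illustrative choices.

```python
from typing import Mapping, Optional


def _lowered(headers: Mapping[str, str]) -> dict:
    """HTTP header names are case-insensitive; normalise before lookup."""
    return {k.lower(): v for k, v in headers.items()}


def retry_delay_seconds(status: int, headers: Mapping[str, str]) -> Optional[float]:
    """Return how long to wait before retrying, or None if no wait is needed."""
    if status != 429:
        return None
    raw = _lowered(headers).get("retry-after")
    try:
        return max(0.0, float(raw))
    except (TypeError, ValueError):
        return 1.0  # fallback when the header is missing or not numeric


def remaining_budget(headers: Mapping[str, str]) -> Optional[int]:
    """Parse X-RateLimit-Remaining if present and numeric."""
    raw = _lowered(headers).get("x-ratelimit-remaining")
    try:
        return int(raw)
    except (TypeError, ValueError):
        return None


delay = retry_delay_seconds(429, {"Retry-After": "2"})
budget = remaining_budget({"X-RateLimit-Limit": "60", "X-RateLimit-Remaining": "59"})
```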
Verify locally
- Boot dependencies: make compose-dev-up && make db-migrate && make db-seed
- Start auth, then proxy: auth must listen on gRPC 9091 before the proxy starts. See Configuration.
- Health check: curl -s http://localhost:8080/health (expect HTTP 200).
- Smoke test: make dev-smoke exercises probes, auth failures, and the chat stub.
Related guides
- Authentication — headers, error codes, probe examples
- Configuration — env vars and readiness dependencies
- Rate limiting — RPM budgets and Redis keys
- Request routing — normalization rules and chat schema
- Provider adapters — Phase 2 forwarding contract