ibexharness
DocsBlogReleasesRoadmap
GitHub
ibexharness

Documentation

OverviewConfigurationAuthenticationRate limitingRequest routingProvider adapters
Proxy›Overview
Proxy

Overview

How the IBEX Harness proxy sits between clients and LLM providers.

The proxy is the public HTTP edge for IBEX Harness. Every protected request passes through authentication, agent identity verification, rate limiting, and request normalization before any provider handoff. In Phase 1 the critical path stops at validation — chat routes return 501 PROVIDER_NOT_CONFIGURED until Phase 2 registers a provider adapter.

Phase 1 scope

Auth, agent verify, semantic validation, and org-level rate limits are production-ready today. LLM forwarding, context injection, and memory retrieval are not wired yet. Track live status on current state.

Role in the platform

The proxy is stateless by design: it holds no Postgres connection in Phase 1. Identity lookups go through the auth service over gRPC; rate-limit counters live in Redis. That separation keeps the critical path horizontally scalable and avoids multi-master write complexity.

See Architecture overview for the full system diagram and Request lifecycle for the end-to-end flow.

Mermaid diagram: flowchart LR
+-------------+       +-------------+                        +-----------------+     +----------+
|             |       |             |                        |                 |     |          |
| Agent / SDK |-HTTPS>| Proxy :8080 |     ---ValidateToken-->| Auth gRPC :9091 |---->| Postgres |
|             |       |             |                        |                 |     |          |
+-------------+       +-------------+                        +-----------------+     +----------+
                             :                                                                   
                             :                                                                   
                             :                                                                   
                             :                                                                   
                             :                                                                   
                             :                               +-----------------+                 
                             :                               |                 |                 
                             +-----------INCR-rpm----------->|      Redis      |                 
                             :                               |                 |                 
                             :                               +-----------------+                 
                             :                                                                   
                             :                                                                   
                             :                                                                   
                             :                                                                   
                             :                                                                   
                             :                               +-----------------+                 
                             :                               |                 |                 
                             +............Phase.2...........>|   LLM provider  |                 
                                                             |                 |                 
                                                             +-----------------+                 

Endpoint surface

RouteAuthPurpose
GET /healthNoLiveness — minimal JSON per ADR-0022
GET /readyNoReadiness — probes auth_grpc and redis when configured
GET /metricsNoPrometheus text exposition
GET /v1/internal/auth-probePAT + agentReturns {org_id, permissions} from validated token
GET /v1/orgs/{org_id}/auth-probePAT + agentSame probe; path org_id must match token org
POST /v1/chat/completionsPAT + agent + ProxyChatCompletionOpenAI-compatible chat; 501 until provider configured

Organization scope for chat comes from the validated token, not the URL path. Cross-tenant path probes return 403 — see Tenant isolation.

Middleware pipeline

Global middleware wraps every route: metrics → request context → response headers → logging → mux.

Protected routes add route-specific chains. For chat completions:

1

Body limit and Content-Type

Rejects oversize payloads and non-JSON POST bodies before auth runs (ADR-0013).

2

Token validation

Calls auth ValidateToken over gRPC with a 50ms production budget (ADR-0011).

3

Agent verification

Requires X-IBEX-Agent-ID; confirms active agent belongs to token org (ADR-0016).

4

Rate limit

Org-level RPM in Redis; fail-open when Redis is unavailable (ADR-0015).

5

Normalize and validate

Parses OpenAI chat JSON; semantic errors return field_errors in the envelope (ADR-0012).

Auth-probe routes skip body limit and Content-Type checks but still run auth, agent verify, and rate limit.

Failure modes

DependencyBehaviorHTTP signal
Auth gRPC down / timeoutFail closed on token validation503 SERVICE_DEGRADED
Auth gRPC down on agent verifyFail closed503 AUTH_UNAVAILABLE
Redis downRate limit skipped (fail-open)Request proceeds; monitor /ready
No provider registeredExpected in Phase 1501 PROVIDER_NOT_CONFIGURED

Full threat-model context: Security overview and Authentication.

Response headers

Every response includes correlation headers (names configurable via env):

  • X-Request-ID — UUID v7 when generated; valid inbound v4/v7 UUIDs are honoured (ADR-0017)
  • X-Trace-ID — OpenTelemetry trace correlation
  • X-Response-Time — server-side duration

Protected routes with rate limiting enabled also emit X-RateLimit-Limit, X-RateLimit-Remaining, and X-RateLimit-Reset. 429 responses add Retry-After.

Verify locally

1

Boot dependencies

make compose-dev-up && make db-migrate && make db-seed

2

Start auth then proxy

Auth must listen on gRPC 9091 before the proxy starts. See Configuration.

3

Health check

curl -s http://localhost:8080/health — expect HTTP 200.

4

Smoke test

make dev-smoke exercises probes, auth failures, and the chat stub.

Related guides

  • Authentication — headers, error codes, probe examples
  • Configuration — env vars and readiness dependencies
  • Rate limiting — RPM budgets and Redis keys
  • Request routing — normalization rules and chat schema
  • Provider adapters — Phase 2 forwarding contract

Was this page helpful?

Edit on GitHub

Last updated on

PreviousRequest lifecycleNextConfiguration

On this page

  • Role in the platform
  • Endpoint surface
  • Middleware pipeline
  • Failure modes
  • Response headers
  • Verify locally
  • Related guides
0%