Request lifecycle

A protected LLM request enters the proxy as an OpenAI-compatible HTTP call. Before any provider handoff, the proxy validates credentials, confirms agent identity, and enforces rate limits. In Phase 1 the pipeline stops after body normalization and returns 501 PROVIDER_NOT_CONFIGURED, so a 501 response confirms that token validation, agent verification, rate limiting, and body normalization all passed.

POST /v1/orgs/{org_id}/chat/completions

OpenAI-compatible chat completions. Phase 1 returns 501 once authentication, agent verification, rate limiting, and body normalization succeed; Phase 2 forwards to a registered provider adapter.

Required headers

Header	Required	Notes
`Authorization`	Yes	`Bearer` + PAT (`ibex_pat_...`)
`X-IBEX-Agent-ID`	Yes	UUID; must belong to `{org_id}` in path
`Content-Type`	Yes (POST)	`application/json`
`X-Request-ID`	No	UUID v7; generated if absent

Lifecycle steps

Request ID assigned

Middleware assigns or validates X-Request-ID (UUID v7) and injects it into the request context for logs, metrics, and gRPC metadata propagation.
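
As a rough illustration of this step, the sketch below accepts a caller-supplied `X-Request-ID` only if it parses as a UUID v7 and otherwise generates one. The names `new_uuid7` and `ensure_request_id` are hypothetical, and replacing an invalid header (rather than rejecting the request) is an assumption, since the behavior for a malformed ID is not stated here. The bit layout follows RFC 9562: a 48-bit Unix millisecond timestamp, the version nibble 7, and variant bits `10`.

```python
import os
import time
import uuid


def new_uuid7() -> uuid.UUID:
    """Build a UUID v7 by hand (hypothetical helper, layout per RFC 9562)."""
    ts_ms = time.time_ns() // 1_000_000
    rand = int.from_bytes(os.urandom(10), "big")  # 80 random bits
    # Timestamp fills the top 48 bits; random bits fill the remaining 80.
    value = ((ts_ms & ((1 << 48) - 1)) << 80) | rand
    # Overwrite the 4 version bits (bits 76-79) with 0b0111.
    value = (value & ~(0xF << 76)) | (0x7 << 76)
    # Overwrite the 2 variant bits (bits 62-63) with 0b10.
    value = (value & ~(0x3 << 62)) | (0x2 << 62)
    return uuid.UUID(int=value)


def ensure_request_id(headers: dict) -> str:
    """Return a valid UUID v7 request ID, reusing the caller's if acceptable.

    Simplification: real HTTP header lookup is case-insensitive.
    Assumption: an invalid or non-v7 value is replaced, not rejected.
    """
    raw = headers.get("X-Request-ID")
    if raw is not None:
        try:
            parsed = uuid.UUID(raw)
        except ValueError:
            parsed = None
        if parsed is not None and parsed.version == 7:
            return str(parsed)
    return str(new_uuid7())
```

The returned string is what a middleware would place into the request context for logs, metrics, and gRPC metadata.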

Bearer token validated

Proxy calls auth ValidateToken over gRPC with a 50ms deadline. On success, org_id, permissions, and token_id attach to context. Missing token → 401; auth down → 503 fail-closed per ADR-0011.
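
The outcome mapping for this step can be sketched as follows. This is not the proxy's implementation: `authenticate`, `AuthResult`, `InvalidToken`, and the `validate_token(token, timeout)` callable are hypothetical stand-ins for the gRPC client, and treating a non-Bearer `Authorization` value as a missing token is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Optional

AUTH_DEADLINE_SECONDS = 0.050  # 50ms budget from the step above


class InvalidToken(Exception):
    """Hypothetical: raised by the auth client for an invalid or revoked PAT."""


@dataclass
class AuthResult:
    status: int              # 200 here means "continue the pipeline"
    code: Optional[str]      # error code on failure, None on success
    claims: Optional[dict]   # org_id, permissions, token_id on success


def authenticate(headers: dict,
                 validate_token: Callable[[str, float], dict]) -> AuthResult:
    """Map token validation outcomes to the documented HTTP statuses."""
    scheme, _, token = headers.get("Authorization", "").partition(" ")
    if scheme != "Bearer" or not token:
        # Assumption: a malformed header is reported like a missing one.
        return AuthResult(401, "MISSING_TOKEN", None)
    try:
        claims = validate_token(token, AUTH_DEADLINE_SECONDS)
    except InvalidToken:
        return AuthResult(401, "INVALID_TOKEN", None)
    except (TimeoutError, ConnectionError):
        # Fail closed: if auth cannot answer in time, the request is refused.
        return AuthResult(503, "SERVICE_DEGRADED", None)
    return AuthResult(200, None, claims)
```

The point of the sketch is the fail-closed branch: a timeout or connection failure never lets a request through unauthenticated.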

Agent identity verified

Proxy requires X-IBEX-Agent-ID and confirms the agent belongs to the org in the URL. Cross-org or unknown agent → 403 before the body is read.
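
A minimal sketch of the ownership check, under stated assumptions: `verify_agent` and the `lookup_agent_org` callable are hypothetical (the latter stands in for the agent lookup against the auth service), and the sketch assumes that a missing or non-UUID header and a token whose org differs from the path org are rejected with the same 403 as a cross-org agent.

```python
import uuid
from typing import Callable, Optional, Tuple

DENIED = (403, "INSUFFICIENT_PERMISSIONS")


def verify_agent(path_org_id: str,
                 token_org_id: str,
                 agent_id_header: Optional[str],
                 lookup_agent_org: Callable[[str], Optional[str]]
                 ) -> Optional[Tuple[int, str]]:
    """Return None to continue the pipeline, or (status, code) to reject."""
    # Assumption: the token's org must match the org in the URL path.
    if path_org_id != token_org_id:
        return DENIED
    # Assumption: an absent or malformed agent ID is treated as unknown.
    try:
        agent_id = uuid.UUID(agent_id_header or "")
    except ValueError:
        return DENIED
    owner = lookup_agent_org(str(agent_id))
    if owner is None or owner != path_org_id:
        return DENIED  # unknown agent, or agent belongs to another org
    return None
```

Because the function only needs headers and path parameters, it can run before the request body is read.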

Rate limit checked

Redis sliding-window counter keyed by org_id. Exceeded → 429 with Retry-After. Redis unavailable → fail-open with conservative local limits and audit warning.
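
To make the sliding-window and fail-open behavior concrete, here is an in-memory sketch. It does not use Redis: `SlidingWindowLimiter` and `check_rate_limit` are hypothetical names, the window is a simple timestamp log per `org_id`, and a `ConnectionError` from the shared limiter stands in for "Redis unavailable". The limits and the 60-second window are illustrative values only.

```python
import math
import time
from collections import defaultdict, deque
from typing import Optional, Tuple


class SlidingWindowLimiter:
    """Sliding-window log: keeps one timestamp per accepted request."""

    def __init__(self, limit: int, window_seconds: float = 60.0):
        self.limit = limit
        self.window = window_seconds
        self._hits = defaultdict(deque)  # org_id -> timestamps, oldest first

    def check(self, org_id: str, now: Optional[float] = None) -> Tuple[bool, int]:
        """Return (allowed, retry_after_seconds)."""
        now = time.monotonic() if now is None else now
        hits = self._hits[org_id]
        # Drop timestamps that have slid out of the window.
        while hits and hits[0] <= now - self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            # The oldest hit leaving the window frees the next slot.
            retry_after = math.ceil(hits[0] + self.window - now)
            return False, max(retry_after, 1)
        hits.append(now)
        return True, 0


def check_rate_limit(org_id, shared, local, now=None):
    """Return (allowed, retry_after, degraded).

    Fail open: if the shared limiter is unreachable, fall back to a
    conservative local limiter and flag the decision for an audit warning.
    """
    try:
        allowed, retry_after = shared.check(org_id, now)
        return allowed, retry_after, False
    except ConnectionError:
        allowed, retry_after = local.check(org_id, now)
        return allowed, retry_after, True
```

When `allowed` is false, `retry_after` is the value a handler would send in the `Retry-After` header alongside the 429.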

Body normalized

JSON parsed and validated against the OpenAI chat schema. Malformed input → 400 with stable error envelope including request_id.
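
The sketch below shows the general shape of this step. It checks only a small part of the chat schema (a `model` string and a non-empty `messages` list whose items carry a string `role`). The envelope layout and the `INVALID_REQUEST` code are placeholders invented for illustration; neither is defined on this page. The only property taken from the text above is that the 400 response includes `request_id`.

```python
import json
from typing import Any, Dict, Tuple


def normalize_body(raw: bytes, request_id: str) -> Tuple[int, Dict[str, Any]]:
    """Return (200, parsed_body) or (400, error_envelope)."""

    def reject(message: str) -> Tuple[int, Dict[str, Any]]:
        # Hypothetical envelope shape and code; only request_id is documented.
        return 400, {"error": {"code": "INVALID_REQUEST",
                               "message": message,
                               "request_id": request_id}}

    try:
        body = json.loads(raw)
    except ValueError:  # covers JSONDecodeError and invalid UTF-8
        return reject("body is not valid JSON")
    if not isinstance(body, dict):
        return reject("body must be a JSON object")
    if not isinstance(body.get("model"), str) or not body["model"]:
        return reject("'model' must be a non-empty string")
    messages = body.get("messages")
    if not isinstance(messages, list) or not messages:
        return reject("'messages' must be a non-empty array")
    for index, message in enumerate(messages):
        if not isinstance(message, dict) or not isinstance(message.get("role"), str):
            return reject(f"messages[{index}] must be an object with a string 'role'")
    return 200, body
```

Returning the same envelope shape from every rejection path is what keeps the error contract stable for clients.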

Provider handoff (Phase 2+)

Context assembly, memory injection, and streaming forward to the LLM provider. Phase 1 stops here with 501 PROVIDER_NOT_CONFIGURED.

Sequence diagram

+--------+                                    +-------+                           +-------+              +------+  +--------------+   
| Client |                                    | Proxy |                           | Redis |              | Auth |  | LLM Provider |   
+--------+                                    +-------+                           +-------+              +------+  +--------------+   
     |                                            |                                   |                      |             |          
     |  POST /v1/orgs/{org_id}/chat/completions   |                                   |                      |             |          
     |-------------------------------------------->                                   |                      |             |          
     |                                            |                                   |                      |             |          
     |                                  +-------------------+                         |                      |             |          
     |                                  | Assign request_id |                         |                      |             |          
     |                                  +-------------------+                         |                      |             |          
     |                                            |                                   |                      |             |          
     |                                            |            gRPC ValidateToken (50ms budget)              |             |          
     |                                            |---------------------------------------------------------->             |          
     |                                            |                                   |                      |             |          
     |                                            |              org_id, permissions, token_id               |             |          
     |                                            <..........................................................|             |          
     |                                            |                                   |                      |             |          
     |                                            |          gRPC ValidateAgent (agent_id, org_id)           |             |          
     |                                            |---------------------------------------------------------->             |          
     |                                            |                                   |                      |             |          
     |                                            |            agent record or PERMISSION_DENIED             |             |          
     |                                            <..........................................................|             |          
     |                                            |                                   |                      |             |          
     |                                            |  INCR ratelimit:{org_id}:minute   |                      |             |          
     |                                            |----------------------------------->                      |             |          
     |                                            |                                   |                      |             |          
     |                                            |         allowed / denied          |                      |             |          
     |                                            <...................................|                      |             |          
     |                                            |                                   |                      |             |          
     |                                 +---------------------+                        |                      |             |          
     |                                 | Normalize JSON body |                        |                      |             |          
     |                                 +---------------------+                        |                      |             |          
     |                                            |                                   |                      |             |          
 +alt [Phase 1]----------------------------------------------------------------------------------------------------------------+      
 |   |                                            |                                   |                      |             |   |      
 |   |        501 PROVIDER_NOT_CONFIGURED         |                                   |                      |             |   |      
 |   <............................................|                                   |                      |             |   |      
 |   |                                            |                                   |                      |             |   |      
 +[Phase 2 target]-------------------------------------------------------------------------------------------------------------+      
 |   |                                            |                                   |                      |             |   |      
 |   |                                            +---+                               |                      |             |   |      
 |   |                                            |   | Context retrieval (40ms deadline)                    |             |   |      
 |   |                                            <---+                               |                      |             |   |      
 |   |                                            |                                   |                      |             |   |      
 |   |                                            |                       Forward augmented request          |             |   |      
 |   |                                            |------------------------------------------------------------------------>   |      
 |   |                                            |                                   |                      |             |   |      
 |   |                                            |                             Stream tokens                |             |   |      
 |   |                                            <........................................................................|   |      
 |   |                                            |                                   |                      |             |   |      
 |   |              Stream response               |                                   |                      |             |   |      
 |   <............................................|                                   |                      |             |   |      
 |   |                                            |                                   |                      |             |   |      
 |   |                      +-------------------------------------------+             |                      |             |   |      
 |   |                      | Async: trace → ClickHouse, extraction job |             |                      |             |   |      
 |   |                      +-------------------------------------------+             |                      |             |   |      
 |   |                                            |                                   |                      |             |   |      
 +-----------------------------------------------------------------------------------------------------------------------------+      
     |                                            |                                   |                      |             |          
+--------+                                    +-------+                           +-------+              +------+  +--------------+   
| Client |                                    | Proxy |                           | Redis |              | Auth |  | LLM Provider |   
+--------+                                    +-------+                           +-------+              +------+  +--------------+

The steps in the Phase 2 target branch (context retrieval, provider streaming, async jobs) are specified in engineering docs but not executed in the current release.

Phase 1 probe

```bash
curl -s -w "\nHTTP %{http_code}\n" \
  -X POST "http://localhost:8080/v1/orgs/${IBEX_DEV_ORG_ID}/chat/completions" \
  -H "Authorization: Bearer ${IBEX_DEV_TOKEN}" \
  -H "X-IBEX-Agent-ID: ${IBEX_DEV_AGENT_ID}" \
  -H "Content-Type: application/json" \
  -d '{"model":"gpt-4o","messages":[{"role":"user","content":"ping"}]}'
```

Expected: HTTP 501 with PROVIDER_NOT_CONFIGURED. This confirms that token validation, agent verification, rate limiting, and body normalization all succeeded.

Error mapping

Condition	HTTP	Code
Missing `Authorization`	401	`MISSING_TOKEN`
Invalid or revoked PAT	401	`INVALID_TOKEN`
Agent not in org / path mismatch	403	`INSUFFICIENT_PERMISSIONS`
Rate limit exceeded	429	`RATE_LIMIT_EXCEEDED`
Auth gRPC timeout or unavailable	503	`SERVICE_DEGRADED`
No provider configured	501	`PROVIDER_NOT_CONFIGURED`

For the full error envelope, see API errors.
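
When running the Phase 1 probe, the table above can be read in reverse to find which lifecycle step rejected a request. The helper below is a small client-side sketch of that lookup; `diagnose` and `REJECTING_STEP` are hypothetical names, and the step labels are the lifecycle headings from this page.

```python
# (HTTP status, error code) -> lifecycle step that produced it.
REJECTING_STEP = {
    (401, "MISSING_TOKEN"): "Bearer token validated",
    (401, "INVALID_TOKEN"): "Bearer token validated",
    (503, "SERVICE_DEGRADED"): "Bearer token validated",
    (403, "INSUFFICIENT_PERMISSIONS"): "Agent identity verified",
    (429, "RATE_LIMIT_EXCEEDED"): "Rate limit checked",
}


def diagnose(status: int, code: str) -> str:
    """Describe where in the Phase 1 pipeline a probe response came from."""
    if (status, code) == (501, "PROVIDER_NOT_CONFIGURED"):
        return "all Phase 1 checks passed"
    step = REJECTING_STEP.get((status, code))
    if step is None:
        return f"unmapped response: HTTP {status} {code}"
    return f"rejected at step: {step}"
```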

Target path (Phase 2+)

Once provider adapters and context assembly ship, the synchronous path extends:

Parallel context retrieval (40ms deadline) — directive from Redis, hot memories, recent session history
Context assembly gRPC — rank and pack memories within model token budget
Provider forward — augment messages, stream response to client while accumulating for async extraction
Async side effects — ClickHouse trace, memory extraction job, session heartbeat update
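
The parallel retrieval step could look roughly like the following, using `asyncio`. This is a sketch under assumptions, not the planned implementation: `retrieve_context` and the fetcher names are hypothetical, and continuing with whatever sources finished inside the 40ms deadline (instead of failing the request) is an assumed policy that this page does not specify.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict

CONTEXT_DEADLINE_SECONDS = 0.040  # 40ms deadline from the list above


async def retrieve_context(
    sources: Dict[str, Callable[[], Awaitable[Any]]],
    deadline: float = CONTEXT_DEADLINE_SECONDS,
) -> Dict[str, Any]:
    """Run all fetchers concurrently and keep the ones that finish in time."""
    if not sources:
        return {}  # asyncio.wait rejects an empty collection
    tasks = {name: asyncio.create_task(fetch()) for name, fetch in sources.items()}
    done, pending = await asyncio.wait(list(tasks.values()), timeout=deadline)
    for task in pending:
        task.cancel()  # stop work that missed the deadline
    results: Dict[str, Any] = {}
    for name, task in tasks.items():
        # Skip sources that timed out, were cancelled, or raised.
        if task in done and not task.cancelled() and task.exception() is None:
            results[name] = task.result()
    return results
```

A caller would pass one coroutine function per source, for example the directive lookup, hot memories, and recent session history, and hand the resulting dictionary to context assembly.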

Auth validation in Phase 2 may add an optional bloom filter + LRU cache (ADR-0011 deferral record); Phase 1 always calls gRPC.

Architecture decisions

Topic	ADR
Proxy → auth gRPC client	ADR-0011
Token validation contract	ADR-0007
Permission bitmap	ADR-0009
Rate limit skeleton	ADR-0015
Agent identity verification	ADR-0016
Request ID propagation	ADR-0017
