ibexharness

Request lifecycle

End-to-end flow for a protected proxy request: authentication, agent verification, rate limiting, and Phase 2 forwarding.

A protected LLM request enters the proxy as an OpenAI-compatible HTTP call. Before any provider handoff, the proxy validates credentials, confirms agent identity, and enforces rate limits. In Phase 1 the pipeline stops after body normalization and returns 501 PROVIDER_NOT_CONFIGURED. A 501 is therefore the expected success signal: it means token validation, agent verification, rate limiting, and body normalization all passed.

POST /v1/orgs/{org_id}/chat/completions

OpenAI-compatible chat completions. Phase 1 returns 501 after auth succeeds; Phase 2 forwards to a registered provider adapter.

Required headers

| Header | Required | Notes |
| --- | --- | --- |
| Authorization | Yes | Bearer + PAT (ibex_pat_...) |
| X-IBEX-Agent-ID | Yes | UUID; must belong to {org_id} in path |
| Content-Type | Yes (POST) | application/json |
| X-Request-ID | No | UUID v7; generated if absent |

Lifecycle steps

1. Request ID assigned

Middleware assigns or validates X-Request-ID (UUID v7) and injects it into the request context for logs, metrics, and gRPC metadata propagation.
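
The assign-or-validate behavior can be sketched roughly as follows. This is an illustration only, not the proxy's middleware: `new_uuid7` and `ensure_request_id` are hypothetical names, the bit layout follows the UUID v7 definition in RFC 9562, and replacing a malformed client-supplied ID with a fresh one is an assumption (this page only specifies the absent case).

```python
import os
import time
import uuid
from typing import Optional


def new_uuid7() -> uuid.UUID:
    """Build a UUID v7 by hand: 48-bit Unix millisecond timestamp, then random bits."""
    ts_ms = time.time_ns() // 1_000_000
    rand = int.from_bytes(os.urandom(10), "big")  # 80 random bits
    value = ((ts_ms & 0xFFFFFFFFFFFF) << 80) | rand
    value = (value & ~(0xF << 76)) | (0x7 << 76)  # version nibble = 7
    value = (value & ~(0x3 << 62)) | (0x2 << 62)  # variant bits = 10
    return uuid.UUID(int=value)


def ensure_request_id(header_value: Optional[str]) -> str:
    """Return a UUID v7 string, reusing the client's value when it is valid."""
    if header_value:
        try:
            parsed = uuid.UUID(header_value)
            if parsed.version == 7:
                return str(parsed)
        except ValueError:
            pass  # assumption: malformed IDs are replaced, not rejected
    return str(new_uuid7())
```

Because the timestamp occupies the most significant bits, IDs generated this way sort roughly by creation time, which is convenient when correlating logs.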

2. Bearer token validated

Proxy calls auth ValidateToken over gRPC with a 50ms deadline. On success, org_id, permissions, and token_id attach to context. Missing token → 401; auth down → 503 fail-closed per ADR-0011.
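
The outcome mapping in this step could look something like the sketch below. The gRPC client is replaced by a plain callable so the example stays self-contained; `check_bearer`, the `Validator` signature, and the `"OK"` success code are hypothetical, and treating a non-Bearer Authorization header as a missing token is an assumption.

```python
from typing import Callable, Optional, Tuple

# Hypothetical validator: (token, deadline_seconds) -> claims dict, or None if invalid.
# It is expected to raise TimeoutError or ConnectionError when auth is unreachable.
Validator = Callable[[str, float], Optional[dict]]

AUTH_DEADLINE_S = 0.050  # the 50ms budget described in this step


def check_bearer(authorization: Optional[str],
                 validate: Validator) -> Tuple[int, str, Optional[dict]]:
    """Map the token validation outcome to (http_status, code, claims)."""
    if not authorization or not authorization.startswith("Bearer "):
        return 401, "MISSING_TOKEN", None
    token = authorization[len("Bearer "):].strip()
    if not token:
        return 401, "MISSING_TOKEN", None
    try:
        claims = validate(token, AUTH_DEADLINE_S)
    except (TimeoutError, ConnectionError):
        # Fail closed: an unreachable auth service never lets a request through.
        return 503, "SERVICE_DEGRADED", None
    if claims is None:
        return 401, "INVALID_TOKEN", None
    return 200, "OK", claims  # "OK" is a placeholder, not a documented code
```

The key design point is the `except` branch: a timeout is reported as 503 rather than being treated as either a valid or an invalid token.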

3. Agent identity verified

Proxy requires X-IBEX-Agent-ID and confirms the agent belongs to the org in the URL. Cross-org or unknown agent → 403 before the body is read.
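
A minimal sketch of the ownership check, assuming a hypothetical `lookup` callable that stands in for the agent verification call and returns the owning org for an agent ID. Returning 403 for a missing header is an assumption; this page only states the status for cross-org and unknown agents.

```python
from typing import Callable, Optional, Tuple

# Hypothetical lookup: agent_id -> owning org_id, or None when the agent is unknown.
AgentLookup = Callable[[str], Optional[str]]


def verify_agent(agent_id: Optional[str], path_org_id: str,
                 lookup: AgentLookup) -> Tuple[int, str]:
    """Return (http_status, code) for the agent check; runs before the body is read."""
    if not agent_id:
        # Assumption: a missing header is handled like an unknown agent.
        return 403, "INSUFFICIENT_PERMISSIONS"
    owner = lookup(agent_id)
    if owner is None or owner != path_org_id:
        return 403, "INSUFFICIENT_PERMISSIONS"
    return 200, "OK"  # "OK" is a placeholder, not a documented code
```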

4. Rate limit checked

Redis sliding-window counter keyed by org_id. Exceeded → 429 with Retry-After. Redis unavailable → fail-open with conservative local limits and audit warning.
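
To make the sliding-window idea concrete, here is an in-memory sketch keyed by org_id. It is not the Redis implementation: `SlidingWindowLimiter` is a hypothetical class, the limit and window values are illustrative, and the injected clock exists only so the behavior can be tested deterministically. Something of this shape could plausibly serve as the local fallback when Redis is unavailable, but that is an assumption.

```python
import math
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict, Tuple


class SlidingWindowLimiter:
    """Sliding-window log: at most `limit` requests per `window_s` seconds per org."""

    def __init__(self, limit: int, window_s: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        if limit < 1:
            raise ValueError("limit must be at least 1")
        self.limit = limit
        self.window_s = window_s
        self.clock = clock
        self.hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, org_id: str) -> Tuple[bool, int]:
        """Return (allowed, retry_after_seconds); retry_after is 0 when allowed."""
        now = self.clock()
        window = self.hits[org_id]
        # Drop timestamps that have slid out of the window.
        while window and now - window[0] >= self.window_s:
            window.popleft()
        if len(window) < self.limit:
            window.append(now)
            return True, 0
        # The oldest recorded hit leaves the window at window[0] + window_s.
        retry_after = math.ceil(window[0] + self.window_s - now)
        return False, max(retry_after, 1)
```

The second element of the return value maps directly onto the Retry-After header of a 429 response.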

5. Body normalized

JSON parsed and validated against the OpenAI chat schema. Malformed input → 400 with stable error envelope including request_id.
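
The sketch below shows one way the parse-and-validate step might behave. It checks only the two fields the OpenAI chat format always requires (`model` and `messages`), not the full schema. The envelope shape and the `INVALID_REQUEST` code are hypothetical; the only detail confirmed here is that the 400 response carries the request_id.

```python
import json
from typing import Any, Dict, Tuple


def normalize_chat_body(raw: bytes, request_id: str) -> Tuple[int, Dict[str, Any]]:
    """Return (200, parsed_body) on success or (400, error_envelope) on bad input."""

    def bad(message: str) -> Tuple[int, Dict[str, Any]]:
        # Hypothetical envelope; field names other than request_id are assumptions.
        return 400, {"error": {"code": "INVALID_REQUEST",
                               "message": message,
                               "request_id": request_id}}

    try:
        body = json.loads(raw)
    except ValueError:  # covers JSONDecodeError and invalid UTF-8
        return bad("body is not valid JSON")
    if not isinstance(body, dict):
        return bad("body must be a JSON object")
    if not isinstance(body.get("model"), str) or not body["model"]:
        return bad("'model' must be a non-empty string")
    messages = body.get("messages")
    if not isinstance(messages, list) or not messages:
        return bad("'messages' must be a non-empty array")
    for i, msg in enumerate(messages):
        if not isinstance(msg, dict) or not isinstance(msg.get("role"), str):
            return bad(f"messages[{i}] needs a string 'role'")
    return 200, body
```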

6. Provider handoff (Phase 2+)

Context assembly, memory injection, and streaming forward to the LLM provider. Phase 1 stops here with 501 PROVIDER_NOT_CONFIGURED.

Sequence diagram

Mermaid diagram: sequenceDiagram
+--------+                                    +-------+                           +-------+              +------+  +--------------+   
| Client |                                    | Proxy |                           | Redis |              | Auth |  | LLM Provider |   
+--------+                                    +-------+                           +-------+              +------+  +--------------+   
     |                                            |                                   |                      |             |          
     |  POST /v1/orgs/{org_id}/chat/completions   |                                   |                      |             |          
     |-------------------------------------------->                                   |                      |             |          
     |                                            |                                   |                      |             |          
     |                                  +-------------------+                         |                      |             |          
     |                                  | Assign request_id |                         |                      |             |          
     |                                  +-------------------+                         |                      |             |          
     |                                            |                                   |                      |             |          
     |                                            |            gRPC ValidateToken (50ms budget)              |             |          
     |                                            |---------------------------------------------------------->             |          
     |                                            |                                   |                      |             |          
     |                                            |              org_id, permissions, token_id               |             |          
     |                                            <..........................................................|             |          
     |                                            |                                   |                      |             |          
     |                                            |          gRPC ValidateAgent (agent_id, org_id)           |             |          
     |                                            |---------------------------------------------------------->             |          
     |                                            |                                   |                      |             |          
     |                                            |            agent record or PERMISSION_DENIED             |             |          
     |                                            <..........................................................|             |          
     |                                            |                                   |                      |             |          
     |                                            |  INCR ratelimit:{org_id}:minute   |                      |             |          
     |                                            |----------------------------------->                      |             |          
     |                                            |                                   |                      |             |          
     |                                            |         allowed / denied          |                      |             |          
     |                                            <...................................|                      |             |          
     |                                            |                                   |                      |             |          
     |                                 +---------------------+                        |                      |             |          
     |                                 | Normalize JSON body |                        |                      |             |          
     |                                 +---------------------+                        |                      |             |          
     |                                            |                                   |                      |             |          
 +alt [Phase 1]----------------------------------------------------------------------------------------------------------------+      
 |   |                                            |                                   |                      |             |   |      
 |   |        501 PROVIDER_NOT_CONFIGURED         |                                   |                      |             |   |      
 |   <............................................|                                   |                      |             |   |      
 |   |                                            |                                   |                      |             |   |      
 +[Phase 2 target]-------------------------------------------------------------------------------------------------------------+      
 |   |                                            |                                   |                      |             |   |      
 |   |                                            +---+                               |                      |             |   |      
 |   |                                            |   | Context retrieval (40ms deadline)                    |             |   |      
 |   |                                            <---+                               |                      |             |   |      
 |   |                                            |                                   |                      |             |   |      
 |   |                                            |                       Forward augmented request          |             |   |      
 |   |                                            |------------------------------------------------------------------------>   |      
 |   |                                            |                                   |                      |             |   |      
 |   |                                            |                             Stream tokens                |             |   |      
 |   |                                            <........................................................................|   |      
 |   |                                            |                                   |                      |             |   |      
 |   |              Stream response               |                                   |                      |             |   |      
 |   <............................................|                                   |                      |             |   |      
 |   |                                            |                                   |                      |             |   |      
 |   |                      +-------------------------------------------+             |                      |             |   |      
 |   |                      | Async: trace → ClickHouse, extraction job |             |                      |             |   |      
 |   |                      +-------------------------------------------+             |                      |             |   |      
 |   |                                            |                                   |                      |             |   |      
 +-----------------------------------------------------------------------------------------------------------------------------+      
     |                                            |                                   |                      |             |          
+--------+                                    +-------+                           +-------+              +------+  +--------------+   
| Client |                                    | Proxy |                           | Redis |              | Auth |  | LLM Provider |   
+--------+                                    +-------+                           +-------+              +------+  +--------------+   

Dashed Phase 2 steps (context retrieval, provider streaming, async jobs) are specified in engineering docs but not executed in the current release.

Phase 1 probe

bash
curl -s -w "\nHTTP %{http_code}\n" \
  -X POST "http://localhost:8080/v1/orgs/${IBEX_DEV_ORG_ID}/chat/completions" \
  -H "Authorization: Bearer ${IBEX_DEV_TOKEN}" \
  -H "X-IBEX-Agent-ID: ${IBEX_DEV_AGENT_ID}" \
  -H "Content-Type: application/json" \
  -d '{"model":"gpt-4o","messages":[{"role":"user","content":"ping"}]}'

Expected: HTTP 501 with PROVIDER_NOT_CONFIGURED. This confirms that token validation, agent verification, rate limiting, and body normalization all succeeded.

Error mapping

| Condition | HTTP | Code |
| --- | --- | --- |
| Missing Authorization | 401 | MISSING_TOKEN |
| Invalid or revoked PAT | 401 | INVALID_TOKEN |
| Agent not in org / path mismatch | 403 | INSUFFICIENT_PERMISSIONS |
| Rate limit exceeded | 429 | RATE_LIMIT_EXCEEDED |
| Auth gRPC timeout or unavailable | 503 | SERVICE_DEGRADED |
| No provider configured | 501 | PROVIDER_NOT_CONFIGURED |

Full envelope: API errors.
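
When scripting the Phase 1 probe, the mapping above can be encoded as a small lookup so a failing response points at the lifecycle step that rejected it. This is a hypothetical client-side helper, not part of the proxy; the step labels are inferred from the lifecycle described on this page.

```python
from typing import Dict, Tuple

# code -> (HTTP status, lifecycle step that produces it), taken from the error mapping.
ERROR_MAP: Dict[str, Tuple[int, str]] = {
    "MISSING_TOKEN": (401, "bearer token validation"),
    "INVALID_TOKEN": (401, "bearer token validation"),
    "SERVICE_DEGRADED": (503, "bearer token validation"),
    "INSUFFICIENT_PERMISSIONS": (403, "agent identity verification"),
    "RATE_LIMIT_EXCEEDED": (429, "rate limit check"),
    "PROVIDER_NOT_CONFIGURED": (501, "provider handoff"),
}


def diagnose(status: int, code: str) -> str:
    """Describe where in the lifecycle a probe response was produced."""
    expected = ERROR_MAP.get(code)
    if expected is None:
        return f"unknown code {code!r} (HTTP {status})"
    expected_status, step = expected
    if status != expected_status:
        return f"{code} normally maps to HTTP {expected_status}, got {status}"
    if code == "PROVIDER_NOT_CONFIGURED":
        return "all Phase 1 checks passed; no provider is configured"
    return f"request stopped at: {step}"
```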

Target path (Phase 2+)

Once provider adapters and context assembly ship, the synchronous path extends:

  1. Parallel context retrieval (40ms deadline) — directive from Redis, hot memories, recent session history
  2. Context assembly gRPC — rank and pack memories within model token budget
  3. Provider forward — augment messages, stream response to client while accumulating for async extraction
  4. Async side effects — ClickHouse trace, memory extraction job, session heartbeat update
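
The parallel retrieval in item 1 could be sketched with asyncio as below. This is a rough illustration under stated assumptions: `gather_context` and the source names are hypothetical, and dropping sources that miss the deadline or raise (rather than failing the whole request) is an assumption about the intended behavior.

```python
import asyncio
from typing import Any, Callable, Coroutine, Dict

Fetcher = Callable[[], Coroutine[Any, Any, Any]]


async def gather_context(sources: Dict[str, Fetcher],
                         deadline_s: float = 0.040) -> Dict[str, Any]:
    """Run all sources concurrently and keep whatever finishes within the deadline."""
    tasks = {asyncio.create_task(fetch()): name for name, fetch in sources.items()}
    if not tasks:
        return {}
    done, pending = await asyncio.wait(tasks.keys(), timeout=deadline_s)
    for task in pending:
        task.cancel()  # assumption: late sources are dropped, not awaited
    # Let cancelled tasks finish unwinding before returning.
    await asyncio.gather(*pending, return_exceptions=True)
    results: Dict[str, Any] = {}
    for task in done:
        if task.exception() is None:  # assumption: failed sources are skipped
            results[tasks[task]] = task.result()
    return results
```

A caller would pass one fetcher per source, for example `{"directive": ..., "hot_memories": ..., "session_history": ...}`, and hand the returned dictionary to the context assembly step.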

Auth validation in Phase 2 may add an optional bloom filter + LRU cache (ADR-0011 deferral record); Phase 1 always calls gRPC.

Architecture decisions

| Topic | ADR |
| --- | --- |
| Proxy → auth gRPC client | ADR-0011 |
| Token validation contract | ADR-0007 |
| Permission bitmap | ADR-0009 |
| Rate limit skeleton | ADR-0015 |
| Agent identity verification | ADR-0016 |
| Request ID propagation | ADR-0017 |

Related

  • Proxy overview
  • Proxy authentication
  • Auth overview
  • Glossary
