Overview

IBEX Harness is a distributed platform for persistent agent memory, intelligent context assembly, and behavioral consistency. Agent applications call a low-latency LLM proxy; the proxy authenticates every request through the auth service and (in later phases) injects memory and directives before forwarding to provider APIs.

System diagram

+-------------------------------------------------------------------------------------------------+
|  Agent applications    |   Planned (Phase 2) / Infrastructure              |                    |
|                        |                                                   |                    |
|                        |                                                   |                    |
| +--------------------+ |   +--------------------+     .------------------. |                    |
| |                    | |   |                    |     |                  | |                    |
| |                    | |   |                    |     |                  | |                    |
| |     SDKs / CLI     | |   | Background Workers |..+  |      MinIO       | |                    |
| |                    | |   |      (Celery)      |  :  |                  | |                    |
| |                    | |   |                    |  :  |                  | |                    |
| |                    | |   |                    |  :  |                  | |                    |
| +--------------------+ |   +--------------------+  :  '------------------' |                    |
|            |           |              :            :                       |                    |
|------------|-----------+              :            :                       |                    |
|            |                          :            :                       |                    |
|            |                          :            :                       |                    |
|            |                          :            +.....................+ |                    |
|            |                          :                                  : |                    |
|            |                          :                                  : |                    |
|            |                          :                                  : |                    |
|            v                          :                                  : |                    |
| +--------------------+                :                                  : |                    |
| |                    |                :                                  : |                    |
| |                    |                :                                  : |                    |
| |     LLM Proxy      |.....+----------+                                  +..+                   |
| |        (Go)        |     :          :                                    |:                   |
| |                    |     :          :                                    |:                   |
| +--------------------+     +..........:.........................+...........+..........+        |
|            |                          :                         :          |           :        |
|            |                          :                         :          |           :        |
|   gRPC ValidateToken                  :                      Phase 2       |     async traces   |
|            |                          :                         :          |           :        |
|            |            +.............+                         :          |           :        |
|            |            :             |                         :          |           :        |
|            |            :             |                         :          |           :        |
|            |            :             |                         :          |           :        |
|            |            :             |                         :          |           :        |
|            v            :             v                         v          |           v        |
| +--------------------+  :  .--------------------.     +------------------+ |   .--------------. |
| |                    |  :  |                    |     |                  | |   |              | |
| |                    |  :  |                    |     |                  | |   |              | |
| |    Auth Service    |  :  |       Redis        |     | Context Assembly | |   |  ClickHouse  | |
| |        (Go)        |  :  |                    |     |     (Python)     | |   |              | |
| |                    |  :  |                    |     |                  | |   |              | |
| |                    |  :  |                    |     |                  | |   |              | |
| +--------------------+  :  '--------------------'     +------------------+ |   '--------------' |
|            |            :                                       :          |                    |
|            |            :                                       :          |                    |
|            |            +.............+.........................+          |                    |
|            |                          :                                    |                    |
|            |                          :                                    |                    |
|            v                          v                                    |                    |
| .--------------------.     +--------------------+                          |                    |
| |                    |     |                    |                          |                    |
| |                    |     |                    |                          |                    |
| |     PostgreSQL     |<....|   Memory Service   |                          |                    |
| |                    |     |      (Python)      |                          |                    |
| |                    |     |                    |                          |                    |
| |                    |     |                    |                          |                    |
| '--------------------'     +--------------------+                          |                    |
|                                       :                                    |                    |
+---------------------------------------:---------------------------------------------------------+
|                                       :                                    |                     
|                                       :                                    |                     
|                                       :                                    |                     
| +--------------------+                :                                    |                     
| |                    |                :                                    |                     
| |                    |                :                                    |                     
| | Embedding Service  |<...............+                                    |                     
| |      (Python)      |                                                     |                     
| |                    |                                                     |                     
| +--------------------+                                                     |                     
|                                                                            |                     
+----------------------------------------------------------------------------+

The critical path is every LLM request: authenticate, enforce limits, assemble context, call the provider, stream the response. Target proxy overhead is under 20ms (p99) excluding provider latency. Memory extraction, drift detection, and analytics run asynchronously and must never block the agent's inference call.

Design principles

Performance first

The proxy and context assembly pipeline are optimized for millisecond budgets. Auth validation has a 50ms gRPC deadline; in Phase 2, context retrieval queries its sources in parallel under a 40ms deadline.
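
A minimal sketch of how a deadline like the 50ms auth budget might be enforced with Go's context package. The `validateToken` function is a hypothetical stand-in that simulates the gRPC call; it is not the actual client, and `validateWithDeadline` is an illustrative helper name.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// authDeadline mirrors the 50ms ValidateToken deadline stated above.
const authDeadline = 50 * time.Millisecond

// validateToken is a hypothetical stand-in for the gRPC client call.
// It simulates a backend that needs `latency` to answer and gives up
// as soon as the context deadline expires.
func validateToken(ctx context.Context, latency time.Duration) error {
	select {
	case <-time.After(latency):
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// validateWithDeadline bounds the call with authDeadline and reports
// whether it succeeded and whether it timed out.
func validateWithDeadline(latencyMs int) (bool, bool) {
	ctx, cancel := context.WithTimeout(context.Background(), authDeadline)
	defer cancel()
	err := validateToken(ctx, time.Duration(latencyMs)*time.Millisecond)
	return err == nil, errors.Is(err, context.DeadlineExceeded)
}

func main() {
	ok, timedOut := validateWithDeadline(5)
	fmt.Println("fast backend:", ok, timedOut)
	ok, timedOut = validateWithDeadline(500)
	fmt.Println("slow backend:", ok, timedOut)
}
```

The same pattern would apply to the Phase 2 retrieval deadline: one context with a 40ms timeout shared by every parallel lookup, so the slowest source cannot extend the budget.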

Security by default

org_id comes from the verified token, never the request body. Postgres RLS, Redis key namespacing, and permission bitmaps enforce isolation at every layer. Cross-tenant misses return 403, not 404.
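
The isolation rules above can be sketched roughly as follows. The `Claims` and `Resource` types, the key format in `namespacedKey`, and the in-memory map are all assumptions made for illustration. The source also does not say how a genuinely nonexistent ID is handled; this sketch returns 404 for that case, which is a guess.

```go
package main

import (
	"fmt"
	"net/http"
)

// Claims holds fields taken from a verified token (hypothetical shape).
type Claims struct {
	OrgID string
}

// Resource is a hypothetical stored object tagged with its owning org.
type Resource struct {
	ID    string
	OrgID string
}

// namespacedKey prefixes a Redis key with the caller's org so two tenants
// can never address the same key. The "org:<id>:<suffix>" format is assumed.
func namespacedKey(orgID, suffix string) string {
	return "org:" + orgID + ":" + suffix
}

// accessStatus returns the HTTP status for a lookup. The org always comes
// from the verified claims; nothing from the request body is consulted.
func accessStatus(claims Claims, store map[string]Resource, id string) int {
	res, found := store[id]
	if !found {
		return http.StatusNotFound // assumption: not specified in the source
	}
	if res.OrgID != claims.OrgID {
		return http.StatusForbidden // cross-tenant miss: 403, not 404
	}
	return http.StatusOK
}

func main() {
	store := map[string]Resource{"mem-1": {ID: "mem-1", OrgID: "org_a"}}
	caller := Claims{OrgID: "org_b"}
	fmt.Println(accessStatus(caller, store, "mem-1"))
	fmt.Println(namespacedKey(caller.OrgID, "ratelimit"))
}
```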

Fail gracefully

Auth unreachable → fail closed (503). Context assembly timeout → directive-only context. Memory slow → hot-cache only. Rate limit Redis down → conservative fail-open with audit.
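
One way to keep these rules in a single place is a small lookup from the failing dependency to its fallback. This is a sketch only: the `Dependency` and `Fallback` types and the mode strings are hypothetical names, and the default branch (fail closed for anything unlisted) is an assumption rather than documented behavior.

```go
package main

import "fmt"

// Dependency identifies which downstream component failed.
type Dependency int

const (
	Auth Dependency = iota
	ContextAssembly
	Memory
	RateLimitRedis
)

// Fallback describes how the proxy degrades for a given failure.
type Fallback struct {
	Proceed bool   // whether the request continues
	Status  int    // HTTP status when the request stops, 0 otherwise
	Mode    string // short label for the degraded mode
	Audit   bool   // whether the decision is written to the audit log
}

// fallbackFor encodes the four rules listed above.
func fallbackFor(dep Dependency) Fallback {
	switch dep {
	case Auth:
		return Fallback{Proceed: false, Status: 503, Mode: "fail-closed"}
	case ContextAssembly:
		return Fallback{Proceed: true, Mode: "directive-only"}
	case Memory:
		return Fallback{Proceed: true, Mode: "hot-cache-only"}
	case RateLimitRedis:
		return Fallback{Proceed: true, Mode: "conservative-fail-open", Audit: true}
	default:
		// Assumption: unknown failures are treated like auth failures.
		return Fallback{Proceed: false, Status: 503, Mode: "fail-closed"}
	}
}

func main() {
	for _, dep := range []Dependency{Auth, ContextAssembly, Memory, RateLimitRedis} {
		fmt.Printf("%+v\n", fallbackFor(dep))
	}
}
```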

Observable everything

Structured JSON logs with request_id, Prometheus metrics on bounded labels, and OpenTelemetry traces across HTTP, gRPC, Redis, and database boundaries.
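
As an illustration of the logging half of this principle, the sketch below emits one JSON record that carries `request_id`, using Go's standard `log/slog` package (Go 1.21 or later). The message text and the `status` field are made up for the example; the source does not specify a log schema beyond `request_id`.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"log/slog"
)

// logLine emits one structured JSON log record carrying request_id and
// returns the raw line. Message and extra fields are illustrative.
func logLine(requestID string) string {
	var buf bytes.Buffer
	logger := slog.New(slog.NewJSONHandler(&buf, nil))
	logger.Info("request handled", "request_id", requestID, "status", 200)
	return buf.String()
}

// requestIDOf parses a JSON log line and extracts its request_id field,
// the way a log pipeline might when correlating records.
func requestIDOf(line string) (string, error) {
	var rec map[string]any
	if err := json.Unmarshal([]byte(line), &rec); err != nil {
		return "", err
	}
	id, _ := rec["request_id"].(string)
	return id, nil
}

func main() {
	line := logLine("req-123")
	fmt.Print(line)
	id, err := requestIDOf(line)
	fmt.Println(id, err)
}
```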

What runs synchronously vs async

Synchronous (blocks the agent)

Token validation, agent identity check, rate limiting, context retrieval, LLM provider call, and response streaming. These steps define user-perceived latency.

Asynchronous (never blocks)

Trace emission to ClickHouse, memory extraction jobs, behavioral fingerprinting, drift alerts, billing counters, and notification delivery. Failures here degrade analytics, not inference.
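
A common way to guarantee that side work never blocks the request path is a bounded queue with a non-blocking send: when the queue is full, the event is dropped and counted instead of making the caller wait. The sketch below shows that pattern. The `Emitter` and `Trace` types are hypothetical, and the source does not say that the proxy uses this exact mechanism.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Trace is a hypothetical record destined for the analytics store.
type Trace struct {
	RequestID string
}

// Emitter buffers traces for a background writer.
type Emitter struct {
	queue   chan Trace
	dropped atomic.Int64
}

// NewEmitter creates an emitter whose queue holds up to capacity traces.
func NewEmitter(capacity int) *Emitter {
	return &Emitter{queue: make(chan Trace, capacity)}
}

// Emit enqueues a trace without ever blocking. It returns false and
// increments the drop counter when the queue is full.
func (e *Emitter) Emit(t Trace) bool {
	select {
	case e.queue <- t:
		return true
	default:
		e.dropped.Add(1)
		return false
	}
}

// Dropped reports how many traces were discarded so far.
func (e *Emitter) Dropped() int64 {
	return e.dropped.Load()
}

func main() {
	e := NewEmitter(2) // no consumer is running, so the third send is dropped
	for i := 0; i < 3; i++ {
		fmt.Println(e.Emit(Trace{RequestID: fmt.Sprintf("req-%d", i)}))
	}
	fmt.Println("dropped:", e.Dropped())
}
```

Dropping under pressure matches the stated trade-off: a lost trace degrades analytics, while a blocked send would degrade inference.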

Phase 1 today

Only the first three synchronous steps are live: validate token, verify agent, rate limit. Provider forwarding and context injection return 501 until Phase 2.
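
The Phase 1 behavior can be pictured as a short chain of checks that ends in 501. In this sketch only the 501 comes from the text above; the `Request` fields and the 401, 403, and 429 statuses for failed checks are assumptions chosen because they are the conventional HTTP codes for those conditions.

```go
package main

import (
	"fmt"
	"net/http"
)

// Request carries only what this sketch needs; the fields are hypothetical
// stand-ins for the outcome of each live check.
type Request struct {
	TokenValid bool
	AgentKnown bool
	UnderLimit bool
}

// phase1Status runs the three live steps in order and returns the status
// the proxy would answer with.
func phase1Status(r Request) int {
	if !r.TokenValid {
		return http.StatusUnauthorized // assumed status
	}
	if !r.AgentKnown {
		return http.StatusForbidden // assumed status
	}
	if !r.UnderLimit {
		return http.StatusTooManyRequests // assumed status
	}
	// Provider forwarding and context injection are not live until Phase 2.
	return http.StatusNotImplemented
}

func main() {
	fmt.Println(phase1Status(Request{TokenValid: true, AgentKnown: true, UnderLimit: true}))
	fmt.Println(phase1Status(Request{TokenValid: false}))
}
```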

Latency budgets

Operation	Budget
Auth `ValidateToken` gRPC	50ms
Redis rate limit check	5ms
Full proxy overhead (excl. LLM)	20ms p99
Context assembly (Phase 2)	50ms p95
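
For illustration, the budgets in the table could be expressed as constants and checked against observed timings. The map keys and the `overBudget` helper are hypothetical names. The sketch also simplifies: the p99 and p95 budgets describe percentiles over many requests, whereas this function compares a single observation.

```go
package main

import (
	"fmt"
	"time"
)

// budgets copies the table above. The keys are hypothetical names.
var budgets = map[string]time.Duration{
	"auth_validate_token": 50 * time.Millisecond,
	"redis_rate_limit":    5 * time.Millisecond,
	"proxy_overhead":      20 * time.Millisecond, // p99, excluding the LLM call
	"context_assembly":    50 * time.Millisecond, // p95, Phase 2
}

// overBudget reports whether an observed latency exceeds the budget for op.
// The second result is false when op has no entry in the table.
func overBudget(op string, observedMs int) (bool, bool) {
	limit, known := budgets[op]
	if !known {
		return false, false
	}
	return time.Duration(observedMs)*time.Millisecond > limit, true
}

func main() {
	over, known := overBudget("redis_rate_limit", 7)
	fmt.Println(over, known)
	over, known = overBudget("proxy_overhead", 12)
	fmt.Println(over, known)
}
```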

Services — which components are live vs planned
Request lifecycle — step-by-step proxy flow
Glossary — PAT, RLS, org_id, and other terms
