Phase 2: Single Provider E2E
Phase 2: Single Provider E2E — planned. Estimated 3–5 weeks. Depends on phase 1 exit criteria.
Phase 2: Single Provider E2E
Status: Planned
Estimated duration: 3–5 weeks
Depends on: Phase 1 exit criteria
Current milestone: 2.3.1 Directive migrations (after Phase 1 completes)
Purpose
Phase 2 transforms IBEX Harness from a security-complete authentication and rate limiting layer (Phase 1) into a working AI proxy — one that receives a real LLM request, optionally enriches it with an agent directive, forwards it to OpenAI, streams the response back to the caller, and emits an async trace to ClickHouse for analytics.
At the end of Phase 2, a customer can point their OpenAI SDK at the IBEX proxy endpoint, configure an agent directive, and every request will be transparently enriched, authenticated, rate-limited, traced, and forwarded. This is the first deployable product milestone.
Milestones
| ID | Milestone | Status |
|---|---|---|
| 2.1.1 | Provider interface and registry | Planned |
| 2.1.2 | OpenAI non-streaming client | Planned |
| 2.1.3 | OpenAI streaming forwarder | Planned |
| 2.1.4 | Provider routing middleware | Planned |
| 2.1.5 | Provider error mapping | Planned |
| 2.2.1 | Auth cache bloom + LRU | Planned |
| 2.2.2 | Token revocation propagation | Planned |
| 2.3.1 | Directive migrations | Planned |
| 2.3.2 | Directive resolver | Planned |
| 2.3.3 | System prompt injection | Planned |
| 2.4.1 | Sessions and checkpoints migrations | Planned |
| 2.4.2 | Session store | Planned |
| 2.4.3 | Proxy session lifecycle | Planned |
| 2.5.1 | ClickHouse schema | Planned |
| 2.5.2 | ClickHouse client | Planned |
| 2.5.3 | Async trace emitter | Planned |
| 2.6.1 | Latency benchmark | Planned |
| 2.6.2 | Phase 2 exit gate | Planned |
What Phase 2 is not
Phase 2 intentionally excludes:
- Memory injection — agents won't have persistent memory yet (Phase 3). Directives are static.
- Multi-provider routing — only OpenAI in Phase 2 (Phase 4 adds Anthropic, Bedrock, etc.)
- Dashboard or API server — operator UI is Phase 3
- Embedding service — not needed without memory
- Hierarchical rate limiting — Lua-script atomic rate limiter with per-agent limits (Phase 4)
- Billing integration — token counting is tracked but not charged (Phase 3)
Critical path
Client SDK
│
▼ POST /v1/chat/completions
┌─────────────────────────────────────────────────────────┐
│ IBEX Proxy │
│ RequestID → Auth (LRU cache) → AgentVerify → RateLimit │
│ → DirectiveResolver → PromptInjector │
│ → OpenAI HTTP Client │
│ → SSE Stream Forward (dual-write) │
│ → [async] Session checkpoint + Trace emit │
└─────────────────────────────────────────────────────────┘
│
▼ SSE stream
Client SDKLatency budget at Phase 2:
| Stage | Budget |
|---|---|
| Auth (LRU cache hit) | <1ms |
| Auth (gRPC fallback on miss) | <50ms |
| Agent verification (LRU cache hit) | <1ms |
| Rate limit (Redis INCR) | <5ms |
| Directive resolve (Redis cache hit) | <2ms |
| Prompt injection | <0.5ms |
| Total proxy overhead (non-provider) | <20ms (p99 target) |
| OpenAI TTFB | varies (not in our control) |
Entry criteria
- Phase 1 complete (including M1.5.1 security gate)
- Phase 1.5 docs site launched at
docs.ibexharness.com(docs-first sequencing) - Local compose stack healthy (Postgres, Redis; ClickHouse added in 2.5.1)
Exit criteria
Phase 2 is complete when ALL of the following are true:
POST /v1/chat/completionswith a valid PAT and agent directive returns a real OpenAI completion- Streaming mode: first bytes arrive at client within 100ms of provider TTFB
- Proxy overhead (non-provider time) is <20ms at p99 under 100 concurrent requests
- Directive is correctly prepended to the system message in every request
- Every request creates or updates a session checkpoint in Postgres
- Every completed request emits a trace to ClickHouse within 500ms of response completion
- Token revocation is reflected in the auth cache within 5 seconds
- All Phase 1 security tests (milestone 1.5.1) still pass with no regressions
make e2e-smoke(Phase 2 smoke test with OpenAI sandbox key) exits 0
Execution order
2.3.1 (directive migrations)
→ 2.4.1 (session migrations)
→ 2.5.1 (ClickHouse schema)
[these three can run in parallel — all are schema/infra]
2.2.1 (auth LRU cache) → 2.2.2 (revocation propagation)
2.3.2 (directive resolver) → 2.3.3 (prompt injection)
2.1.1 (provider interface)
→ 2.1.2 (OpenAI non-streaming)
→ 2.1.3 (OpenAI streaming)
→ 2.1.4 (provider routing)
→ 2.1.5 (provider error mapping)
2.4.2 (session store) → 2.4.3 (proxy session lifecycle)
2.5.2 (ClickHouse client) → 2.5.3 (async trace emitter)
[all above merged]
2.6.1 (latency benchmark) → 2.6.2 (Phase 2 exit gate)Documents
- goals.md — Goals 2.1–2.6
- milestones/ — PR-sized work units
- decisions.md — Phase-local decision log
- risks.md — Risks and mitigations
Goal overview
| Goal | Focus |
|---|---|
| 2.1 | Provider abstraction and OpenAI forwarding |
| 2.2 | Auth LRU + bloom cache |
| 2.3 | Directive resolve and inject |
| 2.4 | Sessions and checkpoints |
| 2.5 | ClickHouse async traces |
| 2.6 | Latency benchmark and exit gate |
Next phase
When exit criteria are met, begin Phase 3: Memory Engine and Operator Platform.
Last updated on