ADR-0016: Proxy agent identity verification (Phase 1)
Architecture decision record 0016.
- Status: Accepted
- Date: 2026-06-06
- Authors: IBEX Harness team
Context
Milestone 1.2.3 extracts X-IBEX-Agent-ID as a parseable UUID but does not verify that the agent belongs to the authenticated organization. An Org A token combined with Org B's agent UUID creates a cross-tenant confusion risk. Auth service M1.1.7 already implements ValidateAgent gRPC; the proxy must call it on every protected request.
Decision
1) Enforcement point
Agent ownership is verified in the proxy middleware, not in downstream memory/context services. Rationale:
- Single enforcement point before any handler runs
- The gRPC connection to auth is already established for token validation, so agent verification adds one call on that connection rather than a new connection
- Downstream services receive a verified AgentRecord in context
2) gRPC to auth, not direct DB
Proxy calls AuthService.ValidateAgent via the existing *grpc.ClientConn. Proxy must not query Postgres for agents.
3) Bearer forwarding
ValidateAgent requires caller authentication (auth unary interceptor). Proxy re-parses the Authorization header and forwards authorization: Bearer <token> as outgoing gRPC metadata. Tokens are never logged.
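The bearer re-parse step can be sketched as follows. This is a minimal illustration, not the proxy's actual code: `bearerToken` and `outgoingAuthMetadata` are hypothetical helper names, and the case-insensitive scheme match is an assumption. With grpc-go, the returned key/value pair would typically be attached with `metadata.AppendToOutgoingContext(ctx, key, value)`; the sketch stays stdlib-only so it runs on its own.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// errNoBearer is returned when the Authorization header is absent or malformed.
var errNoBearer = errors.New("missing or malformed bearer token")

// bearerToken (hypothetical helper) re-parses an Authorization header value
// and returns the raw token. Matching the scheme case-insensitively is an
// assumption of this sketch.
func bearerToken(authHeader string) (string, error) {
	scheme, token, ok := strings.Cut(authHeader, " ")
	if !ok || !strings.EqualFold(scheme, "Bearer") {
		return "", errNoBearer
	}
	token = strings.TrimSpace(token)
	if token == "" {
		return "", errNoBearer
	}
	return token, nil
}

// outgoingAuthMetadata (hypothetical helper) builds the key/value pair that
// would be forwarded as outgoing gRPC metadata on the ValidateAgent call.
func outgoingAuthMetadata(token string) (key, value string) {
	return "authorization", "Bearer " + token
}

func main() {
	tok, err := bearerToken("Bearer abc.def.ghi")
	if err != nil {
		fmt.Println("rejected:", err)
		return
	}
	k, v := outgoingAuthMetadata(tok)
	// Only the key and the value length are printed; the token is never logged.
	fmt.Printf("metadata key=%s value_len=%d\n", k, len(v))
}
```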
4) org_id source
org_id passed to ValidateAgent comes from auth.FromContext (token claims), never from the request body, URL, or X-IBEX-Agent-ID header alone.
5) Required header
X-IBEX-Agent-ID is required on all protected Phase-1 routes:
- GET /v1/internal/auth-probe
- GET /v1/orgs/{org_id}/auth-probe
- POST /v1/chat/completions
6) Anti-enumeration
Auth returns PERMISSION_DENIED (not NOT_FOUND) for cross-org and missing agents. Proxy maps to 403 AGENT_NOT_AUTHORIZED — never 404.
Inactive agents (paused, suspended, archived) return PERMISSION_DENIED with message "agent is not active". Proxy maps to 403 AGENT_SUSPENDED. Other PERMISSION_DENIED cases map to AGENT_NOT_AUTHORIZED.
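One way the proxy could translate a ValidateAgent outcome into an HTTP response is sketched below. The `grpcCode` constants are local stand-ins that mirror the numeric values in `google.golang.org/grpc/codes`, so the example runs without the gRPC module; `mapValidateAgentError` and `httpError` are hypothetical names. Two assumptions are worth flagging: the inactive message is compared by exact equality, and any status other than OK or PERMISSION_DENIED is treated as fail-closed 503.

```go
package main

import "fmt"

// grpcCode is a local stand-in for codes.Code from grpc-go.
type grpcCode int

const (
	codeOK               grpcCode = 0
	codeDeadlineExceeded grpcCode = 4
	codePermissionDenied grpcCode = 7
	codeUnavailable      grpcCode = 14
)

// httpError is the status and error code the proxy would write.
type httpError struct {
	Status int
	Code   string
}

// inactiveMessage is the auth gRPC message the AGENT_SUSPENDED mapping relies on.
const inactiveMessage = "agent is not active"

// mapValidateAgentError (hypothetical) returns the HTTP error for a
// ValidateAgent result, or false when the call succeeded.
func mapValidateAgentError(code grpcCode, msg string) (httpError, bool) {
	switch {
	case code == codeOK:
		return httpError{}, false
	case code == codePermissionDenied && msg == inactiveMessage:
		return httpError{Status: 403, Code: "AGENT_SUSPENDED"}, true
	case code == codePermissionDenied:
		// Cross-org and missing agents are indistinguishable: 403, never 404.
		return httpError{Status: 403, Code: "AGENT_NOT_AUTHORIZED"}, true
	default:
		// Timeouts, transport errors, and (by assumption) any other status
		// fail closed.
		return httpError{Status: 503, Code: "AUTH_UNAVAILABLE"}, true
	}
}

func main() {
	cases := []struct {
		code grpcCode
		msg  string
	}{
		{codePermissionDenied, "agent is not active"},
		{codePermissionDenied, "permission denied"},
		{codeDeadlineExceeded, "context deadline exceeded"},
		{codeUnavailable, "connection refused"},
	}
	for _, c := range cases {
		e, failed := mapValidateAgentError(c.code, c.msg)
		fmt.Println(c.code, failed, e.Status, e.Code)
	}
}
```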
7) Fail-closed vs fail-open
| Control | Failure behavior | HTTP | code |
|---|---|---|---|
| Token validate (M1.2.1) | Fail closed | 503 | SERVICE_DEGRADED (ADR-0011; unchanged) |
| Agent verify (M1.2.5) | Fail closed | 503 | AUTH_UNAVAILABLE |
| Rate limit (M1.2.4) | Fail open | — | — |
Agent identity is a security control; failing open during auth downtime would allow cross-tenant agent confusion.
8) HTTP mapping (agent middleware)
| Condition | HTTP | code |
|---|---|---|
| Header absent | 400 | MISSING_AGENT_ID |
| Malformed UUID | 400 | VALIDATION_ERROR + field_errors |
| Cross-org / not found | 403 | AGENT_NOT_AUTHORIZED |
| Inactive agent | 403 | AGENT_SUSPENDED |
| gRPC timeout / transport error | 503 | AUTH_UNAVAILABLE |
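The first two rows of the table are local checks that need no call to auth. A minimal sketch, assuming an empty or whitespace-only header value counts as absent and that the canonical 8-4-4-4-12 hex form is what "parseable UUID" means here; `checkAgentHeader` is a hypothetical name, and the field_errors payload is omitted because its shape is not specified in this ADR.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// uuidRe matches the canonical hyphenated UUID text form (assumed format).
var uuidRe = regexp.MustCompile(
	`^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$`)

// checkAgentHeader (hypothetical) validates the X-IBEX-Agent-ID value.
// It returns status 0 and an empty code when the header is acceptable.
func checkAgentHeader(value string) (status int, code string) {
	v := strings.TrimSpace(value)
	switch {
	case v == "":
		return 400, "MISSING_AGENT_ID"
	case !uuidRe.MatchString(v):
		return 400, "VALIDATION_ERROR"
	default:
		return 0, ""
	}
}

func main() {
	for _, v := range []string{"", "not-a-uuid", "3f2b8c1e-9a4d-4e6f-8b2a-1c5d7e9f0a1b"} {
		status, code := checkAgentHeader(v)
		fmt.Printf("%q -> %d %s\n", v, status, code)
	}
}
```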
9) Middleware order
Amends ADR-0013 §8 and ADR-0015 §6:
POST /v1/chat/completions:
bodyLimit → contentType → auth → agentVerify → rateLimit → handler
GET /v1/internal/auth-probe:
auth → agentVerify → rateLimit → handler
GET /v1/orgs/{org_id}/auth-probe:
pathOrgUUID → auth → agentVerify → rateLimit → handler
10) Context
The verified agent is stored in the request context as AgentRecord{ID, OrgID, Status} and read back via AgentFromContext. The chat handler no longer validates the agent header; the middleware owns that check.
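The middleware order for POST /v1/chat/completions and the context hand-off can be sketched together. Everything here is illustrative: the `chain`, `trace`, and `agentVerifyStub` helpers are hypothetical, the string field types on `AgentRecord` are an assumption, and the stub stores a fixed record instead of calling ValidateAgent. The point is only that the handler sees the record after the middlewares have run in the stated order.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// AgentRecord mirrors the fields named in this ADR; field types are assumed.
type AgentRecord struct {
	ID, OrgID, Status string
}

type agentKey struct{}

// AgentFromContext returns the verified agent stored by the middleware.
func AgentFromContext(ctx context.Context) (AgentRecord, bool) {
	a, ok := ctx.Value(agentKey{}).(AgentRecord)
	return a, ok
}

type middleware func(http.Handler) http.Handler

// chain wraps h so that the first middleware listed runs first.
func chain(h http.Handler, mws ...middleware) http.Handler {
	for i := len(mws) - 1; i >= 0; i-- {
		h = mws[i](h)
	}
	return h
}

// trace is a stand-in middleware that only records its name.
func trace(name string, order *[]string) middleware {
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			*order = append(*order, name)
			next.ServeHTTP(w, r)
		})
	}
}

// agentVerifyStub stands in for the real check: it stores a fixed record
// (OrgID would come from token claims) rather than calling ValidateAgent.
func agentVerifyStub(order *[]string) middleware {
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			*order = append(*order, "agentVerify")
			rec := AgentRecord{ID: r.Header.Get("X-IBEX-Agent-ID"), OrgID: "org-from-token", Status: "active"}
			ctx := context.WithValue(r.Context(), agentKey{}, rec)
			next.ServeHTTP(w, r.WithContext(ctx))
		})
	}
}

// run sends one request through the chain and reports what the handler saw.
func run() ([]string, AgentRecord, bool) {
	var order []string
	var seen AgentRecord
	var ok bool
	handler := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		order = append(order, "handler")
		seen, ok = AgentFromContext(r.Context())
	})
	h := chain(handler,
		trace("bodyLimit", &order), trace("contentType", &order),
		trace("auth", &order), agentVerifyStub(&order), trace("rateLimit", &order))
	req := httptest.NewRequest(http.MethodPost, "/v1/chat/completions", nil)
	req.Header.Set("X-IBEX-Agent-ID", "3f2b8c1e-9a4d-4e6f-8b2a-1c5d7e9f0a1b")
	h.ServeHTTP(httptest.NewRecorder(), req)
	return order, seen, ok
}

func main() {
	order, rec, ok := run()
	fmt.Println(order, rec.OrgID, ok)
}
```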
11) Phase 2 caching (deferred)
Same pattern as token validation: bloom filter → LRU → gRPC fallback (milestone 2.2.1).
12) Configuration
Reuses IBEX_AUTH_VALIDATE_TIMEOUT (default 50ms) for ValidateAgent per-call deadline on the shared auth gRPC client.
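A sketch of how the per-call deadline might be derived. It assumes the variable holds a Go duration string such as `50ms`, which this ADR does not state, and that unset, unparseable, or non-positive values fall back to the default; `validateTimeout` is a hypothetical helper.

```go
package main

import (
	"context"
	"fmt"
	"os"
	"time"
)

// defaultValidateTimeout is the 50ms default named in this ADR.
const defaultValidateTimeout = 50 * time.Millisecond

// validateTimeout (hypothetical) parses the raw setting, falling back to the
// default when it is empty, unparseable, or not positive.
func validateTimeout(raw string) time.Duration {
	d, err := time.ParseDuration(raw)
	if err != nil || d <= 0 {
		return defaultValidateTimeout
	}
	return d
}

func main() {
	timeout := validateTimeout(os.Getenv("IBEX_AUTH_VALIDATE_TIMEOUT"))

	// Each ValidateAgent call would get its own deadline derived from the
	// request context; context.Background() stands in for it here.
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()

	_, hasDeadline := ctx.Deadline()
	fmt.Println("timeout:", timeout, "deadline set:", hasDeadline)
}
```

When the deadline expires, the call fails with a deadline-exceeded status, which falls under the "gRPC timeout / transport error" row and yields 503 AUTH_UNAVAILABLE.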
Consequences
Positive
- Closes cross-tenant agent confusion attack
- Consistent with multi-tenant security rules (403 not 404)
- Reuses auth service as source of truth
Negative
- Second gRPC call per protected request (mitigated by shared connection; Phase 2 cache planned)
- AGENT_SUSPENDED mapping relies on stable auth gRPC message "agent is not active" until richer error details exist