
ADR-0011: Proxy auth gRPC client and middleware

Architecture decision record 0011.

  • Status: Accepted
  • Date: 2026-06-04
  • Authors: IBEX Harness team

Context

Milestone 1.1.3 delivered auth ValidateToken (ADR-0007). The proxy skeleton (services/proxy) exposes health/metrics only. Milestone 1.2.1 connects the proxy to auth so protected routes receive org_id and permission context before LLM normalization (1.2.2).

ARCHITECTURE.md describes a future bloom filter + LRU cache pipeline. Phase 2 optional milestone 2.2.1-auth-cache-bloom owns that work; v1 uses remote validation only (SECURITY.md §15: fail closed when validation cannot complete).

Decision

1) Transport and connection

  • gRPC to ibex.auth.v1.AuthService/ValidateToken (ADR-0006)
  • Single shared *grpc.ClientConn per proxy process; dial at startup; close on shutdown
  • Development: insecure credentials; production: mTLS (documented follow-up, not in v1)

2) Timeouts

  • Default per-validate timeout: 50ms (IBEX_AUTH_VALIDATE_TIMEOUT)
  • Use context.WithTimeout derived from the HTTP request context
  • Exceeded deadline → HTTP 503 SERVICE_DEGRADED (fail closed)

3) Bearer parsing

  • Read Authorization: Bearer <token>
  • Strip the Bearer prefix and the following space; pass the PAT wire string (ibex_pat_...) as ValidateTokenRequest.access_token
  • Missing header → HTTP 401 MISSING_TOKEN
  • Invalid or revoked token → HTTP 401 INVALID_TOKEN (mapped from gRPC Unauthenticated)
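
A minimal sketch of the header parsing described above. ParseBearer and the two sentinel errors are hypothetical names. The ADR does not say how a present but malformed header (for example a Basic scheme or an empty token) is classified; this sketch assumes INVALID_TOKEN. It also matches the "Bearer " prefix exactly; whether to accept other casings is left open here.

```go
package main

import (
	"errors"
	"strings"
)

// Hypothetical sentinels; the HTTP layer would map both to 401 with the
// matching error code.
var (
	ErrMissingToken = errors.New("MISSING_TOKEN")
	ErrInvalidToken = errors.New("INVALID_TOKEN")
)

const bearerPrefix = "Bearer "

// ParseBearer extracts the PAT wire string from an Authorization header value.
// The returned string is what would be sent as ValidateTokenRequest.access_token.
func ParseBearer(header string) (string, error) {
	if header == "" {
		return "", ErrMissingToken
	}
	// Assumption: a header that is present but not a well-formed Bearer
	// credential is treated as an invalid token.
	if !strings.HasPrefix(header, bearerPrefix) {
		return "", ErrInvalidToken
	}
	token := strings.TrimPrefix(header, bearerPrefix)
	if token == "" {
		return "", ErrInvalidToken
	}
	return token, nil
}
```

In a handler the input would typically come from r.Header.Get("Authorization"), which returns an empty string when the header is absent.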

4) Request context

After successful validation, attach to context.Context:

  • org_id, permissions (int64), optional agent_id, user_id, token_id

Handlers read via auth.FromContext(ctx).

5) Permission and tenant checks

  • Chat routes require permissions.ProxyChatCompletion (ADR-0009)
  • Path-scoped routes (e.g. /v1/orgs/{org_id}/...) compare path org_id to token org → 403 on mismatch
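
The two checks can be sketched as below. The bit position used for ProxyChatCompletion is a placeholder, since the real layout is defined in ADR-0009, and HasPermission and SameOrg are hypothetical helper names.

```go
package main

// ProxyChatCompletion is a placeholder bit for illustration only.
// The actual bit assignment is defined by ADR-0009.
const ProxyChatCompletion int64 = 1 << 0

// HasPermission reports whether every bit in required is set in granted.
func HasPermission(granted, required int64) bool {
	return granted&required == required
}

// SameOrg reports whether the org_id taken from the request path matches
// the org_id of the validated token. An empty path value never matches,
// so a routing bug cannot accidentally authorize a request.
func SameOrg(pathOrgID, tokenOrgID string) bool {
	return pathOrgID != "" && pathOrgID == tokenOrgID
}
```

A handler would respond with 403 INSUFFICIENT_PERMISSIONS when either check returns false.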

6) HTTP error mapping

Minimal stable JSON envelope in services/proxy/internal/errors/ (extended by milestone 1.2.3):

| Condition | HTTP | code |
|---|---|---|
| Missing Authorization | 401 | MISSING_TOKEN |
| Invalid token | 401 | INVALID_TOKEN |
| Insufficient permissions / org mismatch | 403 | INSUFFICIENT_PERMISSIONS |
| Auth unreachable / timeout / internal | 503 | SERVICE_DEGRADED |
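
A sketch of how the mapping might be encoded. The status codes and code strings come from the mapping above. The JSON shape (an "error" object with "code" and "message"), the function names, and the choice to treat unknown codes as 503 are assumptions; this ADR does not define the envelope fields.

```go
package main

import (
	"encoding/json"
	"net/http"
)

const (
	CodeMissingToken            = "MISSING_TOKEN"
	CodeInvalidToken            = "INVALID_TOKEN"
	CodeInsufficientPermissions = "INSUFFICIENT_PERMISSIONS"
	CodeServiceDegraded         = "SERVICE_DEGRADED"
)

// statusByCode mirrors the condition-to-status mapping in this section.
var statusByCode = map[string]int{
	CodeMissingToken:            http.StatusUnauthorized,
	CodeInvalidToken:            http.StatusUnauthorized,
	CodeInsufficientPermissions: http.StatusForbidden,
	CodeServiceDegraded:         http.StatusServiceUnavailable,
}

// errorBody and envelope are a hypothetical wire shape.
type errorBody struct {
	Code    string `json:"code"`
	Message string `json:"message"`
}

type envelope struct {
	Error errorBody `json:"error"`
}

// StatusFor returns the HTTP status for an error code.
// Assumption: unknown codes fail closed as 503.
func StatusFor(code string) int {
	if status, ok := statusByCode[code]; ok {
		return status
	}
	return http.StatusServiceUnavailable
}

// WriteError writes the JSON envelope with the mapped status.
func WriteError(w http.ResponseWriter, code, message string) {
	w.Header().Set("Content-Type", "application/json")
	w.WriteHeader(StatusFor(code))
	_ = json.NewEncoder(w).Encode(envelope{Error: errorBody{Code: code, Message: message}})
}
```

Keeping the mapping in one table makes it easier to hold the codes stable when milestone 1.2.3 extends the envelope.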

7) Auth validation cache (deferred — deferral record)

What is deferred: Redis bloom filter for fast rejection, in-process LRU cache for validated claims, Redis validated-token cache.

Why Phase 1 skips it:

  1. Phase 1 exit prioritizes correctness and fail-closed behavior before latency optimization.
  2. SECURITY.md §15: deny access when validation cannot complete and no safe cached claims exist.
  3. Revocation propagation and cache invalidation are non-trivial (see TESTING_STRATEGY auth cache cases).
  4. Proxy has no Redis auth-cache wiring in Phase 1.
  5. Provider forwarding is not live yet; per-request gRPC validation is acceptable for integration.

Why not a partial cache: A negative cache without bloom risks false rejects; serving stale claims after revoke violates fail-closed unless a full invalidation story exists.

Extension point: auth.TokenValidator interface; GRPCValidator today; Phase 2 optional 2.2.1-auth-cache-bloom adds a CachingValidator decorator.

When implemented: Phase 2 optional milestone 2.2.1-auth-cache-bloom (after Goal 1.2, before/at provider scale).

Risk accepted: Every protected request makes an auth gRPC call (bounded by the 50ms default timeout in section 2) until 2.2.1.

8) Observability

  • Metrics: ibex_proxy_auth_validate_total, ibex_proxy_auth_validate_duration_seconds
  • Label: result only (ok, unauthenticated, error) — no org_id
  • Logs: may include org_id, token_id after success; never log bearer or access_token

9) Middleware order

metrics → logging → auth → handler (future: body limit, rate limit before handler)

10) Public routes (no auth)

/health, /ready, /metrics remain unauthenticated.
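
One way to express the exemption is a wrapper that applies the auth middleware only to non-public paths. IsPublic and SkipAuthForPublic are hypothetical names, and exact path matching is an assumption of this sketch.

```go
package main

import "net/http"

// publicPaths lists the routes that stay unauthenticated.
var publicPaths = map[string]bool{
	"/health":  true,
	"/ready":   true,
	"/metrics": true,
}

// IsPublic reports whether path is served without auth.
// Assumption: exact match only, so "/health/extra" is not public.
func IsPublic(path string) bool {
	return publicPaths[path]
}

// SkipAuthForPublic routes public paths straight to next and sends every
// other path through the auth middleware first.
func SkipAuthForPublic(auth func(http.Handler) http.Handler, next http.Handler) http.Handler {
	protected := auth(next)
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if IsPublic(r.URL.Path) {
			next.ServeHTTP(w, r)
			return
		}
		protected.ServeHTTP(w, r)
	})
}
```

An allowlist keeps the default closed: a newly added route is authenticated unless it is listed explicitly.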

Consequences

Positive

  • Phase 1 exit criterion: proxy rejects unauthenticated traffic
  • Clean extension point for auth cache in Phase 2

Negative

  • Every protected request hits auth gRPC (latency until cache milestone)
  • 50ms budget may require tuning under load

References

  • Milestone 1.2.1
  • ADR-0006
  • ADR-0007
