ADR-0019: OpenTelemetry provider configuration (Phase 1)

Architecture decision record 0019.


  • Status: Accepted
  • Date: 2026-06-07
  • Authors: IBEX Harness team

Context

M1.3.3 delivered packages/logger, which reads trace_id from trace.SpanFromContext. Because no tracer provider was initialized, no span was ever started, so the span context was invalid and trace_id was always empty. ADR-0017 reserved a synthetic X-Trace-ID (UUID v4) for use until OTel spans exist.

Phase 1 requires distributed tracing infrastructure without mandating a running Jaeger/Tempo collector. CI must assert spans via in-process recorders.

Decision

1) Shared package packages/telemetry

Both services/auth and services/proxy initialize OTel via telemetry.Init(ctx, cfg) in main.go:

  • TracerProvider and MeterProvider configured once at startup
  • Global propagator: W3C tracecontext + baggage
  • Tracers obtained via providers.TracerProvider.Tracer("ibex-<service>") — no otel.Tracer() in service code (29-ibex-packages.mdc)

SDK: go.opentelemetry.io/otel v1.x (pinned in go.mod).

2) Resource attributes

Every span carries resource attributes:

| Attribute | Source |
| --- | --- |
| service.name | OTEL_SERVICE_NAME (fallback IBEX_SERVICE_NAME) |
| service.version | OTEL_SERVICE_VERSION (default dev) |
| deployment.environment | OTEL_DEPLOYMENT_ENVIRONMENT (fallback IBEX_ENV, default development) |
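
The fallback chain in the table can be sketched with the standard library alone. This is an illustrative sketch, not the packages/telemetry implementation: the names ResourceAttrs, firstNonEmpty, and ResolveResourceAttrs are hypothetical, and returning an error when no service name is available is an assumption about how "required" is enforced.

```go
package main

import "errors"

// ResourceAttrs holds the three resource attributes from the table.
// The type name is hypothetical; it is not taken from packages/telemetry.
type ResourceAttrs struct {
	ServiceName    string
	ServiceVersion string
	Environment    string
}

// firstNonEmpty returns the first non-empty value, or "" if all are empty.
func firstNonEmpty(values ...string) string {
	for _, v := range values {
		if v != "" {
			return v
		}
	}
	return ""
}

// ResolveResourceAttrs applies the fallbacks and defaults listed in the table.
// getenv is injected (os.Getenv in a real service) so the logic can be tested
// without touching the process environment.
func ResolveResourceAttrs(getenv func(string) string) (ResourceAttrs, error) {
	name := firstNonEmpty(getenv("OTEL_SERVICE_NAME"), getenv("IBEX_SERVICE_NAME"))
	if name == "" {
		// Assumption: a missing service name is a startup error.
		return ResourceAttrs{}, errors.New("set OTEL_SERVICE_NAME or IBEX_SERVICE_NAME")
	}
	return ResourceAttrs{
		ServiceName:    name,
		ServiceVersion: firstNonEmpty(getenv("OTEL_SERVICE_VERSION"), "dev"),
		Environment: firstNonEmpty(
			getenv("OTEL_DEPLOYMENT_ENVIRONMENT"), getenv("IBEX_ENV"), "development"),
	}, nil
}
```

Passing os.Getenv as the getenv argument would give the production behaviour; a map-backed function is enough for unit tests.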

3) Exporter selection

| Condition | Behaviour |
| --- | --- |
| OTEL_EXPORTER_OTLP_ENDPOINT set | OTLP gRPC batch exporter |
| OTEL_EXPORTER_OTLP_ENDPOINT empty | No exporter; spans are created for context propagation only (development/CI) |

Phase 1 does not require an external collector. Unit tests use sdktrace/tracetest in-memory exporter.

4) Sampling

Phase 1 uses ParentBased(TraceIDRatioBased(OTEL_SAMPLE_RATIO)) with default ratio 0.01.

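Exporting 100% of 5xx traces regardless of the ratio would require tail-based sampling at the collector, because the head sampler makes its decision before the response status is known. That is deferred to Phase 2 per the milestone non-goals. In Phase 1, the HTTP span middleware sets span status ERROR when the HTTP status is ≥ 500, which only has an effect on spans that were already sampled.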
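
The sampling rule can be modelled in a few lines. The sketch below is a simplified, standard-library model of a parent-based, trace-ID-ratio decision, written to show why the outcome is deterministic per trace and why error status cannot influence it. It is not the SDK's code: the real sampler should come from go.opentelemetry.io/otel/sdk/trace, the threshold arithmetic here is an approximation of that behaviour, and Parent, ShouldSample, and IsServerError are hypothetical names.

```go
package main

import "encoding/binary"

// Parent describes the incoming span context, if one was extracted.
type Parent struct {
	Valid   bool // a parent span context exists (for example from traceparent)
	Sampled bool // the parent's sampled flag
}

// ratioSampled keeps a trace when the numeric value derived from the low
// 8 bytes of its trace ID falls below ratio * 2^63. The same trace ID
// therefore always yields the same decision.
func ratioSampled(traceID [16]byte, ratio float64) bool {
	if ratio >= 1 {
		return true
	}
	if ratio <= 0 {
		return false
	}
	upperBound := uint64(ratio * (1 << 63))
	x := binary.BigEndian.Uint64(traceID[8:16]) >> 1
	return x < upperBound
}

// ShouldSample models ParentBased(TraceIDRatioBased(ratio)): a valid parent's
// decision is inherited; only root spans consult the ratio.
func ShouldSample(parent Parent, traceID [16]byte, ratio float64) bool {
	if parent.Valid {
		return parent.Sampled
	}
	return ratioSampled(traceID, ratio)
}

// IsServerError reports whether the middleware would mark the span as ERROR.
// It runs after the handler, so it cannot change the sampling decision.
func IsServerError(status int) bool {
	return status >= 500
}
```

Because the decision is taken when the span starts, a request that later fails with a 5xx is exported only if its trace was already selected, which is the gap that tail-based sampling would close.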

5) HTTP span middleware

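telemetry.SpanMiddleware(tracer) creates server spans named {method} {route_template} (e.g. POST /v1/chat/completions). The route template comes from http.Request.Pattern, never from the raw URL path, which keeps span-name cardinality bounded. Note that ServeMux method and wildcard patterns arrived in Go 1.22, while the Request.Pattern field itself was added in Go 1.23.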

Attributes: http.method, http.route, http.status_code, http.request_content_length, ibex.request_id.
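
Deriving the span name from a ServeMux pattern needs a little care, because a pattern has the form "[METHOD ][HOST]/[PATH]" and may already contain the method. The following is a minimal sketch of that derivation; RouteTemplate and SpanName are hypothetical helpers, and falling back to the bare method when no pattern matched is an assumption (it follows the OpenTelemetry HTTP span-naming guidance rather than anything stated in this ADR).

```go
package main

import "strings"

// RouteTemplate extracts the path template from a ServeMux pattern of the
// form "[METHOD ][HOST]/[PATH]". It returns "" for an empty pattern.
func RouteTemplate(pattern string) string {
	// Drop the optional method prefix, which is separated by spaces or tabs.
	if i := strings.IndexAny(pattern, " \t"); i >= 0 {
		pattern = strings.TrimLeft(pattern[i+1:], " \t")
	}
	// Drop the optional host, which precedes the first slash.
	if i := strings.IndexByte(pattern, '/'); i > 0 {
		pattern = pattern[i:]
	}
	return pattern
}

// SpanName builds "{method} {route_template}". The caller would pass
// r.Method and r.Pattern. When no route matched, only the method is used,
// so the raw URL path never leaks into the span name.
func SpanName(method, pattern string) string {
	route := RouteTemplate(pattern)
	if route == "" {
		return method
	}
	return method + " " + route
}
```

Since ServeMux fills in Request.Pattern while routing, a middleware that wraps the mux would presumably have to read it after the inner handler returns; this ADR does not spell out that detail.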

Middleware order (proxy):

metrics → RequestContext → Span → ResponseHeaders → logging → mux

Protected route auth middleware runs inside mux after span creation.

6) gRPC client trace propagation

Proxy auth gRPC client uses otelgrpc.UnaryClientInterceptor() chained after RequestIDUnaryInterceptor. Auth gRPC server interceptors are out of scope until auth gains a full server test suite.
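
What the client interceptor propagates is the W3C traceparent value, carried as gRPC metadata. To make the wire format concrete, here is a small standard-library sketch that parses a version 00 traceparent. It is for illustration only: in the services the OTel propagator does this work, TraceParent and ParseTraceParent are hypothetical names, and the sketch deliberately rejects any version other than 00.

```go
package main

import (
	"encoding/hex"
	"strings"
)

// TraceParent is the decoded form of a version 00 traceparent value:
// 00-<32 hex trace-id>-<16 hex parent-id>-<2 hex trace-flags>.
type TraceParent struct {
	TraceID string // 32 lowercase hex characters
	SpanID  string // 16 lowercase hex characters (the caller's span)
	Sampled bool   // bit 0 of the trace-flags byte
}

// isLowerHex reports whether s is exactly n lowercase hex characters.
func isLowerHex(s string, n int) bool {
	if len(s) != n {
		return false
	}
	for i := 0; i < len(s); i++ {
		c := s[i]
		if !((c >= '0' && c <= '9') || (c >= 'a' && c <= 'f')) {
			return false
		}
	}
	return true
}

// ParseTraceParent validates and decodes a traceparent value.
// All-zero trace or span IDs are rejected as invalid.
func ParseTraceParent(header string) (TraceParent, bool) {
	parts := strings.Split(header, "-")
	if len(parts) != 4 || parts[0] != "00" {
		return TraceParent{}, false
	}
	traceID, spanID, flags := parts[1], parts[2], parts[3]
	if !isLowerHex(traceID, 32) || !isLowerHex(spanID, 16) || !isLowerHex(flags, 2) {
		return TraceParent{}, false
	}
	if strings.Trim(traceID, "0") == "" || strings.Trim(spanID, "0") == "" {
		return TraceParent{}, false
	}
	b, _ := hex.DecodeString(flags) // cannot fail: flags was validated above
	return TraceParent{TraceID: traceID, SpanID: spanID, Sampled: b[0]&0x01 == 1}, true
}
```

The sampled flag in this value is what lets the auth service inherit the proxy's sampling decision once auth gains server-side interceptors.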

7) Synthetic trace ID retired (ADR-0017 amendment)

RequestContextMiddleware no longer generates synthetic UUID v4 trace IDs. X-Trace-ID response header is set from OTel SpanContext.TraceID() after span middleware runs.

Request ID (packages/reqid) remains the internal log correlation token.
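
One visible effect of this change is the header format: a UUID v4 with dashes is replaced by the 32-character lowercase hex form of the 16-byte OTel trace ID. A minimal sketch of that rendering follows. TraceIDHeaderValue is a hypothetical helper, and omitting the header when the trace ID is all zeros (invalid) is an assumption, not something this ADR specifies.

```go
package main

import "encoding/hex"

// TraceIDHeaderValue renders a 16-byte trace ID as 32 lowercase hex
// characters for the X-Trace-ID response header. It returns false for the
// all-zero ID, which OpenTelemetry treats as invalid.
func TraceIDHeaderValue(traceID [16]byte) (string, bool) {
	if traceID == ([16]byte{}) {
		return "", false
	}
	return hex.EncodeToString(traceID[:]), true
}
```

In the real middleware the value would come from the span context of the span started earlier in the chain, and the header has to be set before the handler writes the response.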

8) Shutdown

providers.Shutdown is registered first on packages/shutdown.Coordinator (ADR-0018) to flush OTLP exporters before HTTP/gRPC drain.

Environment variables

| Variable | Required | Default |
| --- | --- | --- |
| OTEL_SERVICE_NAME | Yes* | IBEX_SERVICE_NAME |
| OTEL_SERVICE_VERSION | No | dev |
| OTEL_DEPLOYMENT_ENVIRONMENT | No | IBEX_ENV or development |
| OTEL_EXPORTER_OTLP_ENDPOINT | No | (empty, noop) |
| OTEL_SAMPLE_RATIO | No | 0.01 |

*Required directly or via IBEX_SERVICE_NAME fallback.
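
The two variables that control export behaviour can be loaded as sketched below. This is an illustration under stated assumptions rather than the packages/telemetry code: ExportConfig and LoadExportConfig are hypothetical names, and rejecting a ratio outside [0, 1] (instead of clamping it) is an assumption.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// defaultSampleRatio is the Phase 1 default from the table.
const defaultSampleRatio = 0.01

// ExportConfig captures the exporter endpoint and the sampling ratio.
type ExportConfig struct {
	Endpoint    string  // empty means no exporter is installed
	SampleRatio float64 // always within [0, 1]
}

// ExportEnabled reports whether an OTLP exporter would be configured.
func (c ExportConfig) ExportEnabled() bool {
	return c.Endpoint != ""
}

// LoadExportConfig reads OTEL_EXPORTER_OTLP_ENDPOINT and OTEL_SAMPLE_RATIO.
// getenv is injected (os.Getenv in a real service) to keep it testable.
func LoadExportConfig(getenv func(string) string) (ExportConfig, error) {
	cfg := ExportConfig{
		Endpoint:    strings.TrimSpace(getenv("OTEL_EXPORTER_OTLP_ENDPOINT")),
		SampleRatio: defaultSampleRatio,
	}
	raw := strings.TrimSpace(getenv("OTEL_SAMPLE_RATIO"))
	if raw == "" {
		return cfg, nil
	}
	ratio, err := strconv.ParseFloat(raw, 64)
	if err != nil {
		return ExportConfig{}, fmt.Errorf("OTEL_SAMPLE_RATIO %q: %w", raw, err)
	}
	// The negated form also rejects NaN, which ParseFloat accepts.
	if !(ratio >= 0 && ratio <= 1) {
		return ExportConfig{}, fmt.Errorf("OTEL_SAMPLE_RATIO %q: must be within [0, 1]", raw)
	}
	cfg.SampleRatio = ratio
	return cfg, nil
}
```

With no variables set, this yields a disabled exporter and a ratio of 0.01, which matches the CI and local development behaviour described above.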

Consequences

Positive

  • packages/logger trace_id populated on every HTTP request
  • W3C traceparent propagation to auth gRPC calls from proxy
  • No external collector required for CI or local dev
  • M1.3.2 Prometheus migration can adopt initialized meter provider

Negative

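  • About 99% of root traces are not exported at the default sampling ratio (by design); until tail-based sampling arrives in Phase 2, this includes traces for failed requests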
  • Auth HTTP spans lack ibex.request_id until auth gains reqid middleware

References

  • Milestone 1.3.1
  • ADR-0017
  • ADR-0018
  • 18-observability.mdc
