Milestone 2.1.2 — OpenAI HTTP Client (Non-Streaming)

Status: Planned
Goal: 2.1 — LLM provider abstraction and OpenAI forwarding
Phase: 2 — Single Provider End-to-End
Estimated effort: 3–4 days
ADR required: ADR-0023 — OpenAI client design and retry policy

Why This Milestone Exists

With the Provider interface defined in 2.1.1, this milestone delivers the first concrete implementation: the OpenAI HTTP client for non-streaming completions. Starting with non-streaming is deliberate: streaming adds buffering, SSE parsing, and dual-write complexity (milestone 2.1.3). Getting non-streaming right first provides a stable baseline on which to build streaming.

What "production-grade" means for an HTTP client:

Configurable per-request timeouts with sane defaults
Connection pooling (not a new http.Client per request)
Retry with exponential backoff on transient failures (429, 500, 502, 503, 504, network errors), but NOT on non-transient client errors (400, 401, 403, 404)
Request/response logging without logging content or API keys
All provider errors mapped to the IBEX error envelope before returning to the caller
API key read from environment at startup, never from database or request body

Non-Goals

Streaming (milestone 2.1.3)
Function calling / tool use (Phase 4)
Fine-tuned model support (Phase 4)
Azure OpenAI endpoint (Phase 4)

Branch

feature/m2-1-2-openai-non-streaming

PR Title

feat(proxy): OpenAI non-streaming HTTP client implementation (m2.1.2)

Prerequisites

2.1.1 merged

Deliverables

1. `packages/provider/openai/` — OpenAI provider implementation

// Package openai implements the provider.Provider interface for the OpenAI API.
// It supports gpt-4o, gpt-4o-mini, gpt-4-turbo, and gpt-3.5-turbo in Phase 2.
// Streaming is implemented in Phase 2 milestone 2.1.3.
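package openai

import (
    "net/http"
    "time"

    // Assumed: trace.Tracer is the OpenTelemetry tracer type.
    "go.opentelemetry.io/otel/trace"
    // The project's logger package is also imported; its path depends on the repository layout.
)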
 
// Config holds all OpenAI client configuration.
// Loaded from environment by the proxy's packages/config loader.
type Config struct {
    // APIKey is the OpenAI API key. Required. Marked secret — never logged.
    APIKey string `env:"OPENAI_API_KEY" required:"true" secret:"true"`
 
    // BaseURL is the OpenAI API base URL. Default: https://api.openai.com/v1
    BaseURL string `env:"OPENAI_BASE_URL" envDefault:"https://api.openai.com/v1"`
 
    // Timeout is the per-request timeout. Default: 120s (LLM calls can be slow).
    Timeout time.Duration `env:"OPENAI_REQUEST_TIMEOUT" envDefault:"120s"`
 
    // MaxRetries is the max number of retries on transient failures. Default: 3.
    MaxRetries int `env:"OPENAI_MAX_RETRIES" envDefault:"3"`
 
    // RetryBaseDelay is the base delay for exponential backoff. Default: 500ms.
    RetryBaseDelay time.Duration `env:"OPENAI_RETRY_BASE_DELAY" envDefault:"500ms"`
}
 
// Client implements provider.Provider for OpenAI.
type Client struct {
    cfg        Config
    httpClient *http.Client
    logger     *logger.Logger
    tracer     trace.Tracer
}
 
// New constructs an OpenAI Client with a shared http.Client (connection pooling).
// Call once at service startup; the Client is safe for concurrent use.
func New(cfg Config, log *logger.Logger, tracer trace.Tracer) *Client
 
func (c *Client) Name() string { return "openai" }
func (c *Client) SupportedModels() []string {
    return []string{"gpt-4o", "gpt-4o-mini", "gpt-4-turbo", "gpt-3.5-turbo"}
}

Retry policy:

Retry on: HTTP 429, 500, 502, 503, 504 and network errors
Do NOT retry on: HTTP 400, 401, 403, 404 (these client errors are not transient)
Backoff: exponential with jitter, delay = min(base * 2^attempt + jitter, 30s)
Max retries: from config (default 3); honour the Retry-After header on OpenAI 429 responses

Error mapping:

OpenAI status	IBEX error code	HTTP status returned to caller
400	`INVALID_REQUEST`	400
401	`PROVIDER_UNAVAILABLE` (key invalid)	503
429	`RATE_LIMITED`	429 (with Retry-After)
500, 503	`PROVIDER_UNAVAILABLE`	503
Timeout	`PROVIDER_TIMEOUT`	504
Network error	`PROVIDER_UNAVAILABLE`	503

2. Request translation — `provider.Request` → OpenAI JSON body

// toOpenAIRequest translates a provider.Request to the OpenAI chat completions format.
// Directive injection: if req.SystemDirective is set, it is prepended as the first
// system message. Existing system messages follow it.
func toOpenAIRequest(req provider.Request) openAIRequest
 
// openAIRequest mirrors the OpenAI chat completions request schema.
// Only fields supported in Phase 2 are included; PassthroughFields are merged in.
type openAIRequest struct {
    Model       string           `json:"model"`
    Messages    []openAIMessage  `json:"messages"`
    MaxTokens   int              `json:"max_tokens,omitempty"`
    Temperature *float64         `json:"temperature,omitempty"`
    Stream      bool             `json:"stream"`
}
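
A minimal sketch of the translation step follows. `Message` and `Request` here are simplified stand-ins for the provider package types from milestone 2.1.1, whose real field sets may differ, and the `openAIMessage` layout is assumed. PassthroughFields merging is left out of the sketch.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Message and Request are hypothetical stand-ins for the provider package types.
type Message struct {
	Role    string
	Content string
}

type Request struct {
	Model           string
	SystemDirective string
	Messages        []Message
	MaxTokens       int
	Temperature     *float64
}

// openAIMessage is an assumed shape for one chat message.
type openAIMessage struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

type openAIRequest struct {
	Model       string          `json:"model"`
	Messages    []openAIMessage `json:"messages"`
	MaxTokens   int             `json:"max_tokens,omitempty"`
	Temperature *float64        `json:"temperature,omitempty"`
	Stream      bool            `json:"stream"`
}

// toOpenAIRequest prepends the directive (if any) as the first system message,
// then appends the caller's messages in their original order.
func toOpenAIRequest(req Request) openAIRequest {
	msgs := make([]openAIMessage, 0, len(req.Messages)+1)
	if req.SystemDirective != "" {
		msgs = append(msgs, openAIMessage{Role: "system", Content: req.SystemDirective})
	}
	for _, m := range req.Messages {
		msgs = append(msgs, openAIMessage{Role: m.Role, Content: m.Content})
	}
	return openAIRequest{
		Model:       req.Model,
		Messages:    msgs,
		MaxTokens:   req.MaxTokens,
		Temperature: req.Temperature,
		Stream:      false, // this milestone covers non-streaming only
	}
}

func main() {
	body, err := json.Marshal(toOpenAIRequest(Request{
		Model:           "gpt-4o",
		SystemDirective: "Answer concisely.",
		Messages:        []Message{{Role: "user", Content: "Hello"}},
	}))
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
}
```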

3. HTTP client configuration

// The http.Client is configured once and shared across all requests.
// Key settings:
httpClient := &http.Client{
    Timeout: cfg.Timeout, // bounds the whole exchange, including reading the response body
    Transport: &http.Transport{
        MaxIdleConns:        100,
        MaxIdleConnsPerHost: 20,
        IdleConnTimeout:     90 * time.Second,
        TLSHandshakeTimeout: 10 * time.Second,
        DisableCompression:  false, // accept gzip from OpenAI
    },
}

Testing Requirements

All tests use a mock HTTP server (httptest.NewServer) — no real OpenAI API calls in unit or integration tests.

TestOpenAIClient_NonStreaming_Success: mock returns 200 with valid completion JSON → Response.Body readable, Usage populated
TestOpenAIClient_DirectiveInjection: request with SystemDirective set → first message in outgoing body has role="system" with directive content
TestOpenAIClient_Retry_On429: mock returns 429 twice then 200 → client retries, final result is success, retry count metric incremented
TestOpenAIClient_NoRetry_On400: mock returns 400 → client does NOT retry, returns INVALID_REQUEST error immediately
TestOpenAIClient_Timeout: mock delays beyond Timeout → PROVIDER_TIMEOUT error
TestOpenAIClient_NetworkError: server closed before response → PROVIDER_UNAVAILABLE
TestOpenAIClient_APIKeyNotInLogs: verify that the API key string does not appear in any log output during a request

Acceptance Criteria

packages/provider/openai.Client implements packages/provider.Provider
Non-streaming requests return correct provider response
Directive is injected as first system message when set
Retry policy retries on 429, 500, 502, 503, 504, and network errors with exponential backoff + jitter
No retry on non-transient client errors (400, 401, 403, 404)
API key is never logged (enforced by test)
Connection pooling: single shared http.Client, not one per request
All provider errors mapped to IBEX error codes before returning to handler
ADR-0023 written and indexed