Milestone 2.1.2 — OpenAI HTTP Client (Non-Streaming)

Status: Planned
Goal: 2.1 — LLM provider abstraction and OpenAI forwarding
Phase: 2 — Single Provider End-to-End
Estimated effort: 3–4 days
ADR required: ADR-0023 — OpenAI client design and retry policy


Why This Milestone Exists

With the Provider interface defined in 2.1.1, this milestone delivers the first concrete implementation: the OpenAI HTTP client for non-streaming completions. "Non-streaming" first is deliberate — streaming adds buffering, SSE parsing, and dual-write complexity (milestone 2.1.3). Getting non-streaming right first gives a stable baseline to build streaming on top of.

What "production-grade" means for an HTTP client:

  • Configurable per-request timeouts with sane defaults
  • Connection pooling (not a new http.Client per request)
  • Retry with exponential backoff on transient failures (429 and 5xx such as 503), but NOT on non-transient client errors (400, 401, 403)
  • Request/response logging without logging content or API keys
  • All provider errors mapped to the IBEX error envelope before returning to the caller
  • API key read from environment at startup, never from database or request body

Non-Goals

  • Streaming (milestone 2.1.3)
  • Function calling / tool use (Phase 4)
  • Fine-tuned model support (Phase 4)
  • Azure OpenAI endpoint (Phase 4)

Branch

feature/m2-1-2-openai-non-streaming

PR Title

feat(proxy): OpenAI non-streaming HTTP client implementation (m2.1.2)


Prerequisites


Deliverables

1. packages/provider/openai/ — OpenAI provider implementation

Go
// Package openai implements the provider.Provider interface for the OpenAI API.
// It supports gpt-4o, gpt-4o-mini, gpt-4-turbo, and gpt-3.5-turbo in Phase 2.
// Streaming is implemented in Phase 2 milestone 2.1.3.
package openai
 
// Config holds all OpenAI client configuration.
// Loaded from environment by the proxy's packages/config loader.
type Config struct {
    // APIKey is the OpenAI API key. Required. Marked secret — never logged.
    APIKey string `env:"OPENAI_API_KEY" required:"true" secret:"true"`
 
    // BaseURL is the OpenAI API base URL. Default: https://api.openai.com/v1
    BaseURL string `env:"OPENAI_BASE_URL" envDefault:"https://api.openai.com/v1"`
 
    // Timeout is the per-request timeout. Default: 120s (LLM calls can be slow).
    Timeout time.Duration `env:"OPENAI_REQUEST_TIMEOUT" envDefault:"120s"`
 
    // MaxRetries is the max number of retries on transient failures. Default: 3.
    MaxRetries int `env:"OPENAI_MAX_RETRIES" envDefault:"3"`
 
    // RetryBaseDelay is the base delay for exponential backoff. Default: 500ms.
    RetryBaseDelay time.Duration `env:"OPENAI_RETRY_BASE_DELAY" envDefault:"500ms"`
}
 
// Client implements provider.Provider for OpenAI.
type Client struct {
    cfg        Config
    httpClient *http.Client
    logger     *logger.Logger
    tracer     trace.Tracer
}
 
// New constructs an OpenAI Client with a shared http.Client (connection pooling).
// Call once at service startup; the Client is safe for concurrent use.
func New(cfg Config, log *logger.Logger, tracer trace.Tracer) *Client
 
func (c *Client) Name() string { return "openai" }
func (c *Client) SupportedModels() []string {
    return []string{"gpt-4o", "gpt-4o-mini", "gpt-4-turbo", "gpt-3.5-turbo"}
}

Retry policy:

  • Retry on: HTTP 429, 500, 502, 503, 504 and network errors
  • Do NOT retry on: HTTP 400, 401, 403, 404 (client errors are not transient)
  • Backoff: exponential with jitter: delay = min(base * 2^attempt + jitter, 30s)
  • Max retries: from config (default 3); honour the Retry-After header on OpenAI 429 responses
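
The retry policy above can be sketched as three small helpers. This is a minimal illustration, not the ADR-0023 design: the names isRetryable, backoffDelay, and parseRetryAfter are hypothetical, jitter is passed in by the caller so the delay calculation stays deterministic, and only the delta-seconds form of Retry-After is handled.

```go
package main

import (
	"fmt"
	"math/rand"
	"net/http"
	"strconv"
	"time"
)

// maxBackoff is the 30s ceiling from the backoff formula.
const maxBackoff = 30 * time.Second

// isRetryable reports whether a status is in the "retry on" list.
func isRetryable(status int) bool {
	switch status {
	case http.StatusTooManyRequests,
		http.StatusInternalServerError,
		http.StatusBadGateway,
		http.StatusServiceUnavailable,
		http.StatusGatewayTimeout:
		return true
	}
	return false
}

// backoffDelay computes min(base*2^attempt + jitter, 30s) for a zero-based
// attempt. Doubling stops once the cap is reached, so it cannot overflow.
func backoffDelay(base time.Duration, attempt int, jitter time.Duration) time.Duration {
	d := base
	for i := 0; i < attempt && d < maxBackoff; i++ {
		d *= 2
	}
	d += jitter
	if d > maxBackoff {
		d = maxBackoff
	}
	return d
}

// parseRetryAfter reads a Retry-After value given in whole seconds.
// Assumption: the HTTP-date form is ignored in this sketch.
func parseRetryAfter(value string) (time.Duration, bool) {
	secs, err := strconv.Atoi(value)
	if err != nil || secs < 0 {
		return 0, false
	}
	return time.Duration(secs) * time.Second, true
}

func main() {
	base := 500 * time.Millisecond
	for attempt := 0; attempt < 4; attempt++ {
		// Random jitter in [0, base); the jitter range is an assumption.
		jitter := time.Duration(rand.Int63n(int64(base)))
		fmt.Println(attempt, backoffDelay(base, attempt, jitter))
	}
	fmt.Println(isRetryable(429), isRetryable(400))
}
```

Passing jitter as an argument keeps the formula testable; a real client would likely generate it internally and prefer the server's Retry-After value when one is present.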

Error mapping:

OpenAI status    IBEX error code                       HTTP status
400              INVALID_REQUEST                       400
401              PROVIDER_UNAVAILABLE (key invalid)    503
429              RATE_LIMITED                          429 (with Retry-After)
500, 503         PROVIDER_UNAVAILABLE                  503
Timeout          PROVIDER_TIMEOUT                      504
Network error    PROVIDER_UNAVAILABLE                  503

2. Request translation — provider.Request → OpenAI JSON body

Go
// toOpenAIRequest translates a provider.Request to the OpenAI chat completions format.
// Directive injection: if req.SystemDirective is set, it is prepended as the first
// system message. Existing system messages follow it.
func toOpenAIRequest(req provider.Request) openAIRequest
 
// openAIRequest mirrors the OpenAI chat completions request schema.
// Only fields supported in Phase 2 are included; PassthroughFields are merged in.
type openAIRequest struct {
    Model       string           `json:"model"`
    Messages    []openAIMessage  `json:"messages"`
    MaxTokens   int              `json:"max_tokens,omitempty"`
    Temperature *float64         `json:"temperature,omitempty"`
    Stream      bool             `json:"stream"`
}
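
The directive-injection rule described for toOpenAIRequest could look roughly like the following. The request type here is a hypothetical stand-in for provider.Request (defined in 2.1.1 and not reproduced in this document), and the sketch omits Temperature and PassthroughFields for brevity.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// message mirrors a single chat message in the OpenAI request body.
type message struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

// request is a hypothetical stand-in for provider.Request.
type request struct {
	Model           string
	SystemDirective string
	Messages        []message
	MaxTokens       int
}

// openAIRequest is a trimmed copy of the struct shown above.
type openAIRequest struct {
	Model     string    `json:"model"`
	Messages  []message `json:"messages"`
	MaxTokens int       `json:"max_tokens,omitempty"`
	Stream    bool      `json:"stream"`
}

// toOpenAIRequest prepends the directive as the first system message and
// keeps all caller-supplied messages, including system ones, after it.
func toOpenAIRequest(req request) openAIRequest {
	msgs := make([]message, 0, len(req.Messages)+1)
	if req.SystemDirective != "" {
		msgs = append(msgs, message{Role: "system", Content: req.SystemDirective})
	}
	msgs = append(msgs, req.Messages...)
	return openAIRequest{
		Model:     req.Model,
		Messages:  msgs,
		MaxTokens: req.MaxTokens,
		Stream:    false, // non-streaming only in this milestone
	}
}

func main() {
	out := toOpenAIRequest(request{
		Model:           "gpt-4o",
		SystemDirective: "Answer concisely.",
		Messages: []message{
			{Role: "system", Content: "You are a helpful assistant."},
			{Role: "user", Content: "Hello"},
		},
	})
	body, err := json.Marshal(out)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
}
```

Building a new slice rather than appending to req.Messages avoids mutating the caller's request, which matters if the same request is retried.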

3. HTTP client configuration

Go
// The http.Client is configured once and shared across all requests.
// Key settings:
httpClient := &http.Client{
    Timeout: cfg.Timeout,
    Transport: &http.Transport{
        MaxIdleConns:        100,
        MaxIdleConnsPerHost: 20,
        IdleConnTimeout:     90 * time.Second,
        TLSHandshakeTimeout: 10 * time.Second,
        DisableCompression:  false, // accept gzip from OpenAI
    },
}

Testing Requirements

All tests use a mock HTTP server (httptest.NewServer) — no real OpenAI API calls in unit or integration tests.

  • TestOpenAIClient_NonStreaming_Success: mock returns 200 with valid completion JSON → Response.Body readable, Usage populated
  • TestOpenAIClient_DirectiveInjection: request with SystemDirective set → first message in outgoing body has role="system" with directive content
  • TestOpenAIClient_Retry_On429: mock returns 429 twice then 200 → client retries, final result is success, retry count metric incremented
  • TestOpenAIClient_NoRetry_On400: mock returns 400 → client does NOT retry, returns INVALID_REQUEST error immediately
  • TestOpenAIClient_Timeout: mock delays beyond Timeout → PROVIDER_TIMEOUT error
  • TestOpenAIClient_NetworkError: server closed before response → PROVIDER_UNAVAILABLE
  • TestOpenAIClient_APIKeyNotInLogs: verify that the API key string does not appear in any log output during a request
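
The shape of a mock-server test such as TestOpenAIClient_Retry_On429 can be illustrated with a standalone program. This is a simplified sketch: getWithRetry is a hypothetical retry loop that handles only 429 with a fixed delay, standing in for the real client, and in the repository the scenario would live in a _test.go file and assert through testing.T.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
	"time"
)

// getWithRetry is a hypothetical, simplified retry loop: it retries only
// on 429 and uses a fixed delay instead of exponential backoff.
func getWithRetry(c *http.Client, url string, maxRetries int, delay time.Duration) (int, error) {
	status := 0
	for attempt := 0; attempt <= maxRetries; attempt++ {
		resp, err := c.Get(url)
		if err != nil {
			return 0, err
		}
		// Drain and close the body so the connection can be reused.
		io.Copy(io.Discard, resp.Body)
		resp.Body.Close()
		status = resp.StatusCode
		if status != http.StatusTooManyRequests {
			return status, nil
		}
		if attempt < maxRetries {
			time.Sleep(delay)
		}
	}
	return status, nil
}

// run429Scenario starts a mock server that answers 429 twice and then 200.
// It returns the final status and the number of requests the server saw.
func run429Scenario() (int, int32, error) {
	var calls int32
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if atomic.AddInt32(&calls, 1) <= 2 {
			w.WriteHeader(http.StatusTooManyRequests)
			return
		}
		w.WriteHeader(http.StatusOK)
	}))
	defer srv.Close()

	status, err := getWithRetry(srv.Client(), srv.URL, 3, time.Millisecond)
	return status, atomic.LoadInt32(&calls), err
}

func main() {
	status, calls, err := run429Scenario()
	fmt.Println(status, calls, err)
}
```

Counting requests on the server side verifies the retry behaviour independently of any client-side metric.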

Acceptance Criteria

  • packages/provider/openai.Client implements packages/provider.Provider
  • Non-streaming requests return correct provider response
  • Directive is injected as first system message when set
  • Retry policy retries on 429/5xx with exponential backoff + jitter
  • No retry on non-transient client errors (400, 401, 403, 404)
  • API key is never logged (enforced by test)
  • Connection pooling: single shared http.Client, not one per request
  • All provider errors mapped to IBEX error codes before returning to handler
  • ADR-0023 written and indexed
