Milestone 2.1.2 — OpenAI HTTP Client (Non-Streaming)
Status: Planned
Goal: 2.1 — LLM provider abstraction and OpenAI forwarding
Phase: 2 — Single Provider End-to-End
Estimated effort: 3–4 days
ADR required: ADR-0023 — OpenAI client design and retry policy
Why This Milestone Exists
With the Provider interface defined in 2.1.1, this milestone delivers the first concrete implementation: the OpenAI HTTP client for non-streaming completions. "Non-streaming" first is deliberate — streaming adds buffering, SSE parsing, and dual-write complexity (milestone 2.1.3). Getting non-streaming right first gives a stable baseline to build streaming on top of.
What "production-grade" means for an HTTP client:
- Configurable per-request timeouts with sane defaults
- Connection pooling (not a new `http.Client` per request)
- Retry with exponential backoff on transient failures (429, 503), but NOT on client errors (400, 401, 403)
- Request/response logging without logging content or API keys
- All provider errors mapped to the IBEX error envelope before returning to the caller
- API key read from environment at startup, never from database or request body
Non-Goals
- Streaming (milestone 2.1.3)
- Function calling / tool use (Phase 4)
- Fine-tuned model support (Phase 4)
- Azure OpenAI endpoint (Phase 4)
Branch
feature/m2-1-2-openai-non-streaming
PR Title
feat(proxy): OpenAI non-streaming HTTP client implementation (m2.1.2)
Prerequisites
- 2.1.1 merged
Deliverables
1. packages/provider/openai/ — OpenAI provider implementation
// Package openai implements the provider.Provider interface for the OpenAI API.
// It supports gpt-4o, gpt-4o-mini, gpt-4-turbo, and gpt-3.5-turbo in Phase 2.
// Streaming is implemented in Phase 2 milestone 2.1.3.
package openai
// Config holds all OpenAI client configuration.
// Loaded from environment by the proxy's packages/config loader.
type Config struct {
	// APIKey is the OpenAI API key. Required. Marked secret — never logged.
	APIKey string `env:"OPENAI_API_KEY" required:"true" secret:"true"`
	// BaseURL is the OpenAI API base URL. Default: https://api.openai.com/v1
	BaseURL string `env:"OPENAI_BASE_URL" envDefault:"https://api.openai.com/v1"`
	// Timeout is the per-request timeout. Default: 120s (LLM calls can be slow).
	Timeout time.Duration `env:"OPENAI_REQUEST_TIMEOUT" envDefault:"120s"`
	// MaxRetries is the max number of retries on transient failures. Default: 3.
	MaxRetries int `env:"OPENAI_MAX_RETRIES" envDefault:"3"`
	// RetryBaseDelay is the base delay for exponential backoff. Default: 500ms.
	RetryBaseDelay time.Duration `env:"OPENAI_RETRY_BASE_DELAY" envDefault:"500ms"`
}
// Client implements provider.Provider for OpenAI.
type Client struct {
	cfg        Config
	httpClient *http.Client
	logger     *logger.Logger
	tracer     trace.Tracer
}
// New constructs an OpenAI Client with a shared http.Client (connection pooling).
// Call once at service startup; the Client is safe for concurrent use.
func New(cfg Config, log *logger.Logger, tracer trace.Tracer) *Client
func (c *Client) Name() string { return "openai" }
func (c *Client) SupportedModels() []string {
	return []string{"gpt-4o", "gpt-4o-mini", "gpt-4-turbo", "gpt-3.5-turbo"}
}

Retry policy:
- Retry on: HTTP 429, 500, 502, 503, 504 and network errors
- Do NOT retry on: HTTP 400, 401, 403, 404 (client errors are not transient)
- Backoff: exponential with jitter: `delay = min(base * 2^attempt + jitter, 30s)`
- Max retries: from config (default 3); honour the `Retry-After` header from OpenAI 429 responses
Error mapping:
| OpenAI status | IBEX error code | HTTP status |
|---|---|---|
| 400 | INVALID_REQUEST | 400 |
| 401 | PROVIDER_UNAVAILABLE (key invalid) | 503 |
| 429 | RATE_LIMITED | 429 (with Retry-After) |
| 500, 503 | PROVIDER_UNAVAILABLE | 503 |
| Timeout | PROVIDER_TIMEOUT | 504 |
| Network error | PROVIDER_UNAVAILABLE | 503 |
2. Request translation — provider.Request → OpenAI JSON body
// toOpenAIRequest translates a provider.Request to the OpenAI chat completions format.
// Directive injection: if req.SystemDirective is set, it is prepended as the first
// system message. Existing system messages follow it.
func toOpenAIRequest(req provider.Request) openAIRequest
// openAIRequest mirrors the OpenAI chat completions request schema.
// Only fields supported in Phase 2 are included; PassthroughFields are merged in.
type openAIRequest struct {
	Model       string          `json:"model"`
	Messages    []openAIMessage `json:"messages"`
	MaxTokens   int             `json:"max_tokens,omitempty"`
	Temperature *float64        `json:"temperature,omitempty"`
	Stream      bool            `json:"stream"`
}

3. HTTP client configuration
// The http.Client is configured once and shared across all requests.
// Key settings:
httpClient := &http.Client{
	Timeout: cfg.Timeout,
	Transport: &http.Transport{
		MaxIdleConns:        100,
		MaxIdleConnsPerHost: 20,
		IdleConnTimeout:     90 * time.Second,
		TLSHandshakeTimeout: 10 * time.Second,
		DisableCompression:  false, // accept gzip from OpenAI
	},
}

Testing Requirements
All tests use a mock HTTP server (httptest.NewServer) — no real OpenAI API calls in unit or integration tests.
- `TestOpenAIClient_NonStreaming_Success`: mock returns 200 with valid completion JSON → `Response.Body` readable, `Usage` populated
- `TestOpenAIClient_DirectiveInjection`: request with `SystemDirective` set → first message in outgoing body has `role="system"` with directive content
- `TestOpenAIClient_Retry_On429`: mock returns 429 twice then 200 → client retries, final result is success, retry count metric incremented
- `TestOpenAIClient_NoRetry_On400`: mock returns 400 → client does NOT retry, returns `INVALID_REQUEST` error immediately
- `TestOpenAIClient_Timeout`: mock delays beyond `Timeout` → `PROVIDER_TIMEOUT` error
- `TestOpenAIClient_NetworkError`: server closed before response → `PROVIDER_UNAVAILABLE`
- `TestOpenAIClient_APIKeyNotInLogs`: verify that the API key string does not appear in any log output during a request
Acceptance Criteria
- `packages/provider/openai.Client` implements `packages/provider.Provider`
- Non-streaming requests return correct provider response
- Directive is injected as first system message when set
- Retry policy retries on 429/5xx with exponential backoff + jitter
- No retry on non-transient 4xx client errors (400, 401, 403, 404)
- API key is never logged (enforced by test)
- Connection pooling: single shared `http.Client`, not one per request
- All provider errors mapped to IBEX error codes before returning to handler
- ADR-0023 written and indexed