Milestone 2.1.1 — Provider Interface and Registry

Status: Planned
Goal: 2.1 — LLM provider abstraction and OpenAI forwarding
Phase: 2 — Single Provider End-to-End
Estimated effort: 2 days
ADR required: ADR-0022 — LLM provider abstraction design


Why This Milestone Exists

The proxy needs to forward requests to an LLM provider, but the provider must not be hardcoded. The ARCHITECTURE.md explicitly calls out multiple providers (OpenAI, Anthropic, Azure OpenAI, Bedrock) as Phase 4+ targets. Building the forwarding logic directly against the OpenAI HTTP API in Phase 2 would require a large, painful refactor in Phase 4. Instead, this milestone defines the Provider interface and a provider registry that all subsequent milestones (2.1.2, 2.1.3) implement against. Phase 4 adds new implementations without touching any existing code.

This is not premature abstraction. It is the correct design given the known roadmap.


Non-Goals

  • Any concrete provider implementation (2.1.2 does OpenAI non-streaming)
  • Provider capability negotiation (Phase 4)
  • Model routing (Phase 4)

Branch

feature/m2-1-1-provider-interface

PR Title

feat(proxy): LLM provider interface and registry design (m2.1.1)


ADR-0022 — Provider abstraction design

Write docs/adr/ADR-0022-llm-provider-abstraction.md documenting:

  • Why a single interface covers both streaming and non-streaming: The Provider interface has a single completion method, Complete. The Request struct has a Stream bool field, and the implementation decides how to handle it. This avoids a split interface that would force callers to do a type assertion.
  • Why the interface does not include model routing: Model routing is a cross-cutting concern handled by the provider registry, not individual providers. A provider implements a fixed set of supported models; the registry selects the right provider.
  • Why API keys are not in the interface: Keys are an implementation detail of the provider constructor. The interface is key-agnostic.
  • Why responses are returned as io.ReadCloser not string: Streaming responses must stream. A string forces full buffering. io.ReadCloser works for both streaming (stream the bytes) and non-streaming (buffer and decode).

Deliverables

1. packages/provider — interface and request/response types

Go
// Package provider defines the LLM provider abstraction for IBEX Harness.
// All LLM communication goes through this interface.
//
// Phase 2: OpenAI implementation only.
// Phase 4: Anthropic, Azure OpenAI, AWS Bedrock implementations added.
package provider
 
import (
    "context"
    "fmt"
    "io"
    "time"
)
 
// Request is a normalised LLM completion request.
// It is provider-agnostic; implementations translate to provider-specific format.
type Request struct {
    // Model is the model identifier as requested by the client.
    // Examples: "gpt-4o", "gpt-4o-mini", "claude-3-5-sonnet-20241022"
    Model string
 
    // Messages is the conversation history.
    Messages []Message
 
    // SystemDirective is the agent directive to inject as the first system message.
    // Empty string = no directive. Injection strategy is provider-specific.
    SystemDirective string
 
    // Stream, if true, requests a streaming (SSE) response.
    Stream bool
 
    // MaxTokens is the maximum number of completion tokens. 0 = provider default.
    MaxTokens int
 
    // Temperature controls randomness. Nil = provider default.
    Temperature *float64
 
    // PassthroughFields contains any client-supplied fields not explicitly modelled.
    // The provider implementation may forward these verbatim or drop them.
    // Never include: model, messages, stream (already modelled above).
    PassthroughFields map[string]any
}
 
// Message is a single turn in the conversation.
type Message struct {
    Role    string // "system", "user", "assistant", "tool"
    Content string
}
 
// Response is the outcome of a Complete call.
// For non-streaming requests, Body contains the complete provider JSON response.
// For streaming requests, Body is an SSE stream; the caller must read and forward it.
// The caller is responsible for closing Body.
type Response struct {
    // Body is the response body from the provider.
    // Non-streaming: full JSON (e.g. OpenAI chat completion object).
    // Streaming: SSE byte stream with data: {...}\n\n lines.
    Body io.ReadCloser
 
    // StatusCode is the provider HTTP response status code.
    // 200 on success; 4xx/5xx on provider errors (already translated to IBEX error).
    StatusCode int
 
    // Usage holds token counts extracted from the response.
    // For streaming: filled from the final chunk that carries usage (for OpenAI,
    // the chunk sent just before data: [DONE]) when available.
    // May be nil if the provider does not return usage in the response.
    Usage *Usage
 
    // Latency is the time from sending the request to receiving the first byte.
    // This is the provider TTFB, not including any IBEX overhead.
    Latency time.Duration
 
    // ProviderRequestID is the request ID returned by the provider (e.g. X-Request-Id from OpenAI).
    // Used for provider-side debugging.
    ProviderRequestID string
}
 
// Usage holds LLM token consumption data.
type Usage struct {
    InputTokens  int
    OutputTokens int
    TotalTokens  int
}
 
// Provider is the interface all LLM provider implementations must satisfy.
// Implementations must be safe for concurrent use.
type Provider interface {
    // Complete sends a request to the LLM provider and returns the response.
    // For streaming requests, the caller reads from Response.Body until EOF.
    // For non-streaming requests, the caller reads the full body and decodes.
    //
    // The context carries the request deadline. Implementations must respect it.
    // If the provider returns a 4xx or 5xx error, Complete returns a ProviderError.
    Complete(ctx context.Context, req Request) (Response, error)
 
    // Name returns the provider identifier (e.g. "openai", "anthropic").
    // Used for metrics labels and trace attributes. Must be a static string.
    Name() string
 
    // SupportedModels returns the list of model IDs this provider handles.
    // The registry uses this to route requests to the correct provider.
    SupportedModels() []string
}
 
// ProviderError is returned by Complete when the provider returns a non-2xx response.
// It carries the provider's status code and original error body for translation.
type ProviderError struct {
    ProviderName   string
    StatusCode     int
    ProviderBody   []byte
    ProviderErrMsg string
}
 
func (e *ProviderError) Error() string {
    return fmt.Sprintf("provider %s returned %d: %s", e.ProviderName, e.StatusCode, e.ProviderErrMsg)
}

2. Provider registry

Go
// Registry maps model IDs to provider implementations.
// It is built once at service startup and is read-only thereafter.
type Registry struct {
    providers map[string]Provider // model ID → provider
}
 
// NewRegistry constructs a Registry from the given providers.
// Panics if two providers claim the same model ID.
func NewRegistry(providers ...Provider) *Registry
 
// For returns the provider for the given model ID.
// Returns (nil, ErrNoProviderForModel) if no provider supports the model.
func (r *Registry) For(model string) (Provider, error)
 
// ErrNoProviderForModel is returned when no registered provider supports a model.
var ErrNoProviderForModel = errors.New("no provider configured for this model")

Acceptance Criteria

  • packages/provider.Provider interface defined with Complete, Name, SupportedModels
  • packages/provider.Registry selects provider by model ID
  • ErrNoProviderForModel triggers 501 PROVIDER_NOT_CONFIGURED response in proxy handler
  • ADR-0022 written and indexed
  • Interface has no concrete OpenAI imports (pure abstraction)
