Milestone 2.2.2 — Token Revocation Propagation via Redis Pub/Sub

Status: Planned
Goal: 2.2 — Auth performance cache
Phase: 2 — Single Provider End-to-End
Estimated effort: 2 days
ADR required: ADR-0026 — Revocation propagation design


Why This Milestone Exists

Milestone 2.2.1 introduces an LRU cache with a 30-second maximum TTL. This means a revoked token can be used for up to 30 seconds after revocation. For Phase 2, the SLA is "revoked tokens rejected within 5 seconds." To meet this, the auth service must notify all proxy instances immediately when a token is revoked, so they can invalidate their LRU entries.

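Redis pub/sub fits this requirement: it is already a required dependency (for rate limiting), it fans out to every subscribed proxy instance, and it needs no new infrastructure. Delivery is at-most-once, so the 30-second TTL from Milestone 2.2.1 remains the backstop for any event a proxy misses.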

The auth service publishes a revocation event when RevokeToken is called. Every proxy instance subscribes and calls CachingValidator.Invalidate when it receives the event.


Branch

feature/m2-2-2-revocation-propagation

PR Title

feat(proxy,auth): token revocation propagation via Redis pub/sub (m2.2.2)


Deliverables

1. Revocation event schema

Go
// RevocationEvent is published to the Redis channel on token revocation.
// Channel: ibex:token:revocations (global, not org-scoped: proxy doesn't know org_id from token alone)
type RevocationEvent struct {
    Version    int       `json:"v"`           // schema version, currently 1
    TokenHash  string    `json:"token_hash"`  // SHA-256 hex of the raw token
    RevokedAt  time.Time `json:"revoked_at"`
    OrgID      uuid.UUID `json:"org_id"`      // for audit logging only
}

2. Auth service — publisher

Go
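// In services/auth/internal/service/token_service.go,
// after successfully marking the token revoked in Postgres:

hash := sha256.Sum256([]byte(rawToken))
event := RevocationEvent{
    Version:   1,
    TokenHash: hex.EncodeToString(hash[:]),
    RevokedAt: time.Now().UTC(),
    OrgID:     orgID,
}
payload, err := json.Marshal(event)
if err != nil {
    // Revocation is already durable in Postgres; do not fail the RPC.
    log.WarnCtx(ctx, "revocation event marshal failed; LRU will expire naturally", "error", err)
} else {
    // Detach from the request context (context.WithoutCancel, Go 1.21+):
    // ctx is cancelled when the RPC returns, which could abort the publish.
    // The 2-second timeout is an illustrative value, not a spec requirement.
    pubCtx, cancel := context.WithTimeout(context.WithoutCancel(ctx), 2*time.Second)
    // Publish non-blocking; revocation is durable in Postgres even if Redis is down
    go func() {
        defer cancel()
        if err := redis.Publish(pubCtx, "ibex:token:revocations", payload).Err(); err != nil {
            log.WarnCtx(pubCtx, "revocation event publish failed; LRU will expire naturally", "error", err)
        }
    }()
}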

3. Proxy service — subscriber goroutine

Go
// In services/proxy/cmd/proxy/main.go, start the subscriber after CachingValidator is created:
revSub := revocation.NewSubscriber(redisClient, cachingValidator, log)
sd.Register(func(ctx context.Context) error {
    revSub.Stop()
    return nil
})
go revSub.Run(ctx) // Run blocks until stopped, so it gets its own goroutine
 
// packages/revocation/subscriber.go
type Subscriber struct {
    redis   *redis.Client
    cache   *authcache.CachingValidator
    log     *logger.Logger
    stopCh  chan struct{}
}
 
func (s *Subscriber) Run(ctx context.Context) {
    pubsub := s.redis.Subscribe(ctx, "ibex:token:revocations")
    defer pubsub.Close()

    // Fetch the message channel once instead of on every loop iteration.
    msgs := pubsub.Channel()

    for {
        select {
        case msg, ok := <-msgs:
            if !ok {
                // Channel closed (PubSub shut down); msg would be nil here.
                return
            }
            var event RevocationEvent
            if err := json.Unmarshal([]byte(msg.Payload), &event); err != nil {
                s.log.WarnCtx(ctx, "malformed revocation event", "error", err)
                continue
            }
            s.cache.Invalidate(event.TokenHash)
        case <-s.stopCh:
            return
        case <-ctx.Done():
            return
        }
    }
}

Acceptance Criteria

  • Token revoked via RevokeToken gRPC → pub/sub event published within 100ms
  • All proxy instances subscribed to channel receive the event within 500ms
  • LRU entry invalidated within 1 second of revocation, end to end (inside the 5-second Phase 2 SLA)
  • Redis pub/sub failure does not fail RevokeToken (Postgres write is durable)
  • Subscriber goroutine terminates cleanly on SIGTERM (via packages/shutdown)
