Rate Limit Architecture¶

1. Overview¶

Purpose¶

Provide production-ready, distributed rate limiting for Dashtam API endpoints using Token Bucket algorithm with Redis storage, integrated with hexagonal architecture and existing infrastructure (audit, logging, cache).

Key Requirements¶

Security First:

Protect authentication endpoints from brute force attacks (IP-scoped)
Protect API endpoints from abuse (user-scoped)
Protect provider APIs from excessive calls (user+provider scoped)
Audit all rate limit violations (PCI-DSS compliance)

Availability:

Fail-open design (never block requests if Redis fails)
Sub-5ms latency for rate limit checks
Atomic operations (no race conditions)

Hexagonal Architecture:

Domain layer: RateLimitProtocol (port)
Infrastructure layer: TokenBucketAdapter, RedisRateLimitStorage (adapters)
Application layer: rate_limit dependency
Presentation layer: RateLimitMiddleware

Integration Requirements:

Use existing CacheProtocol infrastructure
Use existing AuditProtocol for violation logging
Use existing LoggerProtocol for structured logging
Emit Domain Events for rate limit violations

2. Token Bucket Algorithm¶

Decision: Token Bucket¶

Why Token Bucket?

Allows bursts (better UX than fixed window)
Smooth traffic (prevents timing attacks)
Industry standard (AWS, Stripe, GitHub use this)
Memory efficient (2 values per key: tokens + last_refill_time)

Why not alternatives?

Algorithm	Pros	Cons	Decision
Token Bucket	Burst capacity, smooth refills	Slightly complex	✅ Selected
Fixed Window	Simple	Boundary burst attacks	❌ Rejected
Sliding Window Log	Precise	High memory (timestamps)	❌ Rejected
Leaky Bucket	Smooth output	No burst capacity	❌ Rejected

How Token Bucket Works¶

┌─────────────────────────────────────────────────────────────┐
│                 Token Bucket Algorithm                      │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│   Configuration:                                            │
│   - max_tokens: 20 (bucket capacity)                        │
│   - refill_rate: 5.0 (tokens per minute)                    │
│                                                             │
│   ┌──────────────┐                                          │
│   │ Token Bucket │ ◄─── Refill: 1 token every 12 seconds    │
│   │              │                                          │
│   │  ████████    │  Current: 8 tokens                       │
│   │  ████████    │                                          │
│   │              │  Capacity: 20 tokens                     │
│   └──────┬───────┘                                          │
│          │                                                  │
│          ▼                                                  │
│   Request arrives (cost=1):                                 │
│   - If tokens >= cost: ALLOW, consume tokens                │
│   - If tokens < cost: DENY, return retry_after              │
│                                                             │
│   Burst Example:                                            │
│   - 20 requests arrive at once                              │
│   - All 20 allowed (burst capacity)                         │
│   - Request 21 denied, retry_after=12s                      │
│                                                             │
└─────────────────────────────────────────────────────────────┘

Token Bucket Formula:

# Calculate current tokens
elapsed_seconds = now - last_refill_time
tokens_to_add = elapsed_seconds * (refill_rate / 60.0)
current_tokens = min(tokens + tokens_to_add, max_tokens)

# Check if allowed
if current_tokens >= cost:
    new_tokens = current_tokens - cost
    return (True, 0.0)  # Allowed
else:
    tokens_needed = cost - current_tokens
    retry_after = tokens_needed / (refill_rate / 60.0)
    return (False, retry_after)  # Denied

3. Redis Storage with Lua Scripts¶

Decision: Atomic Lua Scripts¶

Why Lua Scripts?

Atomicity: Script runs inside Redis (no race conditions)
Performance: Single Redis roundtrip (~2-3ms)
Correctness: Check-and-consume happens atomically
Simplicity: No WATCH/MULTI/EXEC complexity

Why not alternatives?

Approach	Pros	Cons	Decision
Lua Script	Atomic, fast	Learning curve	✅ Selected
WATCH/MULTI	Native Redis	Race conditions	❌ Rejected
Multiple GETs	Simple	Not atomic	❌ Rejected
PostgreSQL	Persistent	Too slow (~10ms)	❌ Rejected

Lua Script Implementation¶

-- token_bucket.lua
-- KEYS[1]: Redis key for bucket (e.g., "rate_limit:ip:192.168.1.1:login")
-- ARGV[1]: max_tokens (bucket capacity)
-- ARGV[2]: refill_rate (tokens per minute)
-- ARGV[3]: cost (tokens to consume)
-- ARGV[4]: current_timestamp (seconds since epoch)
-- Returns: [allowed (0/1), retry_after, remaining_tokens]

local tokens_key = KEYS[1] .. ":tokens"
local time_key = KEYS[1] .. ":time"

local max_tokens = tonumber(ARGV[1])
local refill_rate = tonumber(ARGV[2])
local cost = tonumber(ARGV[3])
local now = tonumber(ARGV[4])

-- Get current state
local current_tokens = tonumber(redis.call("GET", tokens_key))
local last_refill_time = tonumber(redis.call("GET", time_key))

-- Initialize if first request
if not current_tokens or not last_refill_time then
    current_tokens = max_tokens
    last_refill_time = now
end

-- Calculate tokens to refill
local elapsed_seconds = now - last_refill_time
local tokens_to_add = elapsed_seconds * (refill_rate / 60.0)
current_tokens = math.min(current_tokens + tokens_to_add, max_tokens)

-- TTL = time to refill + 60s buffer
local ttl = math.ceil((max_tokens / refill_rate) * 60) + 60

if current_tokens >= cost then
    -- Allowed: consume tokens
    local new_tokens = current_tokens - cost
    redis.call("SETEX", tokens_key, ttl, new_tokens)
    redis.call("SETEX", time_key, ttl, now)
    return {1, 0, math.floor(new_tokens)}
else
    -- Denied: calculate retry_after
    local tokens_needed = cost - current_tokens
    local retry_after = tokens_needed / (refill_rate / 60.0)
    redis.call("SETEX", tokens_key, ttl, current_tokens)
    redis.call("SETEX", time_key, ttl, now)
    return {0, retry_after, math.floor(current_tokens)}
end

Redis Key Structure¶

rate_limit:{scope}:{identifier}:{endpoint}:tokens  → float (current tokens)
rate_limit:{scope}:{identifier}:{endpoint}:time    → float (last refill timestamp)

Examples:
- rate_limit:ip:192.168.1.1:POST /api/v1/auth/login:tokens
- rate_limit:user:abc-123:GET /api/v1/accounts:tokens
- rate_limit:user_provider:abc-123:schwab:sync:tokens

4. Hexagonal Architecture Integration¶

Layer Responsibilities¶

┌─────────────────────────────────────────────────────────────┐
│ Presentation Layer (FastAPI)                                │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │  RateLimitMiddleware                                    │ │
│ │  - Intercepts all HTTP requests                         │ │
│ │  - Extracts endpoint key and identifier                 │ │
│ │  - Returns HTTP 429 with Retry-After header             │ │
│ │  - Adds X-RateLimit-* headers to responses              │ │
│ └───────────────────────────┬─────────────────────────────┘ │
└─────────────────────────────┼───────────────────────────────┘
                              │ uses
                              ↓
┌─────────────────────────────────────────────────────────────┐
│ Application Layer                                           │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │  rate_limit Dependency                                  │ │
│ │  - FastAPI Depends() for endpoint-level rate limits     │ │
│ │  - Publishes RateLimitViolation domain events           │ │
│ └───────────────────────────┬─────────────────────────────┘ │
└─────────────────────────────┼───────────────────────────────┘
                              │ uses
                              ↓
┌─────────────────────────────────────────────────────────────┐
│ Domain Layer (Protocols)                                    │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │  RateLimitProtocol (PORT)                               │ │
│ │  - is_allowed(endpoint, identifier, cost) -> Result     │ │
│ │  - get_remaining(endpoint, identifier) -> Result        │ │
│ │  - reset(endpoint, identifier) -> Result                │ │
│ └───────────────────────────┬─────────────────────────────┘ │
│                             │                               │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │  RateLimitRule (Value Object)                           │ │
│ │  - max_tokens, refill_rate, scope, cost, enabled        │ │
│ └─────────────────────────────────────────────────────────┘ │
│                             │                               │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │  RateLimitScope (Enum)                                  │ │
│ │  - IP, USER, USER_PROVIDER, GLOBAL                      │ │
│ └─────────────────────────────────────────────────────────┘ │
└─────────────────────────────┼───────────────────────────────┘
                              ↑ implements
                              │
┌─────────────────────────────────────────────────────────────┐
│ Infrastructure Layer                                        │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │  TokenBucketAdapter (ADAPTER)                           │ │
│ │  - Implements RateLimitProtocol                         │ │
│ │  - Coordinates storage, audit, logging                  │ │
│ │  - Fail-open error handling                             │ │
│ └───────────────────────────┬─────────────────────────────┘ │
│                             │ uses                          │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │  RedisStorage                                           │ │
│ │  - Executes Lua scripts for atomic operations           │ │
│ │  - Uses existing Redis connection (via CacheProtocol)   │ │
│ │  - TTL-based automatic cleanup                          │ │
│ └───────────────────────────┬─────────────────────────────┘ │
│                             │ uses                          │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │  RateLimitConfig (SSOT)                                 │ │
│ │  - Per-endpoint rate limit rules                        │ │
│ │  - Scope configuration                                  │ │
│ └─────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────┘

Integration with Existing Infrastructure¶

┌─────────────────────────────────────────────────────────────┐
│                 Rate Limit Integration                      │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│   TokenBucketAdapter                                        │
│         │                                                   │
│         ├──► LoggerProtocol                                 │
│         │    - Structured logging of all decisions          │
│         │    - Execution time tracking                      │
│         │                                                   │
│         ├──► AuditProtocol                                  │
│         │    - RATE_LIMIT_CHECK_ATTEMPTED                   │
│         │    - RATE_LIMIT_CHECK_ALLOWED                     │
│         │    - RATE_LIMIT_CHECK_DENIED                      │
│         │                                                   │
│         ├──► EventBusProtocol                               │
│         │    - RateLimitCheckAttempted                      │
│         │    - RateLimitCheckSucceeded/Failed               │
│         │                                                   │
│         └──► Redis                                          │
│              - Direct redis-py client for Lua scripts       │
│              - Separate from CacheProtocol (different ops)  │
│                                                             │
└─────────────────────────────────────────────────────────────┘

5. Rate Limit Rules Configuration¶

Single Source of Truth (SSOT)¶

# src/infrastructure/rate_limit/config.py

RATE_LIMIT_RULES: dict[str, RateLimitRule] = {
    # Authentication endpoints (IP-scoped, restrictive)
    "POST /api/v1/auth/login": RateLimitRule(
        max_tokens=5,
        refill_rate=5.0,  # 5 tokens/min = 1 per 12 seconds
        scope=RateLimitScope.IP,
        cost=1,
        enabled=True,
    ),
    "POST /api/v1/auth/register": RateLimitRule(
        max_tokens=3,
        refill_rate=3.0,  # 3 tokens/min
        scope=RateLimitScope.IP,
        cost=1,
        enabled=True,
    ),
    "POST /api/v1/auth/password-reset": RateLimitRule(
        max_tokens=3,
        refill_rate=3.0,
        scope=RateLimitScope.IP,
        cost=1,
        enabled=True,
    ),

    # User API endpoints (user-scoped, generous)
    "GET /api/v1/accounts": RateLimitRule(
        max_tokens=100,
        refill_rate=100.0,  # 100 tokens/min
        scope=RateLimitScope.USER,
        cost=1,
        enabled=True,
    ),
    "GET /api/v1/transactions": RateLimitRule(
        max_tokens=100,
        refill_rate=100.0,
        scope=RateLimitScope.USER,
        cost=1,
        enabled=True,
    ),

    # Provider sync (user+provider scoped, expensive)
    "POST /api/v1/providers/{provider_id}/sync": RateLimitRule(
        max_tokens=10,
        refill_rate=10.0,  # 10 syncs/min per user per provider
        scope=RateLimitScope.USER_PROVIDER,
        cost=1,
        enabled=True,
    ),
}

Scope Types¶

Scope	Key Format	Use Case
`IP`	`ip:{address}:{endpoint}`	Unauthenticated endpoints (login)
`USER`	`user:{user_id}:{endpoint}`	Authenticated API endpoints
`USER_PROVIDER`	`user_provider:{user_id}:{provider}:{endpoint}`	Provider-specific operations
`GLOBAL`	`global:{endpoint}`	System-wide limits (rare)

Variable Cost¶

# Expensive operations cost more tokens
"POST /api/v1/reports/generate": RateLimitRule(
    max_tokens=10,
    refill_rate=10.0,
    scope=RateLimitScope.USER,
    cost=5,  # Costs 5 tokens (expensive operation)
    enabled=True,
)

Registry Pattern (F8.3)¶

Pattern: RATE_LIMIT_RULES follows the Registry Pattern - same architectural approach as Domain Events Registry (F7.7) and Provider Registry (F8.1).

Purpose: Single source of truth for all rate limit rules with self-enforcing validation to prevent configuration drift.

Why Registry Pattern?¶

Before F8.3 (Configuration Only):

✅ Rate limit rules centralized in RATE_LIMIT_RULES dict
✅ Single source of truth for endpoint mappings
❌ No validation that rules have complete configuration
❌ No tests to detect malformed patterns
❌ Drift risk: Could add rule with invalid scope, negative tokens, etc.

After F8.3 (Registry Pattern + Compliance Tests):

✅ Self-enforcing validation: Tests fail if rules incomplete
✅ Zero drift: Can't merge PR with malformed rules
✅ Comprehensive checks: 23 tests across 5 test classes
✅ 100% coverage: Rate limit config module fully validated

Self-Enforcing Compliance Tests¶

File: tests/unit/test_rate_limit_registry_compliance.py (347 lines, 23 tests)

Test Coverage:

Registry Completeness (8 tests)
All rules have positive max_tokens, refill_rate, cost
All rules have valid RateLimitScope enum
All rules have boolean enabled flag
Endpoint patterns follow METHOD /path format
No duplicate endpoints
All values are RateLimitRule instances
Rule Consistency (4 tests)
Auth endpoints use IP or USER scope (security)
API endpoints use USER scope (standard pattern)
Burst capacity (max_tokens) ≥ refill_rate (usability)
Most rules use cost=1 (standard request cost)
Pattern Matching (4 tests)
Exact match returns correct rule
Path parameter matching works ({account_id} → UUID)
Non-existent endpoints return None
Method mismatch returns None
Registry Statistics (4 tests)
Minimum 20 endpoint rules (snapshot)
Scope distribution matches patterns (USER > IP)
All rules enabled except GLOBAL emergency brake
Critical endpoints have explicit rules
Future-Proofing (3 tests)
No wildcard patterns (*, ?)
Paths use lowercase (consistency)
No trailing slashes (consistency)

Example Test (prevents drift):

def test_all_rules_have_positive_max_tokens():
    """Every rule must have positive max_tokens (bucket capacity)."""
    for endpoint, rule in RATE_LIMIT_RULES.items():
        assert rule.max_tokens > 0, (
            f"Rule for '{endpoint}' has invalid max_tokens: {rule.max_tokens}. "
            "Must be positive integer."
        )

Result: If someone adds a rule with max_tokens=0 or max_tokens=-5, tests fail → can't merge → zero drift.

Registry Pattern Consistency¶

Rate Limit Rules Registry follows the same pattern as:

F7.7: Domain Events Registry:

Metadata-driven catalog (EventMetadata → rate limit rules in dict)
Self-enforcing tests (prevent incomplete wiring → prevent invalid config)
Helper function (get_rule_for_endpoint() → lookup by endpoint pattern)
Statistics function (event counts → endpoint counts)

F8.1: Provider Registry:

Single source of truth (PROVIDER_REGISTRY → RATE_LIMIT_RULES)
Enum for categories (ProviderCategory → RateLimitScope)
Self-enforcing tests (19 tests → 23 tests)
100% coverage target (provider registry → rate limit config)

Pattern Benefits:

Zero Drift: Tests fail if rules incomplete or malformed
Self-Documenting: Registry is the documentation (25 endpoints mapped)
Onboarding: New developers see complete rule catalog in one place
Refactoring Safety: Change scope enum → tests fail if missed
Audit Trail: Git history shows all rule changes

Reference:

docs/architecture/registry.md - Registry Pattern theory
docs/architecture/domain-events.md - F7.7 Domain Events Registry (Section 5.1)
docs/architecture/provider-registry.md - F8.1 Provider Registry
tests/unit/test_rate_limit_registry_compliance.py - Self-enforcing tests

6. Fail-Open Strategy¶

Design Philosophy¶

Rate limit failures MUST NEVER cause denial-of-service. The system fails open at every layer.

┌─────────────────────────────────────────────────────────────┐
│                 Fail-Open Architecture                      │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│   Layer 1: Middleware                                       │
│   ┌─────────────────────────────────────────────────────┐   │
│   │ try:                                                │   │
│   │     result = rate_limit.is_allowed(...)             │   │
│   │ except Exception:                                   │   │
│   │     logger.error("Rate limit failed")               │   │
│   │     return await call_next(request)  # ALLOW        │   │
│   └─────────────────────────────────────────────────────┘   │
│                                                             │
│   Layer 2: TokenBucketAdapter                               │
│   ┌─────────────────────────────────────────────────────┐   │
│   │ try:                                                │   │
│   │     result = storage.check_and_consume(...)         │   │
│   │ except Exception:                                   │   │
│   │     logger.error("Algorithm failed")                │   │
│   │     return Success(RateLimitResult(allowed=True))   │   │
│   └─────────────────────────────────────────────────────┘   │
│                                                             │
│   Layer 3: RedisRateLimitStorage                            │
│   ┌─────────────────────────────────────────────────────┐   │
│   │ try:                                                │   │
│   │     result = redis.evalsha(lua_script, ...)         │   │
│   │ except RedisError:                                  │   │
│   │     logger.error("Redis failed")                    │   │
│   │     return (True, 0.0, max_tokens)  # ALLOW         │   │
│   └─────────────────────────────────────────────────────┘   │
│                                                             │
│   Layer 4: Audit Logging                                    │
│   ┌─────────────────────────────────────────────────────┐   │
│   │ try:                                                │   │
│   │     audit.record(RATE_LIMIT_CHECK_DENIED, ...)      │   │
│   │ except Exception:                                   │   │
│   │     logger.error("Audit failed")                    │   │
│   │     # Continue without audit (don't block)          │   │
│   └─────────────────────────────────────────────────────┘   │
│                                                             │
└─────────────────────────────────────────────────────────────┘

Fail-Open Logging¶

All fail-open events are logged for monitoring:

logger.error(
    "Rate limit fail-open",
    extra={
        "endpoint": endpoint,
        "identifier": identifier,
        "error": str(e),
        "layer": "storage",  # middleware, adapter, storage, audit
        "result": "fail_open",
    },
)

Alert Threshold: If fail_open events > 10/minute, trigger ops alert (rate limit degraded).

7. HTTP Response Headers¶

Rate Limit Headers (RFC 6585)¶

On Allowed Requests:

HTTP/1.1 200 OK
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 87
X-RateLimit-Reset: 60

On Rate Limited Requests:

HTTP/1.1 429 Too Many Requests
Retry-After: 12
X-RateLimit-Limit: 5
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 12
Content-Type: application/json

{
    "type": "{api_base_url}/errors/rate-limit-exceeded",
    "title": "Rate Limit Exceeded",
    "status": 429,
    "detail": "Too many requests. Please try again in 12 seconds.",
    "instance": "/api/v1/auth/login",
    "retry_after": 12
}

Note: {api_base_url} is dynamically set from settings.api_base_url (e.g., https://api.dashtam.local in dev, https://api.dashtam.com in prod).

Header Definitions¶

Header	Description
`Retry-After`	Seconds until retry allowed (RFC 6585)
`X-RateLimit-Limit`	Maximum requests in window
`X-RateLimit-Remaining`	Requests remaining
`X-RateLimit-Reset`	Seconds until bucket fully refills

8. Domain Events¶

Event Definitions¶

# src/domain/events/rate_limit_events.py

@dataclass(frozen=True, kw_only=True)
class RateLimitCheckAttempted(DomainEvent):
    """Emitted before rate limit check."""
    endpoint: str
    identifier: str
    scope: str
    cost: int

@dataclass(frozen=True, kw_only=True)
class RateLimitCheckAllowed(DomainEvent):
    """Emitted when request is allowed."""
    endpoint: str
    identifier: str
    scope: str
    remaining_tokens: int
    execution_time_ms: float

@dataclass(frozen=True, kw_only=True)
class RateLimitCheckDenied(DomainEvent):
    """Emitted when request is rate limited."""
    endpoint: str
    identifier: str
    scope: str
    retry_after: float
    execution_time_ms: float

Event Handlers¶

RateLimitCheckDenied
    ├──► LoggingEventHandler    → Structured log (WARNING)
    ├──► AuditEventHandler      → Audit record (compliance)
    └──► AlertEventHandler      → (future) Slack/PagerDuty alerts

9. Audit Trail Integration¶

Audit Actions¶

# Add to src/domain/enums/audit_action.py

# Rate Limiter audit actions (ATTEMPT → OUTCOME pattern)
RATE_LIMIT_CHECK_ATTEMPTED = "rate_limit_check_attempted"
RATE_LIMIT_CHECK_ALLOWED = "rate_limit_check_allowed"
RATE_LIMIT_CHECK_DENIED = "rate_limit_check_denied"

Audit Record Example¶

{
    "action": "RATE_LIMIT_CHECK_DENIED",
    "resource_type": "rate_limit",
    "resource_id": "POST /api/v1/auth/login",
    "user_id": null,
    "ip_address": "203.0.113.42",
    "metadata": {
        "scope": "ip",
        "identifier": "ip:203.0.113.42",
        "retry_after": 12.5,
        "limit": 5,
        "remaining": 0
    }
}

10. Request Flow¶

Complete Flow Diagram¶

sequenceDiagram
    participant Client
    participant Middleware as RateLimitMiddleware
    participant Adapter as TokenBucketAdapter
    participant Storage as RedisRateLimitStorage
    participant Redis
    participant Audit as AuditProtocol
    participant Endpoint as API Endpoint

    Client->>Middleware: HTTP Request

    Middleware->>Middleware: Extract endpoint key
    Middleware->>Middleware: Extract identifier (JWT/IP)

    Middleware->>Adapter: is_allowed(endpoint, identifier)
    Adapter->>Storage: check_and_consume(key, rule)
    Storage->>Redis: EVALSHA token_bucket.lua
    Redis-->>Storage: [allowed, retry_after, remaining]
    Storage-->>Adapter: (allowed, retry_after, remaining)

    alt Rate Limited
        Adapter->>Audit: record(RATE_LIMIT_CHECK_DENIED)
        Adapter-->>Middleware: Result(allowed=False)
        Middleware-->>Client: HTTP 429 + Retry-After
    else Allowed
        Adapter->>Audit: record(RATE_LIMIT_CHECK_ALLOWED)
        Adapter-->>Middleware: Result(allowed=True)
        Middleware->>Endpoint: call_next(request)
        Endpoint-->>Middleware: Response
        Middleware-->>Client: HTTP 200 + X-RateLimit-* headers
    end

Latency Breakdown¶

Stage	Operation	Latency (p95)
1	Middleware entry	<0.1ms
2	Endpoint key extraction	<0.5ms
3	Identifier extraction (JWT)	1-2ms
4	Redis Lua execution	2-3ms
5	Audit logging (async)	Non-blocking
Total	Allowed path	~5ms

11. Security Considerations¶

Threats Addressed¶

Threat	Mitigation
Brute force login	IP-scoped limits (5/min)
API abuse	User-scoped limits (100/min)
Provider API exhaustion	User+provider scoped limits
DDoS amplification	Per-endpoint limits

Security Best Practices¶

IP extraction: Handle X-Forwarded-For properly (take first IP)
Identifier sanitization: Validate UUIDs, sanitize IPs
No sensitive data in logs: Log identifier hashes if needed
Audit trail: All violations logged for forensics

12. Testing Strategy¶

Unit Tests¶

test_infrastructure_token_bucket.py - Algorithm logic (20+ tests)
test_domain_rate_limit_rule.py - Value object validation (10+ tests)
test_domain_rate_limit_scope.py - Key building (15+ tests)

Integration Tests¶

test_infrastructure_redis_rate_limit_storage.py - Lua script execution (20+ tests)
test_infrastructure_token_bucket_adapter.py - Full adapter flow (15+ tests)

API Tests¶

test_rate_limit_middleware.py - HTTP 429 responses (15+ tests)
test_rate_limit_headers.py - Header correctness (10+ tests)

Coverage Target¶

85%+ overall coverage
95%+ for Lua script logic (critical path)

13. File Structure¶

src/domain/
├── protocols/
│   └── rate_limit_protocol.py      # RateLimitProtocol
├── value_objects/
│   └── rate_limit_rule.py          # RateLimitRule value object
├── enums/
│   └── rate_limit_scope.py         # RateLimitScope enum (single source)
├── errors/
│   └── rate_limit_error.py         # RateLimitError domain error
└── events/
    └── rate_limit_events.py        # Domain events

src/infrastructure/
└── rate_limit/
    ├── __init__.py
    ├── token_bucket_adapter.py     # TokenBucketAdapter (implements RateLimitProtocol)
    ├── redis_storage.py            # RedisStorage (atomic Lua operations)
    ├── config.py                   # RATE_LIMIT_RULES (SSOT)
    └── lua_scripts/
        └── token_bucket.lua        # Atomic Lua script

src/presentation/
└── api/
    └── middleware/
        └── rate_limit_middleware.py # RateLimitMiddleware (Starlette)

src/application/
└── dependencies/
    └── rate_limit.py               # rate_limit FastAPI dependency

tests/
├── unit/
│   ├── test_domain_rate_limit_rule.py
│   ├── test_domain_rate_limit_scope.py
│   └── test_infrastructure_token_bucket.py
├── integration/
│   ├── test_infrastructure_redis_rate_limit_storage.py
│   └── test_infrastructure_token_bucket_adapter.py
└── api/
    ├── test_rate_limit_middleware.py
    └── test_rate_limit_headers.py

14. Future Enhancements¶

Dynamic rule updates: Hot-reload rules without restart
Sliding window: Additional algorithm option
Circuit breaker: Automatic backoff for repeat offenders
Dashboard: Admin UI for viewing rate limit status
Distributed Redis: Redis Cluster support for higher scale

Created: 2025-11-28 | Last Updated: 2026-01-10