Rate Limit Usage Guide¶

Quick reference guide for developers working with rate limiting in Dashtam.

Target Audience: Developers implementing rate-limited endpoints

Related Documentation:

Architecture: docs/architecture/rate-limit.md (why/what)

Quick Reference¶

Endpoint	Limit	Scope	Burst
`POST /sessions` (login)	5/min	IP	5
`POST /users` (register)	3/min	IP	3
`POST /password-resets`	3/min	IP	3
`GET /accounts`	100/min	User	100
`GET /transactions`	100/min	User	100
`POST /providers/{id}/sync`	10/min	User+Provider	10

1. How Token Bucket Works¶

Algorithm Overview¶

flowchart TD
    A[Request Arrives] --> B{Tokens >= Cost?}
    B -->|Yes| C[Consume Token]
    C --> D[ALLOW Request]
    D --> E[Return remaining tokens]
    B -->|No| F[Calculate retry_after]
    F --> G[DENY Request - 429]
    G --> H[Return Retry-After header]

    subgraph Refill["Background: Token Refill"]
        R1[Every interval] --> R2[Add tokens up to max]
    end

Configuration Example:

max_tokens: 20 (bucket capacity / burst limit)
refill_rate: 5.0 (tokens added per minute)
cost: 1 (tokens consumed per request)

Key Properties¶

Burst capacity: Allows initial burst up to max_tokens
Smooth refill: Tokens added gradually (refill_rate per minute)
Fair: Each request costs 1 token (configurable)
Atomic: Redis Lua script ensures no race conditions

2. Checking Rate Limits in Code¶

Using Rate Limit Dependency¶

from fastapi import APIRouter, Depends
from src.application.dependencies.rate_limit import check_rate_limit
from src.domain.protocols import RateLimitProtocol

router = APIRouter()

@router.post("/sessions")
async def login(
    data: LoginRequest,
    request: Request,
    rate_limit: RateLimitProtocol = Depends(get_rate_limit),
) -> LoginResponse:
    """Login with rate limiting."""
    # Check rate limit
    result = await rate_limit.is_allowed(
        endpoint="POST /api/v1/sessions",
        identifier=request.client.host,  # IP-scoped
    )

    if isinstance(result, Success) and not result.value.allowed:
        raise HTTPException(
            status_code=429,
            detail="Too many login attempts",
            headers={"Retry-After": str(int(result.value.retry_after))},
        )

    # Continue with login logic...

Using Middleware (Automatic)¶

Rate limiting is applied automatically via middleware for configured endpoints:

# src/presentation/routers/api/middleware/rate_limit_middleware.py
class RateLimitMiddleware:
    async def __call__(self, request: Request, call_next):
        endpoint = f"{request.method} {request.url.path}"

        # Get identifier based on scope
        identifier = self._get_identifier(request, endpoint)

        # Check rate limit
        result = await self._rate_limit.is_allowed(
            endpoint=endpoint,
            identifier=identifier,
        )

        if isinstance(result, Success) and not result.value.allowed:
            return JSONResponse(
                status_code=429,
                content={"detail": "Rate limit exceeded"},
                headers={
                    "Retry-After": str(int(result.value.retry_after)),
                    "X-RateLimit-Limit": str(result.value.limit),
                    "X-RateLimit-Remaining": str(result.value.remaining),
                    "X-RateLimit-Reset": str(result.value.reset_seconds),
                },
            )

        # Add rate limit headers to successful responses
        response = await call_next(request)
        if isinstance(result, Success):
            response.headers["X-RateLimit-Limit"] = str(result.value.limit)
            response.headers["X-RateLimit-Remaining"] = str(result.value.remaining)
            response.headers["X-RateLimit-Reset"] = str(result.value.reset_seconds)

        return response

3. Configuring Rate Limit Rules (Two-Tier Pattern)¶

Rate limits use a two-tier configuration pattern (similar to CSS classes):

Tier 1: Policy Assignment (registry.py)¶

Assign a rate limit policy to each endpoint in the Route Metadata Registry:

# src/presentation/routers/api/v1/routes/registry.py
RouteMetadata(
    method=HTTPMethod.POST,
    path="/sessions",
    handler=create_session,
    rate_limit_policy=RateLimitPolicy.AUTH_LOGIN,  # Policy assignment
    ...
)

Tier 2: Policy Implementation (derivations.py)¶

Define what each policy means:

# src/presentation/routers/api/v1/routes/derivations.py
mapping = {
    RateLimitPolicy.AUTH_LOGIN: RateLimitRule(
        max_tokens=5,
        refill_rate=5.0,  # 5 per minute
        scope=RateLimitScope.IP,
        cost=1,
        enabled=True,
    ),
    RateLimitPolicy.AUTH_REGISTER: RateLimitRule(
        max_tokens=3,
        refill_rate=3.0,
        scope=RateLimitScope.IP,
        cost=1,
        enabled=True,
    ),
    RateLimitPolicy.API_READ: RateLimitRule(
        max_tokens=100,
        refill_rate=100.0,
        scope=RateLimitScope.USER,
        cost=1,
        enabled=True,
    ),
    RateLimitPolicy.PROVIDER_SYNC: RateLimitRule(
        max_tokens=10,
        refill_rate=5.0,
        scope=RateLimitScope.USER_PROVIDER,
        cost=1,
        enabled=True,
    ),
}

To Modify Rate Limits¶

Scenario 1: Change rate limit for ONE specific endpoint

# In registry.py: Change policy assignment
RouteMetadata(
    path="/sessions",
    rate_limit_policy=RateLimitPolicy.API_READ,  # More generous
)

Scenario 2: Change rate limit for ALL endpoints using a policy

# In derivations.py: Update policy implementation
RateLimitPolicy.AUTH_LOGIN: RateLimitRule(
    max_tokens=10,  # Increase from 5 to 10
    ...
)

Scope Types¶

Scope	Key Format	Use Case
`IP`	`rate_limit:ip:{address}:{endpoint}`	Unauthenticated (login)
`USER`	`rate_limit:user:{user_id}:{endpoint}`	Authenticated API
`USER_PROVIDER`	`rate_limit:user_provider:{user_id}:{provider}:{endpoint}`	Provider ops
`GLOBAL`	`rate_limit:global:{endpoint}`	System-wide limits

Variable Cost¶

# Expensive operations cost more tokens
"POST /api/v1/reports/generate": RateLimitRule(
    max_tokens=10,
    refill_rate=10.0,
    scope=RateLimitScope.USER,
    cost=5,  # Costs 5 tokens (expensive operation)
    enabled=True,
),

4. Response Headers¶

On Allowed Requests¶

HTTP/1.1 200 OK
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 87
X-RateLimit-Reset: 60

On Rate Limited Requests¶

HTTP/1.1 429 Too Many Requests
Retry-After: 12
X-RateLimit-Limit: 5
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 12
Content-Type: application/json

{
    "type": "https://api.dashtam.local/errors/rate-limit-exceeded",
    "title": "Rate Limit Exceeded",
    "status": 429,
    "detail": "Too many requests. Please try again in 12 seconds.",
    "instance": "/api/v1/sessions",
    "retry_after": 12
}

Header Definitions¶

Header	Description
`Retry-After`	Seconds until retry allowed (RFC 6585)
`X-RateLimit-Limit`	Maximum tokens (bucket capacity)
`X-RateLimit-Remaining`	Tokens remaining
`X-RateLimit-Reset`	Seconds until bucket refills

5. Fail-Open Design¶

Principle¶

Rate limit failures MUST NEVER cause denial-of-service.

Implementation¶

# Layer 1: Middleware
try:
    result = await rate_limit.is_allowed(...)
except Exception:
    logger.error("Rate limit failed - allowing request")
    return await call_next(request)  # ALLOW

# Layer 2: TokenBucketAdapter
try:
    result = await storage.check_and_consume(...)
except Exception:
    logger.error("Storage failed - allowing request")
    return Success(RateLimitResult(allowed=True, ...))  # ALLOW

# Layer 3: RedisStorage
try:
    result = await redis.evalsha(lua_script, ...)
except RedisError:
    logger.error("Redis failed - allowing request")
    return (True, 0.0, max_tokens)  # ALLOW

Monitoring Fail-Opens¶

# Alert if fail_open events > 10/minute
logger.error(
    "Rate limit fail-open",
    extra={
        "endpoint": endpoint,
        "identifier": identifier,
        "error": str(e),
        "layer": "storage",
        "result": "fail_open",
    },
)

6. Domain Events¶

Events Emitted¶

# Before rate limit check
RateLimitCheckAttempted
{
    "endpoint": "POST /api/v1/sessions",
    "identifier": "192.168.1.1",
    "scope": "ip",
    "cost": 1,
}

# On allowed request
RateLimitCheckAllowed
{
    "endpoint": "POST /api/v1/sessions",
    "identifier": "192.168.1.1",
    "scope": "ip",
    "remaining_tokens": 4,
    "execution_time_ms": 2.5,
}

# On denied request
RateLimitCheckDenied
{
    "endpoint": "POST /api/v1/sessions",
    "identifier": "192.168.1.1",
    "scope": "ip",
    "retry_after": 12.5,
    "execution_time_ms": 2.3,
}

7. Redis Lua Script¶

Atomic Token Bucket¶

-- token_bucket.lua
-- KEYS[1]: Redis key for bucket
-- ARGV[1]: max_tokens
-- ARGV[2]: refill_rate (tokens per minute)
-- ARGV[3]: cost
-- ARGV[4]: current_timestamp
-- Returns: [allowed (0/1), retry_after, remaining_tokens]

local tokens_key = KEYS[1] .. ":tokens"
local time_key = KEYS[1] .. ":time"

local max_tokens = tonumber(ARGV[1])
local refill_rate = tonumber(ARGV[2])
local cost = tonumber(ARGV[3])
local now = tonumber(ARGV[4])

-- Get current state
local current_tokens = tonumber(redis.call("GET", tokens_key))
local last_refill = tonumber(redis.call("GET", time_key))

-- Initialize if first request
if not current_tokens then
    current_tokens = max_tokens
    last_refill = now
end

-- Calculate refilled tokens
local elapsed = now - last_refill
local tokens_to_add = elapsed * (refill_rate / 60.0)
current_tokens = math.min(current_tokens + tokens_to_add, max_tokens)

-- TTL for cleanup
local ttl = math.ceil((max_tokens / refill_rate) * 60) + 60

if current_tokens >= cost then
    -- Allowed: consume tokens
    local new_tokens = current_tokens - cost
    redis.call("SETEX", tokens_key, ttl, new_tokens)
    redis.call("SETEX", time_key, ttl, now)
    return {1, 0, math.floor(new_tokens)}
else
    -- Denied: calculate retry_after
    local tokens_needed = cost - current_tokens
    local retry_after = tokens_needed / (refill_rate / 60.0)
    redis.call("SETEX", tokens_key, ttl, current_tokens)
    redis.call("SETEX", time_key, ttl, now)
    return {0, retry_after, math.floor(current_tokens)}
end

8. Testing Rate Limits¶

Unit Testing Rules¶

def test_rate_limit_rule_validation():
    rule = RateLimitRule(
        max_tokens=5,
        refill_rate=5.0,
        scope=RateLimitScope.IP,
        cost=1,
        enabled=True,
    )

    assert rule.max_tokens == 5
    assert rule.refill_rate == 5.0
    assert rule.ttl_seconds == 120  # Calculated from capacity/rate

Integration Testing with Redis¶

async def test_rate_limit_allows_within_limit(redis_storage):
    """Test requests allowed when under limit."""
    rule = RateLimitRule(max_tokens=5, refill_rate=5.0, ...)

    for _ in range(5):
        result = await redis_storage.check_and_consume(
            key_base="test:ip:127.0.0.1",
            rule=rule,
            cost=1,
        )
        assert result.value[0] == True  # allowed

async def test_rate_limit_denies_over_limit(redis_storage):
    """Test 6th request denied after 5."""
    # ... consume 5 tokens ...

    result = await redis_storage.check_and_consume(...)
    assert result.value[0] == False  # denied
    assert result.value[1] > 0  # retry_after

API Testing¶

def test_login_rate_limit(client: TestClient):
    """Test login endpoint rate limiting."""
    # First 5 requests succeed
    for _ in range(5):
        response = client.post("/api/v1/sessions", json={
            "email": "test@example.com",
            "password": "wrong_password",
        })
        assert response.status_code in [401, 201]

    # 6th request rate limited
    response = client.post("/api/v1/sessions", json={
        "email": "test@example.com",
        "password": "wrong_password",
    })
    assert response.status_code == 429
    assert "Retry-After" in response.headers

9. Common Patterns¶

Pattern 1: IP-Scoped for Unauthenticated¶

# Login, registration, password reset
"POST /api/v1/sessions": RateLimitRule(
    max_tokens=5,
    refill_rate=5.0,
    scope=RateLimitScope.IP,  # IP address
    cost=1,
    enabled=True,
),

Pattern 2: User-Scoped for Authenticated¶

# API endpoints requiring auth
"GET /api/v1/accounts": RateLimitRule(
    max_tokens=100,
    refill_rate=100.0,
    scope=RateLimitScope.USER,  # User ID from JWT
    cost=1,
    enabled=True,
),

Pattern 3: Expensive Operations¶

# Report generation costs 5 tokens
"POST /api/v1/reports": RateLimitRule(
    max_tokens=10,
    refill_rate=10.0,
    scope=RateLimitScope.USER,
    cost=5,  # Higher cost
    enabled=True,
),

Pattern 4: Disable Rate Limit (Testing)¶

# Disable for specific endpoint
"GET /api/v1/health": RateLimitRule(
    max_tokens=1000,
    refill_rate=1000.0,
    scope=RateLimitScope.GLOBAL,
    cost=1,
    enabled=False,  # Disabled
),

Pattern 5: Admin Reset Rate Limit¶

@router.post("/admin/rate-limits/reset")
async def reset_rate_limit(
    data: ResetRateLimitRequest,
    current_user: User = Depends(get_current_user),
    _: None = Depends(require_role(UserRole.ADMIN)),
    rate_limit: RateLimitProtocol = Depends(get_rate_limit),
) -> None:
    """Admin: Reset rate limit for user/IP."""
    await rate_limit.reset(
        endpoint=data.endpoint,
        identifier=data.identifier,
    )

10. Adding New Rate Limit Rules¶

With the registry-based pattern, adding rate limits is automatic:

Step 1: Add Route to Registry¶

# src/presentation/routers/api/v1/routes/registry.py
RouteMetadata(
    method=HTTPMethod.POST,
    path="/new-endpoint",
    handler=new_endpoint_handler,
    rate_limit_policy=RateLimitPolicy.API_WRITE,  # Assign existing policy
    ...
)

Step 2: (Optional) Create New Policy¶

If existing policies don't fit:

# 1. Add enum value in metadata.py
class RateLimitPolicy(str, Enum):
    CUSTOM_NEW = "custom_new"

# 2. Add implementation in derivations.py
RateLimitPolicy.CUSTOM_NEW: RateLimitRule(
    max_tokens=20,
    refill_rate=20.0,
    scope=RateLimitScope.USER,
    cost=1,
    enabled=True,
),

Step 3: Test Rule¶

# tests/integration/test_rate_limit_new_endpoint.py
async def test_new_endpoint_rate_limit():
    # Test within limit
    # Test at limit
    # Test over limit
    # Test refill behavior

Step 4: Document¶

Update API documentation with rate limit information.

Troubleshooting¶

429 Too Many Requests unexpectedly¶

Check endpoint matches rule key exactly (method + path)
Check scope - IP vs User vs User+Provider
Check if rate limits were reset recently
Check Redis connectivity

Rate limit not applied¶

Check endpoint is in RATE_LIMIT_RULES
Check enabled=True in rule
Check middleware is registered
Check identifier extraction is correct

Retry-After shows wrong value¶

Check refill_rate configuration
Check server clock synchronization
Check Lua script timestamp handling

Rate limit headers missing¶

Check middleware is adding headers
Check response isn't being replaced
Check headers not stripped by proxy

Created: 2025-12-05 | Last Updated: 2026-01-10