Rate Limit Usage Guide
Quick reference guide for developers working with rate limiting in Dashtam.
Target Audience: Developers implementing rate-limited endpoints
Related Documentation:
- Architecture: docs/architecture/rate-limit.md (why/what)
Quick Reference
| Endpoint | Limit | Scope | Burst |
|---|---|---|---|
| POST /sessions (login) | 5/min | IP | 5 |
| POST /users (register) | 3/min | IP | 3 |
| POST /password-resets | 3/min | IP | 3 |
| GET /accounts | 100/min | User | 100 |
| GET /transactions | 100/min | User | 100 |
| POST /providers/{id}/sync | 10/min | User+Provider | 10 |
1. How Token Bucket Works
Algorithm Overview
flowchart TD
A[Request Arrives] --> B{Tokens >= Cost?}
B -->|Yes| C[Consume Token]
C --> D[ALLOW Request]
D --> E[Return remaining tokens]
B -->|No| F[Calculate retry_after]
F --> G[DENY Request - 429]
G --> H[Return Retry-After header]
subgraph Refill["Background: Token Refill"]
R1[Every interval] --> R2[Add tokens up to max]
end
Configuration Example:
- max_tokens: 20 (bucket capacity / burst limit)
- refill_rate: 5.0 (tokens added per minute)
- cost: 1 (tokens consumed per request)
Key Properties
- Burst capacity: Allows an initial burst of up to max_tokens
- Smooth refill: Tokens are added gradually (refill_rate per minute)
- Fair: Each request costs 1 token (configurable)
- Atomic: A Redis Lua script ensures no race conditions
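To make the properties above concrete, here is a minimal in-memory sketch of the token bucket arithmetic. It is not Dashtam's Redis-backed implementation: the `TokenBucket` class and its `try_consume` method are hypothetical names, and the current time is passed in explicitly so the behavior is deterministic. It uses the configuration example values (`max_tokens=20`, `refill_rate=5.0`, `cost=1`).

```python
from dataclasses import dataclass


@dataclass
class TokenBucket:
    """In-memory token bucket; illustrative only, not the Redis-backed adapter."""

    max_tokens: float          # bucket capacity / burst limit
    refill_rate: float         # tokens added per minute
    tokens: float = 0.0
    last_refill: float = 0.0   # timestamp (seconds) of the last update

    def __post_init__(self) -> None:
        # A new bucket starts full, which is what permits the initial burst.
        self.tokens = self.max_tokens

    def try_consume(self, now: float, cost: float = 1.0) -> tuple[bool, float]:
        """Return (allowed, retry_after_seconds) for a request arriving at `now`."""
        elapsed = now - self.last_refill
        # Refill gradually, never exceeding the bucket capacity.
        self.tokens = min(self.max_tokens, self.tokens + elapsed * self.refill_rate / 60.0)
        self.last_refill = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True, 0.0
        # Not enough tokens: report how long until `cost` tokens are available.
        return False, (cost - self.tokens) / (self.refill_rate / 60.0)


bucket = TokenBucket(max_tokens=20, refill_rate=5.0)
burst = [bucket.try_consume(now=0.0)[0] for _ in range(21)]
print(burst.count(True))             # 20 requests pass; the 21st is denied
print(bucket.try_consume(now=12.0))  # one token has refilled after 12 seconds
```

With a refill rate of 5 tokens per minute, one token returns every 12 seconds, so a client that has exhausted the burst is throttled to that pace.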
2. Checking Rate Limits in Code
Using Rate Limit Dependency
from fastapi import APIRouter, Depends, HTTPException, Request

from src.application.dependencies.rate_limit import get_rate_limit  # provider used in Depends() below
from src.domain.protocols import RateLimitProtocol
# LoginRequest, LoginResponse, and Success come from project modules (imports omitted here).

router = APIRouter()


@router.post("/sessions")
async def login(
    data: LoginRequest,
    request: Request,
    rate_limit: RateLimitProtocol = Depends(get_rate_limit),
) -> LoginResponse:
    """Login with rate limiting."""
    # Check rate limit
    result = await rate_limit.is_allowed(
        endpoint="POST /api/v1/sessions",
        identifier=request.client.host,  # IP-scoped
    )
    # Deny only on an explicit "not allowed"; a failed check falls through (fail-open)
    if isinstance(result, Success) and not result.value.allowed:
        raise HTTPException(
            status_code=429,
            detail="Too many login attempts",
            headers={"Retry-After": str(int(result.value.retry_after))},
        )
    # Continue with login logic...
Using Middleware (Automatic)
Rate limiting is applied automatically via middleware for configured endpoints:
# src/presentation/routers/api/middleware/rate_limit_middleware.py
class RateLimitMiddleware:
    async def __call__(self, request: Request, call_next):
        endpoint = f"{request.method} {request.url.path}"
        # Get identifier based on scope
        identifier = self._get_identifier(request, endpoint)
        # Check rate limit
        result = await self._rate_limit.is_allowed(
            endpoint=endpoint,
            identifier=identifier,
        )
        if isinstance(result, Success) and not result.value.allowed:
            return JSONResponse(
                status_code=429,
                content={"detail": "Rate limit exceeded"},
                headers={
                    "Retry-After": str(int(result.value.retry_after)),
                    "X-RateLimit-Limit": str(result.value.limit),
                    "X-RateLimit-Remaining": str(result.value.remaining),
                    "X-RateLimit-Reset": str(result.value.reset_seconds),
                },
            )
        # Add rate limit headers to successful responses
        response = await call_next(request)
        if isinstance(result, Success):
            response.headers["X-RateLimit-Limit"] = str(result.value.limit)
            response.headers["X-RateLimit-Remaining"] = str(result.value.remaining)
            response.headers["X-RateLimit-Reset"] = str(result.value.reset_seconds)
        return response
3. Configuring Rate Limit Rules (Two-Tier Pattern)
Rate limits use a two-tier configuration pattern (similar to CSS classes):
Tier 1: Policy Assignment (registry.py)
Assign a rate limit policy to each endpoint in the Route Metadata Registry:
# src/presentation/routers/api/v1/routes/registry.py
RouteMetadata(
    method=HTTPMethod.POST,
    path="/sessions",
    handler=create_session,
    rate_limit_policy=RateLimitPolicy.AUTH_LOGIN,  # Policy assignment
    ...
)
Tier 2: Policy Implementation (derivations.py)
Define what each policy means:
# src/presentation/routers/api/v1/routes/derivations.py
mapping = {
    RateLimitPolicy.AUTH_LOGIN: RateLimitRule(
        max_tokens=5,
        refill_rate=5.0,  # 5 per minute
        scope=RateLimitScope.IP,
        cost=1,
        enabled=True,
    ),
    RateLimitPolicy.AUTH_REGISTER: RateLimitRule(
        max_tokens=3,
        refill_rate=3.0,
        scope=RateLimitScope.IP,
        cost=1,
        enabled=True,
    ),
    RateLimitPolicy.API_READ: RateLimitRule(
        max_tokens=100,
        refill_rate=100.0,
        scope=RateLimitScope.USER,
        cost=1,
        enabled=True,
    ),
    RateLimitPolicy.PROVIDER_SYNC: RateLimitRule(
        max_tokens=10,
        refill_rate=5.0,
        scope=RateLimitScope.USER_PROVIDER,
        cost=1,
        enabled=True,
    ),
}
To Modify Rate Limits
Scenario 1: Change rate limit for ONE specific endpoint
# In registry.py: Change policy assignment
RouteMetadata(
    path="/sessions",
    rate_limit_policy=RateLimitPolicy.API_READ,  # More generous
)
Scenario 2: Change rate limit for ALL endpoints using a policy
# In derivations.py: Update policy implementation
RateLimitPolicy.AUTH_LOGIN: RateLimitRule(
    max_tokens=10,  # Increase from 5 to 10
    ...
)
Scope Types
| Scope | Key Format | Use Case |
|---|---|---|
| IP | rate_limit:ip:{address}:{endpoint} | Unauthenticated (login) |
| USER | rate_limit:user:{user_id}:{endpoint} | Authenticated API |
| USER_PROVIDER | rate_limit:user_provider:{user_id}:{provider}:{endpoint} | Provider ops |
| GLOBAL | rate_limit:global:{endpoint} | System-wide limits |
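The key formats listed for each scope can be sketched as a small builder function. This is an illustration under assumptions, not the project's code: the `Scope` enum stands in for `RateLimitScope`, and `build_key` with its keyword parameters is a hypothetical helper.

```python
from enum import Enum
from typing import Optional


class Scope(str, Enum):
    """Hypothetical stand-in for the project's RateLimitScope."""

    IP = "ip"
    USER = "user"
    USER_PROVIDER = "user_provider"
    GLOBAL = "global"


def build_key(
    scope: Scope,
    endpoint: str,
    *,
    address: Optional[str] = None,
    user_id: Optional[str] = None,
    provider: Optional[str] = None,
) -> str:
    """Build a Redis key that follows the per-scope formats in the table."""
    if scope is Scope.IP:
        parts = [address]
    elif scope is Scope.USER:
        parts = [user_id]
    elif scope is Scope.USER_PROVIDER:
        parts = [user_id, provider]
    else:  # GLOBAL keys carry no identifier
        parts = []
    if any(part is None for part in parts):
        raise ValueError(f"missing identifier for scope {scope.value}")
    return ":".join(["rate_limit", scope.value, *map(str, parts), endpoint])


print(build_key(Scope.IP, "POST /api/v1/sessions", address="192.168.1.1"))
print(build_key(Scope.GLOBAL, "GET /api/v1/health"))
```

Because the endpoint is part of every key, each endpoint gets its own bucket per identifier; exhausting the login bucket does not affect registration.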
Variable Cost
# Expensive operations cost more tokens
"POST /api/v1/reports/generate": RateLimitRule(
    max_tokens=10,
    refill_rate=10.0,
    scope=RateLimitScope.USER,
    cost=5,  # Costs 5 tokens (expensive operation)
    enabled=True,
),
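A higher cost shrinks both the burst and the sustained request rate. The short calculation below shows the effect for the rule above; `effective_limits` is a hypothetical helper introduced only for this illustration.

```python
def effective_limits(max_tokens: int, refill_rate: float, cost: int) -> tuple[int, float]:
    """Return (burst_requests, sustained_requests_per_minute) for a given cost."""
    burst_requests = max_tokens // cost       # whole requests a full bucket can pay for
    sustained_per_minute = refill_rate / cost  # requests funded by one minute of refill
    return burst_requests, sustained_per_minute


print(effective_limits(max_tokens=10, refill_rate=10.0, cost=5))  # (2, 2.0)
print(effective_limits(max_tokens=10, refill_rate=10.0, cost=1))  # (10, 10.0)
```

So with `cost=5`, a user can generate two reports back to back and then roughly one every 30 seconds.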
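4. Response Headers
On Allowed Requests
Allowed responses carry X-RateLimit-Limit, X-RateLimit-Remaining, and X-RateLimit-Reset, which the middleware adds whenever the rate limit check itself succeeded.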
On Rate Limited Requests
HTTP/1.1 429 Too Many Requests
Retry-After: 12
X-RateLimit-Limit: 5
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 12
Content-Type: application/json

{
    "type": "https://api.dashtam.local/errors/rate-limit-exceeded",
    "title": "Rate Limit Exceeded",
    "status": 429,
    "detail": "Too many requests. Please try again in 12 seconds.",
    "instance": "/api/v1/sessions",
    "retry_after": 12
}
Header Definitions
| Header | Description |
|---|---|
| Retry-After | Seconds until a retry is allowed (header defined in RFC 9110; the 429 status code in RFC 6585) |
| X-RateLimit-Limit | Maximum tokens (bucket capacity) |
| X-RateLimit-Remaining | Tokens remaining |
| X-RateLimit-Reset | Seconds until bucket refills |
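On the client side, these headers are enough to decide how long to back off. The following is a minimal sketch, assuming the delta-seconds form of Retry-After shown in this guide (the header may also carry an HTTP date, which this sketch does not parse). The `retry_delay` function and its `default` parameter are hypothetical.

```python
from typing import Mapping, Optional


def retry_delay(
    status_code: int,
    headers: Mapping[str, str],
    default: float = 1.0,
) -> Optional[float]:
    """Return seconds to wait before retrying, or None if the response was not a 429."""
    if status_code != 429:
        return None
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    # Prefer Retry-After; fall back to X-RateLimit-Reset if it is absent.
    for name in ("retry-after", "x-ratelimit-reset"):
        value = lowered.get(name, "").strip()
        if value.isdecimal():
            return float(value)
    return default


print(retry_delay(429, {"Retry-After": "12", "X-RateLimit-Remaining": "0"}))  # 12.0
print(retry_delay(200, {"X-RateLimit-Remaining": "4"}))                       # None
```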
5. Fail-Open Design
Principle
Rate limit failures MUST NEVER cause denial-of-service.
Implementation
# Layer 1: Middleware
try:
    result = await rate_limit.is_allowed(...)
except Exception:
    logger.error("Rate limit failed - allowing request")
    return await call_next(request)  # ALLOW

# Layer 2: TokenBucketAdapter
try:
    result = await storage.check_and_consume(...)
except Exception:
    logger.error("Storage failed - allowing request")
    return Success(RateLimitResult(allowed=True, ...))  # ALLOW

# Layer 3: RedisStorage
try:
    result = await redis.evalsha(lua_script, ...)
except RedisError:
    logger.error("Redis failed - allowing request")
    return (True, 0.0, max_tokens)  # ALLOW
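The layered pattern above can be exercised without Redis. The sketch below is self-contained and makes several assumptions: `Decision` is a hypothetical stand-in for `Success(RateLimitResult(...))`, `check_fail_open` is an illustrative wrapper, and the timeout is an addition not present in the excerpts above (a hung backend is treated like a failed one).

```python
import asyncio
import logging
from dataclasses import dataclass

logger = logging.getLogger("rate_limit")


@dataclass(frozen=True)
class Decision:
    """Hypothetical result type standing in for Success(RateLimitResult(...))."""

    allowed: bool
    retry_after: float = 0.0
    fail_open: bool = False


async def check_fail_open(check, *, timeout: float = 0.5) -> Decision:
    """Await `check()`; on any error or timeout, allow the request."""
    try:
        return await asyncio.wait_for(check(), timeout=timeout)
    except Exception as exc:  # deliberately broad: any limiter failure must allow
        logger.error("Rate limit failed - allowing request", extra={"error": repr(exc)})
        return Decision(allowed=True, fail_open=True)


async def broken_storage() -> Decision:
    # Simulates an unreachable Redis instance.
    raise ConnectionError("redis unavailable")


async def healthy_storage() -> Decision:
    # Simulates a working limiter that denies the request.
    return Decision(allowed=False, retry_after=12.0)


print(asyncio.run(check_fail_open(broken_storage)))   # failure is converted to "allowed"
print(asyncio.run(check_fail_open(healthy_storage)))  # the limiter's verdict is preserved
```

Marking the decision with `fail_open=True` keeps an infrastructure failure distinguishable from a genuine "allowed" verdict, which is useful for the monitoring described next.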
Monitoring Fail-Opens
# Alert if fail_open events > 10/minute
logger.error(
    "Rate limit fail-open",
    extra={
        "endpoint": endpoint,
        "identifier": identifier,
        "error": str(e),
        "layer": "storage",
        "result": "fail_open",
    },
)
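The "more than 10 per minute" alert rule can be expressed as a sliding-window counter. In practice this would usually live in the log aggregation or metrics system; the in-process `FailOpenMonitor` class below is a hypothetical illustration of the threshold logic only.

```python
from collections import deque


class FailOpenMonitor:
    """Counts fail-open events in a sliding time window (hypothetical helper)."""

    def __init__(self, threshold: int = 10, window_seconds: float = 60.0) -> None:
        self.threshold = threshold
        self.window_seconds = window_seconds
        self._events = deque()  # timestamps of recent fail-open events

    def record(self, now: float) -> bool:
        """Record one event at time `now`; return True when the alert should fire."""
        self._events.append(now)
        # Drop events that have aged out of the window.
        while self._events and self._events[0] <= now - self.window_seconds:
            self._events.popleft()
        return len(self._events) > self.threshold


monitor = FailOpenMonitor()
fired = [monitor.record(now=float(second)) for second in range(11)]
print(fired[-1])                  # True: the 11th event within a minute crosses the threshold
print(monitor.record(now=200.0))  # False: earlier events have expired
```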
6. Domain Events
Events Emitted
# Before rate limit check
RateLimitCheckAttempted
{
    "endpoint": "POST /api/v1/sessions",
    "identifier": "192.168.1.1",
    "scope": "ip",
    "cost": 1,
}
# On allowed request
RateLimitCheckAllowed
{
    "endpoint": "POST /api/v1/sessions",
    "identifier": "192.168.1.1",
    "scope": "ip",
    "remaining_tokens": 4,
    "execution_time_ms": 2.5,
}
# On denied request
RateLimitCheckDenied
{
    "endpoint": "POST /api/v1/sessions",
    "identifier": "192.168.1.1",
    "scope": "ip",
    "retry_after": 12.5,
    "execution_time_ms": 2.3,
}
7. Redis Lua Script
Atomic Token Bucket
-- token_bucket.lua
-- KEYS[1]: Redis key for bucket
-- ARGV[1]: max_tokens
-- ARGV[2]: refill_rate (tokens per minute)
-- ARGV[3]: cost
-- ARGV[4]: current_timestamp
-- Returns: [allowed (0/1), retry_after, remaining_tokens]
-- Note: Redis converts Lua numbers in replies to integers, so a fractional
-- retry_after is truncated before it reaches the caller.
local tokens_key = KEYS[1] .. ":tokens"
local time_key = KEYS[1] .. ":time"
local max_tokens = tonumber(ARGV[1])
local refill_rate = tonumber(ARGV[2])
local cost = tonumber(ARGV[3])
local now = tonumber(ARGV[4])

-- Get current state
local current_tokens = tonumber(redis.call("GET", tokens_key))
local last_refill = tonumber(redis.call("GET", time_key))

-- Initialize if first request
if not current_tokens then
    current_tokens = max_tokens
    last_refill = now
end

-- Calculate refilled tokens
local elapsed = now - last_refill
local tokens_to_add = elapsed * (refill_rate / 60.0)
current_tokens = math.min(current_tokens + tokens_to_add, max_tokens)

-- TTL for cleanup
local ttl = math.ceil((max_tokens / refill_rate) * 60) + 60

if current_tokens >= cost then
    -- Allowed: consume tokens
    local new_tokens = current_tokens - cost
    redis.call("SETEX", tokens_key, ttl, new_tokens)
    redis.call("SETEX", time_key, ttl, now)
    return {1, 0, math.floor(new_tokens)}
else
    -- Denied: calculate retry_after
    local tokens_needed = cost - current_tokens
    local retry_after = tokens_needed / (refill_rate / 60.0)
    redis.call("SETEX", tokens_key, ttl, current_tokens)
    redis.call("SETEX", time_key, ttl, now)
    return {0, retry_after, math.floor(current_tokens)}
end
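The two formulas in the script, the key TTL and `retry_after`, are easy to check by hand. The Python functions below mirror that arithmetic for a few of the rules in this guide; the function names are hypothetical and the code is a worked example, not part of the storage layer.

```python
import math


def bucket_ttl_seconds(max_tokens: float, refill_rate: float) -> int:
    """Seconds needed to refill an empty bucket, plus the script's 60-second buffer."""
    return math.ceil((max_tokens / refill_rate) * 60) + 60


def retry_after_seconds(current_tokens: float, cost: float, refill_rate: float) -> float:
    """Seconds until `cost` tokens are available at a per-minute refill rate."""
    tokens_needed = cost - current_tokens
    return tokens_needed / (refill_rate / 60.0)


print(bucket_ttl_seconds(5, 5.0))      # 120 (login rule: 60 s to refill + 60 s buffer)
print(bucket_ttl_seconds(10, 5.0))     # 180 (bucket of 10 refilled at 5 per minute)
print(bucket_ttl_seconds(100, 100.0))  # 120
# An empty login bucket needs one token at 5 tokens per minute: about 12 seconds.
print(round(retry_after_seconds(0.0, 1, 5.0), 3))
```

The TTL only governs cleanup of idle keys: once a bucket would have refilled completely, an expired key and a full bucket are equivalent, so letting the key expire loses no information.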
8. Testing Rate Limits
Unit Testing Rules
def test_rate_limit_rule_validation():
    rule = RateLimitRule(
        max_tokens=5,
        refill_rate=5.0,
        scope=RateLimitScope.IP,
        cost=1,
        enabled=True,
    )
    assert rule.max_tokens == 5
    assert rule.refill_rate == 5.0
    assert rule.ttl_seconds == 120  # Calculated from capacity/rate
Integration Testing with Redis
async def test_rate_limit_allows_within_limit(redis_storage):
    """Test requests allowed when under limit."""
    rule = RateLimitRule(max_tokens=5, refill_rate=5.0, ...)
    for _ in range(5):
        result = await redis_storage.check_and_consume(
            key_base="test:ip:127.0.0.1",
            rule=rule,
            cost=1,
        )
        assert result.value[0] == True  # allowed


async def test_rate_limit_denies_over_limit(redis_storage):
    """Test 6th request denied after 5."""
    # ... consume 5 tokens ...
    result = await redis_storage.check_and_consume(...)
    assert result.value[0] == False  # denied
    assert result.value[1] > 0  # retry_after
API Testing
def test_login_rate_limit(client: TestClient):
    """Test login endpoint rate limiting."""
    # First 5 requests succeed
    for _ in range(5):
        response = client.post("/api/v1/sessions", json={
            "email": "test@example.com",
            "password": "wrong_password",
        })
        assert response.status_code in [401, 201]
    # 6th request rate limited
    response = client.post("/api/v1/sessions", json={
        "email": "test@example.com",
        "password": "wrong_password",
    })
    assert response.status_code == 429
    assert "Retry-After" in response.headers
9. Common Patterns
Pattern 1: IP-Scoped for Unauthenticated
# Login, registration, password reset
"POST /api/v1/sessions": RateLimitRule(
    max_tokens=5,
    refill_rate=5.0,
    scope=RateLimitScope.IP,  # IP address
    cost=1,
    enabled=True,
),
Pattern 2: User-Scoped for Authenticated
# API endpoints requiring auth
"GET /api/v1/accounts": RateLimitRule(
    max_tokens=100,
    refill_rate=100.0,
    scope=RateLimitScope.USER,  # User ID from JWT
    cost=1,
    enabled=True,
),
Pattern 3: Expensive Operations
# Report generation costs 5 tokens
"POST /api/v1/reports": RateLimitRule(
    max_tokens=10,
    refill_rate=10.0,
    scope=RateLimitScope.USER,
    cost=5,  # Higher cost
    enabled=True,
),
Pattern 4: Disable Rate Limit (Testing)
# Disable for specific endpoint
"GET /api/v1/health": RateLimitRule(
    max_tokens=1000,
    refill_rate=1000.0,
    scope=RateLimitScope.GLOBAL,
    cost=1,
    enabled=False,  # Disabled
),
Pattern 5: Admin Reset Rate Limit
@router.post("/admin/rate-limits/reset")
async def reset_rate_limit(
    data: ResetRateLimitRequest,
    current_user: User = Depends(get_current_user),
    _: None = Depends(require_role(UserRole.ADMIN)),
    rate_limit: RateLimitProtocol = Depends(get_rate_limit),
) -> None:
    """Admin: Reset rate limit for user/IP."""
    await rate_limit.reset(
        endpoint=data.endpoint,
        identifier=data.identifier,
    )
10. Adding New Rate Limit Rules
With the registry-based pattern, adding rate limits is automatic:
Step 1: Add Route to Registry
# src/presentation/routers/api/v1/routes/registry.py
RouteMetadata(
    method=HTTPMethod.POST,
    path="/new-endpoint",
    handler=new_endpoint_handler,
    rate_limit_policy=RateLimitPolicy.API_WRITE,  # Assign existing policy
    ...
)
Step 2: (Optional) Create New Policy
If existing policies don't fit:
# 1. Add enum value in metadata.py
class RateLimitPolicy(str, Enum):
    CUSTOM_NEW = "custom_new"

# 2. Add implementation in derivations.py
RateLimitPolicy.CUSTOM_NEW: RateLimitRule(
    max_tokens=20,
    refill_rate=20.0,
    scope=RateLimitScope.USER,
    cost=1,
    enabled=True,
),
Step 3: Test Rule
# tests/integration/test_rate_limit_new_endpoint.py
async def test_new_endpoint_rate_limit():
    # Test within limit
    # Test at limit
    # Test over limit
    # Test refill behavior
    ...  # placeholder so the stub parses; replace with the four cases above
Step 4: Document
Update API documentation with rate limit information.
Troubleshooting
429 Too Many Requests unexpectedly
- Check endpoint matches rule key exactly (method + path)
- Check scope - IP vs User vs User+Provider
- Check if rate limits were reset recently
- Check Redis connectivity
Rate limit not applied
- Check endpoint is in RATE_LIMIT_RULES
- Check enabled=True in rule
- Check middleware is registered
- Check identifier extraction is correct
Retry-After shows wrong value
- Check refill_rate configuration
- Check server clock synchronization
- Check Lua script timestamp handling
Rate limit headers missing
- Check middleware is adding headers
- Check response isn't being replaced
- Check headers not stripped by proxy
Created: 2025-12-05 | Last Updated: 2026-01-10