Day 11: Chaos Testing Framework
What I Built
- Chaos testing framework with 6 failure scenarios
- IBKR connection loss simulation
- Database timeout handling
- Rate limiting graceful backoff
- AI model timeout fallback
- Stale market data detection
- Network jitter retry logic
- Chaos report generation
Code Highlight
# Chaos testing scenarios
from unittest.mock import patch

# retry_with_backoff is the project's own retry helper (its import is not shown here)


class ChaosScenario:
    """Base class for chaos test scenarios."""

    def __init__(self, name: str, description: str):
        self.name = name
        self.description = description

    async def execute(self, *args, **kwargs):
        raise NotImplementedError


# Example: Network Jitter Scenario
class NetworkJitterScenario(ChaosScenario):
    def __init__(self):
        super().__init__(
            "Network Jitter",
            "Intermittent network failures; retry logic recovers",
        )

    # pytest markers only apply to collected tests, so @pytest.mark.asyncio
    # belongs on the test that awaits this method, not on the method itself
    async def execute(self, ibkr_client):
        """Simulate network jitter (fails then succeeds)."""
        call_count = 0

        async def flaky_submit_order(*args, **kwargs):
            nonlocal call_count
            call_count += 1
            if call_count == 1:
                # First call fails to mimic a dropped connection
                raise ConnectionError("Network jitter")
            return {"orderId": "12345"}

        with patch.object(ibkr_client, 'submit_order', side_effect=flaky_submit_order):
            # Retry should succeed on second attempt
            result = await retry_with_backoff(
                ibkr_client.submit_order,
                max_retries=3,
                base_delay=0.1,
                exceptions=(ConnectionError,),
            )

        assert result["orderId"] == "12345"
        assert call_count == 2
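The scenario above calls retry_with_backoff without showing its body. Here is a minimal sketch of what such a helper might look like, assuming exponential backoff with a small random jitter. Only the keyword arguments (max_retries, base_delay, exceptions) come from the call in the scenario; the body, the defaults, and the 10% jitter factor are assumptions for illustration, not the project's actual implementation.

```python
import asyncio
import random
from typing import Awaitable, Callable, Tuple, Type, TypeVar

T = TypeVar("T")


async def retry_with_backoff(
    func: Callable[..., Awaitable[T]],
    *args,
    max_retries: int = 3,
    base_delay: float = 0.1,
    exceptions: Tuple[Type[BaseException], ...] = (Exception,),
    **kwargs,
) -> T:
    """Await func, retrying on the given exceptions with exponential backoff.

    Hypothetical sketch: one initial attempt plus up to max_retries retries.
    """
    for attempt in range(max_retries + 1):
        try:
            return await func(*args, **kwargs)
        except exceptions:
            if attempt == max_retries:
                raise  # out of retries: surface the last error to the caller
            # Delay doubles on each attempt: base, 2*base, 4*base, ...
            delay = base_delay * (2 ** attempt)
            # Add up to 10% random jitter so concurrent retries do not align
            await asyncio.sleep(delay + random.uniform(0, delay * 0.1))
    # Only reachable if max_retries is negative
    raise AssertionError("max_retries must be >= 0")
```

With this shape, the network jitter scenario makes exactly two calls: the first raises ConnectionError, the helper sleeps for roughly base_delay, and the second call returns the order.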
Architecture Decision
Chaos testing is crucial for production reliability. Rather than testing only happy paths, we simulate real-world failures to verify that the system degrades gracefully instead of crashing. The framework uses pytest fixtures for clean test isolation and focuses on the six most critical failure modes: external service outages (IBKR connection loss), database timeouts, rate limits, AI model timeouts, stale market data, and network jitter.
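To make "graceful degradation" concrete, here is a rough sketch of the AI timeout case using asyncio.wait_for from the standard library. The names call_with_fallback, slow_model, and rule_based_signal, and the "HOLD" default, are hypothetical and chosen only for illustration; the real fallback in the project may look quite different.

```python
import asyncio
from typing import Awaitable, Callable


async def call_with_fallback(
    primary: Callable[[], Awaitable[str]],
    fallback: Callable[[], Awaitable[str]],
    timeout_s: float,
) -> str:
    """Await primary; if it exceeds timeout_s, use fallback instead."""
    try:
        return await asyncio.wait_for(primary(), timeout=timeout_s)
    except asyncio.TimeoutError:
        # Degrade gracefully rather than blocking the rest of the flow
        return await fallback()


async def slow_model() -> str:
    await asyncio.sleep(1.0)  # simulated slow AI model (hypothetical)
    return "model signal"


async def rule_based_signal() -> str:
    return "HOLD"  # hypothetical conservative default when the model is slow


async def main() -> str:
    # The 50 ms budget is far below the simulated 1 s latency,
    # so the fallback path is taken
    return await call_with_fallback(slow_model, rule_based_signal, timeout_s=0.05)


if __name__ == "__main__":
    print(asyncio.run(main()))
```

A chaos test for this path would assert that the fallback value comes back and that the call returns well before the slow model would have finished.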
Testing Results
All 6 chaos tests passing, covering critical failure scenarios:
- ✅ IBKR Connection Loss - ConnectionError handling
- ✅ Database Timeout - AsyncSession timeout simulation
- ✅ Rate Limiting - HTTPException 429 handling
- ✅ AI Model Slowness - TimeoutError with fallback
- ✅ Stale Market Data - Cache expiration detection
- ✅ Network Jitter - Retry logic with backoff
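The post does not show how the chaos report is generated, so the following is only a sketch of one way results like the ones above could be collected and summarized. ScenarioResult, run_scenarios, format_report, and the report layout are all assumptions, not the framework's actual API.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, Dict, List


@dataclass
class ScenarioResult:
    name: str
    passed: bool
    detail: str = ""


async def run_scenarios(
    scenarios: Dict[str, Callable[[], Awaitable[None]]],
) -> List[ScenarioResult]:
    """Run each scenario in order; record failures instead of aborting the run."""
    results: List[ScenarioResult] = []
    for name, scenario in scenarios.items():
        try:
            await scenario()
            results.append(ScenarioResult(name, True))
        except Exception as exc:  # includes AssertionError from failed checks
            results.append(ScenarioResult(name, False, f"{type(exc).__name__}: {exc}"))
    return results


def format_report(results: List[ScenarioResult]) -> str:
    """Render a plain-text summary with one line per scenario."""
    passed = sum(r.passed for r in results)
    lines = [f"Chaos report: {passed}/{len(results)} scenarios passed"]
    for r in results:
        status = "PASS" if r.passed else "FAIL"
        suffix = f" ({r.detail})" if r.detail else ""
        lines.append(f"- {status} {r.name}{suffix}")
    return "\n".join(lines)
```

Catching the exception per scenario keeps one failing scenario from hiding the results of the others, which matters when the point of the run is to see every failure mode at once.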
Next Steps
Day 12: Integration testing with full trade flow mocking and E2E test setup.
Follow @therealkamba on X for regular updates.