SIMULATION ENGINE · CRISIS-OF-THE-WEEK
200 agents. 90 days. 90 seconds.
Foresight is the multi-agent crisis simulation engine inside Sentinel — the part that answers
"what happens to our supply chain if X breaks" with a quantified, agent-grounded, board-ready brief.
This is the engineering brief on the code that makes that possible.
"Most multi-agent demos collapse not because the agents are bad, but because the orchestration is naïve — sequential calls, no batching, no cancellation, no streaming, no model tiering. The engine is the product."
— Field engineering note · Sentinel Foresight build log · 2026-03-22
Why naïve multi-agent orchestration fails · and how Foresight wins
Naïve agent loop
What breaks at scale
- Agents called sequentially — N × latency
- One slow agent stalls the whole step
- One exception crashes the run
- No mid-run cancellation
- One model tier for everything — cost runaway
- Results dumped at the end — no live feedback
- Personas hand-coded — no scenario grounding
- POC works at 10 agents. Dies at 100.
Sentinel Foresight engine
What the engine delivers
- Up to 200 agents · batches of 20 in parallel
- Exception-isolated per agent — bad apple, good barrel
- asyncio.gather with return_exceptions=True
- Cancellation via DB-status check per step
- Three model tiers · local for reasoning, Anthropic fallback, Claude-tier for synthesis
- Live SSE stream via Redis pub/sub — UI sees every step
- Personas generated from world snapshot + scenario config
- Customer 1 and customer 300 run the same engine
Architecture · the per-step loop
SCENARIO ENGINE: 20 templates · commodity / logistics / macro / regulatory / custom NL parser
↓ (events_for_step)
WORLD BUILDER: customer ERP snapshot · commodity baselines · doc RAG (PDF / DOCX / XLSX)
↓ (immutable WorldSnapshot → mutable world_state)
PERSONA GENERATOR: commodity · financial · logistics · procurement · regulatory · supplier
↓ (agent allocation tuned to scenario category)
SIMULATION LOOP: for step in horizon · batch(20) · asyncio.gather · merge actions
↓ (per-step actions + mutated world_state)
REDIS PUB/SUB: step broadcast → FastAPI SSE → frontend live timeline
↓
SYNTHESIZER: Claude-tier synthesis of N×days actions → executive brief
↓
SENTINEL UI: Foresight tab · Command Center · brief export (PDF)
Ten engineering principles
01
Async-first FastAPI
psycopg AsyncConnectionPool + redis.asyncio + httpx.AsyncClient. No blocking I/O on the simulation path, ever.
02
Bounded parallel batches
AGENT_BATCH_SIZE = 20. asyncio.gather per batch with return_exceptions=True. 200 agents finish in ten waves, not one stampede.
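A minimal sketch of the wave pattern described above, assuming nothing about the engine beyond the stated batch size and the use of asyncio.gather with return_exceptions=True. The helper run_in_batches and the toy demo_agent are hypothetical names for illustration, not taken from main.py.

```python
import asyncio
from typing import Any, Awaitable, Callable, Sequence

AGENT_BATCH_SIZE = 20  # value stated in the brief


async def run_in_batches(
    agents: Sequence[Any],
    step_fn: Callable[[Any], Awaitable[Any]],
    batch_size: int = AGENT_BATCH_SIZE,
) -> list[Any]:
    """Run step_fn over all agents, at most batch_size concurrently per wave."""
    results: list[Any] = []
    for start in range(0, len(agents), batch_size):
        batch = agents[start:start + batch_size]
        # return_exceptions=True returns exceptions as values in the result
        # list instead of raising them out of gather and aborting the step.
        wave = await asyncio.gather(
            *(step_fn(agent) for agent in batch), return_exceptions=True
        )
        results.extend(wave)
    return results


async def demo_agent(agent_id: int) -> str:
    # Hypothetical agent: every 50th one fails, to show the run continues.
    await asyncio.sleep(0)
    if agent_id % 50 == 0:
        raise RuntimeError(f"agent {agent_id} failed")
    return f"action-{agent_id}"


results = asyncio.run(run_in_batches(list(range(1, 201)), demo_agent))
failures = [r for r in results if isinstance(r, Exception)]
```

With 200 agents and a batch size of 20, the loop body runs 200 / 20 = 10 times, which is the "ten waves" figure. Results come back in agent order, with failed agents represented by their exception objects.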
03
Exception-isolated agents
_safe_step wraps every agent. A failing agent returns None and the batch keeps going. One bad model output never kills the run.
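One way such a wrapper could look, as a hedged sketch: only the name _safe_step and the return-None-on-failure behavior come from the brief. The signature, the logger name, and the two toy agents are assumptions for illustration.

```python
import asyncio
import logging
from typing import Any, Awaitable, Callable, Optional

logger = logging.getLogger("foresight.sketch")  # hypothetical logger name


async def _safe_step(
    agent_id: str, step_fn: Callable[[], Awaitable[Any]]
) -> Optional[Any]:
    """Run one agent step; log and return None on any ordinary failure."""
    try:
        return await step_fn()
    except Exception:
        # asyncio.CancelledError derives from BaseException, so task
        # cancellation still propagates instead of being swallowed here.
        logger.exception("agent %s failed; skipping its action", agent_id)
        return None


async def good() -> dict:
    return {"action": "reroute"}


async def bad() -> dict:
    raise ValueError("malformed model output")


async def main() -> list:
    return await asyncio.gather(_safe_step("a1", good), _safe_step("a2", bad))


outcomes = asyncio.run(main())
actions = [o for o in outcomes if o is not None]  # drop failed agents
```

Catching Exception rather than BaseException is a deliberate choice in this sketch: it isolates bad model output while leaving cancellation and interpreter shutdown untouched.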
04
Three-tier model routing
Local 27B model for agent reasoning (MLX on Studio, port 8094 · ~$0 marginal cost per sim). Anthropic fallback for resilience. Claude-tier synthesis for the final brief.
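The local-first, hosted-fallback routing can be sketched as below. This is an illustration under assumptions: the function reason_with_fallback, the timeout value, and the set of errors that trigger the fallback are all hypothetical, and the model calls are injected as plain coroutines so the sketch runs without any network access.

```python
import asyncio
from typing import Awaitable, Callable

ModelCall = Callable[[str], Awaitable[str]]


async def reason_with_fallback(
    prompt: str,
    primary: ModelCall,
    fallback: ModelCall,
    timeout_s: float = 30.0,  # assumed value, not from the brief
) -> tuple[str, str]:
    """Try the local tier first; on timeout or connection error use the fallback.

    Returns (tier_name, completion_text).
    """
    try:
        text = await asyncio.wait_for(primary(prompt), timeout=timeout_s)
        return "local", text
    except (asyncio.TimeoutError, OSError):
        # ConnectionError is a subclass of OSError, so refused or reset
        # connections to the local endpoint land here too.
        return "fallback", await fallback(prompt)


async def local_down(prompt: str) -> str:
    # Stand-in for an unreachable local model endpoint.
    raise ConnectionError("local endpoint unreachable")


async def hosted(prompt: str) -> str:
    # Stand-in for a hosted model call.
    return f"hosted:{prompt}"


tier, text = asyncio.run(
    reason_with_fallback("assess port closure", local_down, hosted)
)
```

Returning the tier name alongside the text makes it easy to record which tier actually served each agent call, which matters when auditing cost per simulation.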
05
Cancellable mid-flight
Every step polls the simulation row's status column. If a user hits cancel, the engine bails clean — no orphaned tasks, no zombie cost.
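A minimal sketch of per-step cancellation polling. The status value "cancelled", the function run_steps, and the in-memory status holder are assumptions. In the engine the status would come from the simulation row in Postgres, which is replaced here by an injected coroutine.

```python
import asyncio
from typing import Awaitable, Callable


async def run_steps(
    horizon: int,
    get_status: Callable[[], Awaitable[str]],
    run_step: Callable[[int], Awaitable[None]],
) -> int:
    """Run steps 1..horizon, checking for cancellation before each one.

    Returns the number of steps that completed.
    """
    completed = 0
    for step in range(1, horizon + 1):
        if await get_status() == "cancelled":
            break  # stop before starting any new work
        await run_step(step)
        completed += 1
    return completed


state = {"status": "running"}  # stand-in for the simulation row


async def get_status() -> str:
    return state["status"]


async def run_step(step: int) -> None:
    # Simulate a user hitting cancel while step 3 is in flight.
    if step == 3:
        state["status"] = "cancelled"


completed_steps = asyncio.run(run_steps(90, get_status, run_step))
```

Because the check sits at the step boundary, a step that has already started always finishes. No batch is interrupted halfway and no task is left running after the loop exits.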
06
Redis pub/sub streaming
_broadcast_step pushes each step to a Redis channel. The frontend tails it via SSE. Users watch the crisis unfold live — no polling.
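The SSE side of this can be sketched without Redis. In the example below an asyncio.Queue stands in for the Redis channel subscription, and sse_frame and step_stream are hypothetical names. What it shows is the wire format: each step becomes one "event:" / "data:" frame terminated by a blank line.

```python
import asyncio
import json
from typing import AsyncIterator, Optional


def sse_frame(payload: dict, event: str = "step") -> str:
    """Format one Server-Sent Events frame; the blank line ends the frame."""
    return f"event: {event}\ndata: {json.dumps(payload)}\n\n"


async def step_stream(
    queue: "asyncio.Queue[Optional[dict]]",
) -> AsyncIterator[str]:
    """Yield an SSE frame per message; None marks the end of the run."""
    while True:
        message = await queue.get()
        if message is None:
            break
        yield sse_frame(message)


async def main() -> list[str]:
    # The queue stands in for a Redis pub/sub subscription.
    queue: asyncio.Queue = asyncio.Queue()
    for step in (1, 2):
        queue.put_nowait({"step": step, "actions": 20})
    queue.put_nowait(None)
    return [frame async for frame in step_stream(queue)]


frames = asyncio.run(main())
```

In a FastAPI endpoint, an async generator like step_stream can be passed to StreamingResponse with media_type="text/event-stream", and the browser consumes it with EventSource.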
07
Scenario as data, not code
Twenty templates ship in scenarios.py. Each declares events keyed by step number. A custom NL scenario parser turns plain English into the same event list.
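A hedged sketch of what "events keyed by step number" can look like as data. The names ScenarioTemplate and events_for_step appear elsewhere in this brief, but the fields, the signature, and the port-strike example are invented here for illustration and are not one of the shipped templates.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ScenarioTemplate:
    # Field names are assumptions; only the class name comes from the brief.
    name: str
    category: str
    horizon_days: int
    events: dict[int, list[str]] = field(default_factory=dict)


def events_for_step(template: ScenarioTemplate, step: int) -> list[str]:
    """Return the events scheduled for a step, or an empty list."""
    return list(template.events.get(step, []))


# Hypothetical template, made up for this example.
port_strike = ScenarioTemplate(
    name="port-strike",
    category="logistics",
    horizon_days=90,
    events={
        1: ["strike announced"],
        14: ["port throughput -60%"],
        45: ["strike ends"],
    },
)

day_14 = events_for_step(port_strike, 14)
day_2 = events_for_step(port_strike, 2)
```

Because a template is plain data, a natural-language parser only has to emit the same events mapping for a custom scenario to run through the same loop.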
08
Persona grounding
Personas are generated from the live world snapshot + scenario config, not hand-coded. The supplier agent reasons about your actual suppliers, not a fixture.
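To make "generated from the snapshot, not hand-coded" concrete, here is a small sketch. The four attributes behavioral_traits, decision_thresholds, relationships, and initial_state are named in this brief. Everything else is an assumption: the role and name fields, the supplier_personas function, the threshold rule, and the sample world data are all invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class AgentPersona:
    role: str  # assumed field
    name: str  # assumed field
    behavioral_traits: list[str] = field(default_factory=list)
    decision_thresholds: dict[str, float] = field(default_factory=dict)
    relationships: dict[str, str] = field(default_factory=dict)
    initial_state: dict[str, Any] = field(default_factory=dict)


def supplier_personas(
    world: dict[str, Any], scenario_category: str
) -> list[AgentPersona]:
    """Build one supplier persona per supplier found in the world snapshot."""
    # Hypothetical rule: logistics scenarios make suppliers escalate sooner.
    max_delay = 3.0 if scenario_category == "logistics" else 7.0
    personas = []
    for supplier in world.get("suppliers", []):
        traits = ["risk-averse"] if supplier.get("single_source") else ["flexible"]
        personas.append(
            AgentPersona(
                role="supplier",
                name=supplier["name"],
                behavioral_traits=traits,
                decision_thresholds={"max_delay_days": max_delay},
                relationships={"customer": "contracted"},
                initial_state={"lead_time_days": supplier["lead_time_days"]},
            )
        )
    return personas


# Made-up snapshot data, not real customer records.
world = {
    "suppliers": [
        {"name": "Acme Metals", "lead_time_days": 21, "single_source": True},
        {"name": "Borealis Freight", "lead_time_days": 9},
    ]
}
personas = supplier_personas(world, "logistics")
```

The point of the sketch is the data flow: persona count, names, and starting state all derive from the snapshot, so a different customer yields different agents with no code change.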
09
Document-RAG world
Customer PDFs, DOCX, and XLSX feed the world model — pypdf2, python-docx, openpyxl. Agents argue with the customer's own contracts and price files in the loop.
10
Idempotent persistence
Every step writes its actions to Postgres before broadcast. A crashed run resumes from the last persisted step — no lost work, full audit trail.
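The persist-then-resume idea can be sketched with an in-memory SQLite database standing in for Postgres. The table name simulation_steps comes from this brief. The columns, the helper names, and the use of SQLite are assumptions. The Postgres analogue of INSERT OR IGNORE would be INSERT ... ON CONFLICT DO NOTHING.

```python
import sqlite3


def init_schema(conn: sqlite3.Connection) -> None:
    # The composite primary key is what makes a replayed step a no-op.
    conn.execute(
        "CREATE TABLE simulation_steps ("
        " simulation_id TEXT NOT NULL,"
        " step INTEGER NOT NULL,"
        " actions TEXT NOT NULL,"
        " PRIMARY KEY (simulation_id, step))"
    )


def persist_step(
    conn: sqlite3.Connection, sim_id: str, step: int, actions_json: str
) -> bool:
    """Write a step once; return False if it was already persisted."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO simulation_steps VALUES (?, ?, ?)",
        (sim_id, step, actions_json),
    )
    conn.commit()
    return cur.rowcount == 1


def next_step(conn: sqlite3.Connection, sim_id: str) -> int:
    """Return the step a resumed run should start from."""
    row = conn.execute(
        "SELECT COALESCE(MAX(step), 0) FROM simulation_steps"
        " WHERE simulation_id = ?",
        (sim_id,),
    ).fetchone()
    return row[0] + 1


conn = sqlite3.connect(":memory:")
init_schema(conn)
first_writes = [persist_step(conn, "sim-1", s, "[]") for s in (1, 2, 3)]
replayed = persist_step(conn, "sim-1", 3, "[]")  # same step written again
resume_from = next_step(conn, "sim-1")
```

Writing before broadcasting means a subscriber never sees a step that is missing from the database, and a restart after a crash simply asks for the next unwritten step.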
The actual stack · taken straight from main.py
API
FastAPI async endpoints · uvicorn ASGI · StreamingResponse for SSE · pydantic v2 request/response models.
Concurrency
asyncio · asyncio.create_task for background runs · asyncio.gather for parallel agents · per-step cancellation polling.
Data
psycopg[binary] + psycopg_pool.AsyncConnectionPool (min=2, max=10) · redis.asyncio pub/sub · simulations + simulation_steps tables.
Models
httpx.AsyncClient against Studio MLX (Qwen3.6-27B, OpenAI-compatible API on :8094) · anthropic async fallback · synthesis via Claude Sonnet or MiniMax M2.7.
Docs
pypdf2 · python-docx · openpyxl · python-multipart for upload · processed in document_processor.py into the world snapshot.
Scenarios
scenarios.py declares 20 ScenarioTemplate dataclasses across commodity / logistics / macro / regulatory categories. Each declares events keyed by step. Custom scenarios parse natural language into the same shape.
Personas
persona_generator.py emits AgentPersona objects with behavioral_traits, decision_thresholds, relationships, and initial_state — composed from world + scenario, not hand-typed.
Deploy
Dockerfile · standalone microservice on port 5060 · connects to Sentinel backend via REST · zero shared process state with the Node API.
SLAs · what the engine commits to
200 · max agents per simulation
20 · parallel agents per batch
~$0 · marginal cost · local models