SIMULATION ENGINE · CRISIS-OF-THE-WEEK

200 agents. 90 days. 90 seconds.

Foresight is the multi-agent crisis simulation engine inside Sentinel — the part that answers "what happens to our supply chain if X breaks" with a quantified, agent-grounded, board-ready brief. This is the engineering brief on the code that makes that possible.

"Most multi-agent demos collapse not because the agents are bad, but because the orchestration is naïve — sequential calls, no batching, no cancellation, no streaming, no model tiering. The engine is the product." — Field engineering note · Sentinel Foresight build log · 2026-03-22

Why naïve multi-agent fails · and how Foresight wins

Naïve agent loop

What breaks at scale

Agents called sequentially — N × latency
One slow agent stalls the whole step
One exception crashes the run
No mid-run cancellation
One model tier for everything — cost runaway
Results dumped at the end — no live feedback
Personas hand-coded — no scenario grounding
POC works at 10 agents. Dies at 100.

Sentinel Foresight engine

What the engine delivers

Up to 200 agents · batches of 20 in parallel
Exception-isolated per agent — bad apple, good barrel
asyncio.gather with return_exceptions=True
Cancellation via DB-status check per step
Three model tiers · local for reasoning, Claude for synthesis
Live SSE stream via Redis pub/sub — UI sees every step
Personas generated from world snapshot + scenario config
Customer 1 and customer 300 run the same engine

Architecture · the per-step loop

SCENARIO ENGINE · 20 templates · commodity / logistics / macro / regulatory / custom NL parser
↓ (events_for_step)
WORLD BUILDER · customer ERP snapshot · commodity baselines · doc RAG (PDF / DOCX / XLSX)
↓ (immutable WorldSnapshot → mutable world_state)
PERSONA GENERATOR · commodity · financial · logistics · procurement · regulatory · supplier
↓ (agent allocation tuned to scenario category)
SIMULATION LOOP · for step in horizon · batch(20) · asyncio.gather · merge actions
↓ (per-step actions + mutated world_state)
REDIS PUB/SUB · step broadcast → FastAPI SSE → frontend live timeline
↓
SYNTHESIZER · Claude-tier synthesis of N×days actions → executive brief
↓
SENTINEL UI · Foresight tab · Command Center · brief export (PDF)
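The hand-off from an immutable WorldSnapshot to a mutable world_state, and the "merge actions" stage of the loop, might look roughly like the sketch below. Only the names WorldSnapshot and world_state come from the diagram; the fields, the to_world_state method, the merge_actions function, and the draw_inventory action type are all hypothetical, chosen only to make the idea concrete.

```python
from dataclasses import dataclass
from types import MappingProxyType
from typing import Mapping


@dataclass(frozen=True)
class WorldSnapshot:
    """Read-only baseline. Field names are illustrative assumptions."""
    commodity_prices: Mapping[str, float]
    supplier_inventory: Mapping[str, int]

    def to_world_state(self) -> dict:
        # Copy into plain dicts so the simulation can mutate freely
        # without ever touching the baseline.
        return {
            "commodity_prices": dict(self.commodity_prices),
            "supplier_inventory": dict(self.supplier_inventory),
        }


def merge_actions(world_state: dict, actions: list[dict]) -> dict:
    """Apply one step's agent actions to the mutable state (hypothetical)."""
    for action in actions:
        if action["type"] == "draw_inventory":
            supplier = action["supplier"]
            world_state["supplier_inventory"][supplier] -= action["units"]
    return world_state


snapshot = WorldSnapshot(
    commodity_prices=MappingProxyType({"steel": 700.0}),
    supplier_inventory=MappingProxyType({"S1": 100}),
)
world_state = snapshot.to_world_state()
merge_actions(
    world_state,
    [{"type": "draw_inventory", "supplier": "S1", "units": 30}],
)
```

Keeping the snapshot frozen means a run can always be compared against, or restarted from, the same baseline.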

Ten engineering principles

Async-first FastAPI

psycopg AsyncConnectionPool + redis.asyncio + httpx.AsyncClient. No blocking I/O on the simulation path, ever.

Bounded parallel batches

AGENT_BATCH_SIZE = 20. asyncio.gather per batch with return_exceptions=True. 200 agents finish in ten waves, not one stampede.
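A minimal sketch of the batching pattern described above, not the engine's actual code. AGENT_BATCH_SIZE and the use of asyncio.gather with return_exceptions=True come from this brief; run_step_in_batches, step_agent, and the demo agent are hypothetical names.

```python
import asyncio
from typing import Any, Awaitable, Callable, Sequence

AGENT_BATCH_SIZE = 20  # value stated in the brief


async def run_step_in_batches(
    agents: Sequence[Any],
    step_agent: Callable[[Any], Awaitable[Any]],
    batch_size: int = AGENT_BATCH_SIZE,
) -> list[Any]:
    """Run one simulation step over all agents, batch_size at a time."""
    actions: list[Any] = []
    for start in range(0, len(agents), batch_size):
        batch = agents[start:start + batch_size]
        # With return_exceptions=True, an exception raised by one agent is
        # returned as a result instead of propagating out of gather().
        results = await asyncio.gather(
            *(step_agent(agent) for agent in batch),
            return_exceptions=True,
        )
        # Keep successful actions; drop the exception objects.
        actions.extend(r for r in results if not isinstance(r, BaseException))
    return actions


async def demo_agent(agent_id: int) -> int:
    """Hypothetical agent: every 50th one fails."""
    if agent_id % 50 == 0:
        raise ValueError(f"agent {agent_id} produced bad output")
    return agent_id * 2
```

Calling asyncio.run(run_step_in_batches(list(range(200)), demo_agent)) would process the 200 demo agents in ten sequential waves of 20, with each wave running concurrently.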

Exception-isolated agents

_safe_step wraps every agent. A failing agent returns None and the batch keeps going. One bad model output never kills the run.

Three-tier model routing

Local 27B for agent reasoning (MLX on Studio :8094 · ~$0/sim). Anthropic fallback for resilience. Claude-tier synthesis for the final brief.
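The local-first, hosted-fallback routing could be sketched as an ordered list of tiers tried in turn. Everything named here (complete_with_fallback, the tier labels, the stub model functions, the timeout value) is hypothetical; a real version would wrap the httpx and anthropic clients listed in the stack section.

```python
import asyncio
from typing import Awaitable, Callable, Sequence

ModelCall = Callable[[str], Awaitable[str]]


async def complete_with_fallback(
    prompt: str,
    tiers: Sequence[tuple[str, ModelCall]],
    timeout_s: float = 30.0,
) -> tuple[str, str]:
    """Try each tier in order; return (tier_name, completion)."""
    errors: list[str] = []
    for name, call in tiers:
        try:
            text = await asyncio.wait_for(call(prompt), timeout=timeout_s)
            return name, text
        except Exception as exc:  # includes asyncio.TimeoutError
            errors.append(f"{name}: {exc!r}")
    raise RuntimeError("all model tiers failed: " + "; ".join(errors))


async def local_model(prompt: str) -> str:
    """Stub that simulates the local endpoint being down."""
    raise ConnectionError("local endpoint unreachable")


async def hosted_fallback(prompt: str) -> str:
    """Stub standing in for a hosted model call."""
    return f"fallback answer to: {prompt}"
```

With tiers [("local", local_model), ("fallback", hosted_fallback)], the stubbed outage on the first tier causes the call to resolve on the second.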

Cancellable mid-flight

Every step polls the simulation row's status column. If a user hits cancel, the engine bails clean — no orphaned tasks, no zombie cost.

Redis pub/sub streaming

_broadcast_step pushes each step to a Redis channel. The frontend tails it via SSE. Users watch the crisis unfold live — no polling.
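The SSE side of this can be sketched without Redis: below, an asyncio.Queue stands in for the pub/sub subscription, and an async generator turns each step payload into a text/event-stream frame of the kind a FastAPI StreamingResponse could send. The function names, the "step" and "done" event names, and the None end-of-run sentinel are assumptions.

```python
import asyncio
import json
from typing import AsyncIterator, Optional


def format_sse(payload: dict, event: str = "step") -> str:
    """Encode one message in the text/event-stream wire format."""
    return f"event: {event}\ndata: {json.dumps(payload)}\n\n"


async def sse_stream(
    queue: "asyncio.Queue[Optional[dict]]",
) -> AsyncIterator[str]:
    """Yield SSE frames until a None sentinel marks the end of the run."""
    while True:
        payload = await queue.get()
        if payload is None:
            yield format_sse({}, event="done")
            return
        yield format_sse(payload)


async def collect_demo() -> list[str]:
    """Push two steps and the sentinel, then drain the stream."""
    queue: asyncio.Queue = asyncio.Queue()
    for item in ({"step": 0}, {"step": 1}, None):
        queue.put_nowait(item)
    return [frame async for frame in sse_stream(queue)]
```

In a real deployment the queue would be replaced by messages read from the Redis channel that _broadcast_step publishes to.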

Scenario as data, not code

Twenty templates ship in scenarios.py. Each declares events keyed by step number. A custom NL scenario parser turns plain English into the same event list.
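To illustrate "scenario as data", here is a hedged sketch of what a template with events keyed by step could look like. ScenarioTemplate and events_for_step are names that appear in this brief; the field names, the ScenarioEvent class, and the PORT_CLOSURE example are invented for illustration and are not taken from scenarios.py.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ScenarioEvent:
    """Hypothetical event record: a kind plus free-form parameters."""
    kind: str
    payload: dict = field(default_factory=dict)


@dataclass(frozen=True)
class ScenarioTemplate:
    """Field names are assumptions; the brief only says events are keyed by step."""
    name: str
    category: str
    horizon_days: int
    events: dict[int, list[ScenarioEvent]] = field(default_factory=dict)


def events_for_step(template: ScenarioTemplate, step: int) -> list[ScenarioEvent]:
    """Return the events scheduled for this step, or an empty list."""
    return template.events.get(step, [])


PORT_CLOSURE = ScenarioTemplate(
    name="port-closure",
    category="logistics",
    horizon_days=90,
    events={
        3: [ScenarioEvent("port_closed", {"port": "EXAMPLE-PORT"})],
        30: [ScenarioEvent("port_reopened", {"port": "EXAMPLE-PORT"})],
    },
)
```

Because a template is plain data, a natural-language parser only has to produce the same events mapping for a custom scenario to run through the unchanged loop.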

Persona grounding

Personas are generated from the live world snapshot + scenario config, not hand-coded. The supplier agent reasons about your actual suppliers, not a fixture.

Document-RAG world

Customer PDFs, DOCX, and XLSX feed the world model — pypdf2, python-docx, openpyxl. Agents argue with the customer's own contracts and price files in the loop.

Idempotent persistence

Every step writes its actions to Postgres before broadcast. A crashed run resumes from the last persisted step — no lost work, full audit trail.
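The persist-then-resume idea can be sketched with a uniqueness constraint on (simulation_id, step). This example uses the standard-library sqlite3 module as a stand-in for Postgres so that it runs anywhere; in Postgres the equivalent of INSERT OR IGNORE would be INSERT ... ON CONFLICT DO NOTHING. The simulation_steps table name comes from the brief, while the columns and function names are assumptions.

```python
import json
import sqlite3


def init_db(conn: sqlite3.Connection) -> None:
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS simulation_steps (
            simulation_id TEXT NOT NULL,
            step INTEGER NOT NULL,
            actions TEXT NOT NULL,
            PRIMARY KEY (simulation_id, step)
        )
        """
    )


def persist_step(
    conn: sqlite3.Connection, simulation_id: str, step: int, actions: list
) -> bool:
    """Store a step once; return False if that step was already stored."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO simulation_steps VALUES (?, ?, ?)",
        (simulation_id, step, json.dumps(actions)),
    )
    conn.commit()
    return cur.rowcount == 1


def next_step(conn: sqlite3.Connection, simulation_id: str) -> int:
    """Return the step a resumed run should start from."""
    row = conn.execute(
        "SELECT MAX(step) FROM simulation_steps WHERE simulation_id = ?",
        (simulation_id,),
    ).fetchone()
    return 0 if row[0] is None else row[0] + 1
```

Writing before broadcasting means anything the UI has seen is already durable, and replaying a step after a crash cannot create a duplicate row.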

The actual stack · taken straight from main.py

API

FastAPI async endpoints · uvicorn ASGI · StreamingResponse for SSE · pydantic v2 request/response models.

Concurrency

asyncio · asyncio.create_task for background runs · asyncio.gather for parallel agents · per-step cancellation polling.

Data

psycopg[binary] + psycopg_pool.AsyncConnectionPool (min=2, max=10) · redis.asyncio pub/sub · simulations + simulation_steps tables.

Models

httpx.AsyncClient against Studio MLX (Qwen3.6-27B, OpenAI-compatible API on :8094) · anthropic async fallback · synthesis via Claude Sonnet or MiniMax M2.7.

Docs

pypdf2 · python-docx · openpyxl · python-multipart for upload · processed in document_processor.py into the world snapshot.

Scenarios

scenarios.py declares 20 ScenarioTemplate dataclasses across commodity / logistics / macro / regulatory categories. Each declares events keyed by step. Custom scenarios parse natural language into the same shape.

Personas

persona_generator.py emits AgentPersona objects with behavioral_traits, decision_thresholds, relationships, and initial_state — composed from world + scenario, not hand-typed.
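As an illustration of composing personas from world data rather than fixtures, the sketch below builds one supplier persona per supplier in a toy world dict. The four field names behavioral_traits, decision_thresholds, relationships, and initial_state are from the brief; agent_id, role, the world layout, the supplier_personas function, and every numeric rule are assumptions.

```python
from dataclasses import dataclass


@dataclass
class AgentPersona:
    agent_id: str  # hypothetical field
    role: str      # hypothetical field
    behavioral_traits: dict[str, float]
    decision_thresholds: dict[str, float]
    relationships: dict[str, str]
    initial_state: dict


def supplier_personas(world: dict, scenario_category: str) -> list[AgentPersona]:
    """Derive one persona per supplier found in the world snapshot."""
    # Illustrative rule: suppliers act more cautiously in logistics scenarios.
    risk_aversion = 0.8 if scenario_category == "logistics" else 0.5
    personas = []
    for supplier in world["suppliers"]:
        personas.append(
            AgentPersona(
                agent_id=f"supplier:{supplier['id']}",
                role="supplier",
                behavioral_traits={"risk_aversion": risk_aversion},
                decision_thresholds={
                    "max_lead_time_days": supplier["lead_time_days"] * 1.5
                },
                relationships={"customer": "sells_to"},
                initial_state={"inventory_units": supplier["inventory_units"]},
            )
        )
    return personas


demo_world = {
    "suppliers": [
        {"id": "S1", "lead_time_days": 30, "inventory_units": 1200},
        {"id": "S2", "lead_time_days": 14, "inventory_units": 400},
    ]
}
demo_personas = supplier_personas(demo_world, "logistics")
```

The point of the pattern is that thresholds and starting state are read from the customer's data, so the same generator yields different agents for different customers.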

Deploy

Dockerfile · standalone microservice on port 5060 · connects to Sentinel backend via REST · zero shared process state with the Node API.

SLAs · what the engine commits to

200

max agents per simulation

365

max horizon days

20

parallel agents per batch

~$0

marginal cost · local models

Foresight is a Sentinel platform service: a standalone FastAPI microservice on port 5060, decoupled from the Node backend so the engine can scale, restart, or be model-swapped without touching the rest of the platform. The simulation loop is the product; the agent personas, scenario templates, and synthesis are pluggable around it.

Working end-to-end since 2026-03-22: a 25-agent triple-crisis simulation ran successfully on the first overnight pass. Build sequence: world builder → persona generator → base agent → simulation loop → scenario engine → synthesizer → SSE → UI. Six modules. Sixteen Python files. Sub-100 lines per file on the hot path.

Full technical spec available on request. Tail the live SSE stream at GET /simulate/{id}/events.