90-Day Build Plan

Three phases

Phase 1: Days 1–30 (in progress)

Spine

A continuation can be created, put to sleep, woken, run through a no-op step, and completed on both clouds, with the same test suite passing on each.

Define Substrate interface
Continuation data model & event types
Azure: Cosmos DB store + lease
AWS: DynamoDB store + lease
Append-only event log
Scheduler v0 (timer wakes)
Worker pod (no-op tick)
Public API v0
Warm store CDC into Redshift / Synapse
Observability spine
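As a rough illustration of the Phase 1 goal, the lifecycle can be sketched as a small state machine whose state is derived from an append-only event list. This is a minimal in-memory sketch, not the Substrate interface: the State names, the event names, and the TRANSITIONS table are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum


class State(Enum):
    CREATED = "created"
    SLEEPING = "sleeping"
    AWAKE = "awake"
    COMPLETED = "completed"


# Hypothetical transition table: event type -> (required state, next state).
TRANSITIONS = {
    "slept": (State.CREATED, State.SLEEPING),
    "woken": (State.SLEEPING, State.AWAKE),
    "ticked": (State.AWAKE, State.AWAKE),  # the no-op step
    "completed": (State.AWAKE, State.COMPLETED),
}


@dataclass
class Continuation:
    id: str
    events: list = field(default_factory=list)  # append-only log

    def __post_init__(self):
        # A fresh continuation starts its log with a "created" event.
        if not self.events:
            self.events.append("created")

    @property
    def state(self) -> State:
        # State is derived by replaying the log; it is never stored separately.
        state = State.CREATED
        for event in self.events[1:]:
            state = TRANSITIONS[event][1]
        return state

    def apply(self, event: str) -> None:
        required, _ = TRANSITIONS[event]
        if self.state is not required:
            raise ValueError(f"cannot apply {event!r} in state {self.state.name}")
        self.events.append(event)
```

Applying "slept", "woken", "ticked", and "completed" in order walks a continuation through the full lifecycle. A cloud-backed store would replace the in-memory list, which is why the same test suite can run against both.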

Phase 2: Days 31–60 (queued)

Field & Policy

A continuation reads from a real context field, the policy kernel routes model calls, and real money is spent against real budgets.

Semantic channel (dairy corpus indexed)
Stage-1 recall across all channels
Stage-2 cross-encoder reranker
Field materialization service
Policy kernel v0 (rules-only)
Tiered model dispatch (Bedrock + Foundry Models)
Foundry Models inference provider
Tool-schema normalization across providers
Budget enforcement (soft-cap downshift, hard-cap terminate)
End-to-end evidence-brief pilot
Notebook UI v0
Cargill tenant onboarding
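The budget enforcement task above (soft-cap downshift, hard-cap terminate) could look roughly like the following. This is a sketch under stated assumptions: the tier names, the Budget fields, and the rule of downshifting exactly one tier are hypothetical and not taken from the plan.

```python
from dataclasses import dataclass

# Hypothetical tier names, ordered from most to least expensive.
TIERS = ["large", "medium", "small"]


@dataclass
class Budget:
    soft_cap_usd: float
    hard_cap_usd: float
    spent_usd: float = 0.0


def dispatch(budget: Budget, requested_tier: str, estimated_cost_usd: float) -> str:
    """Return the tier to call, or "terminate" if the hard cap would be exceeded."""
    projected = budget.spent_usd + estimated_cost_usd
    if projected > budget.hard_cap_usd:
        return "terminate"
    if projected > budget.soft_cap_usd:
        # Downshift one tier; stay on the cheapest tier if already there.
        idx = TIERS.index(requested_tier)
        return TIERS[min(idx + 1, len(TIERS) - 1)]
    return requested_tier
```

Because the decision is plain arithmetic on the projected spend, it fits a rules-only policy kernel and needs no model call of its own.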

Phase 3: Days 61–90 (queued)

Continuity & Learning

Continuations wake on streams and signals, and can fork and merge. The policy kernel trains on real usage. Two Cargill consultants run real briefs for two weeks.

Stream-subscription wakes
Sibling-publish + human-signal wakes
Forking and merging with lineage integrity
Policy kernel v1 (GBT training pipeline)
Policy kernel v1 shadow deployment
Tool stream adapters (PubMed, JDS, Cargill internal)
Researcher notebook v1 (full workflow)
Two-consultant Cargill pilot
Maestro public site (this site)
ADR backlog + Days-91+ handoff
Foundry Agent Service substrate adapter
AgentCore substrate adapter

Cross-cutting concerns

These run across all three phases and are not optional. Every PR is expected to address them where relevant.

Security: Short-lived tokens, tenant isolation at store layer, PII redaction on write, compliance freeze switch.
Testing: Substrate suite against both clouds in CI. Chaos tests for lease expiry, worker death, stream replay. Nightly E2E.
Cost discipline: Per-tenant daily dollar cap at ingress. Field reranker is the #1 watched metric. Policy kernel on CPU only.
Vendor delegation: Continuations live in the Maestro worker pool by default. Delegation to AgentCore or Foundry Agent Service is opt-in per tick, gated on session-ceiling and tenant policy class, recorded as runtime: maestro | agentcore | foundry on every model_call event.
Documentation: README per package. ADRs for every load-bearing decision. If code disagrees with the architecture doc, one of them is wrong.
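The vendor delegation rule above can be sketched as a small gate that defaults to the Maestro worker pool. Only the runtime values (maestro, agentcore, foundry) and the model_call event name come from the plan. The Tick fields, the ceiling values, the policy class names, and the reading of "session-ceiling" as a maximum session length are placeholders assumed for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Placeholder ceilings in seconds; real values would come from configuration.
SESSION_CEILING_SECONDS = {"agentcore": 900, "foundry": 600}

# Hypothetical tenant policy classes that permit delegation.
DELEGATION_ALLOWED_CLASSES = {"standard"}


@dataclass
class Tick:
    requested_runtime: Optional[str]  # None means no opt-in for this tick
    expected_session_seconds: int


def choose_runtime(tick: Tick, tenant_policy_class: str) -> str:
    """Return "maestro" unless the tick opts in and passes both gates."""
    requested = tick.requested_runtime
    if requested not in SESSION_CEILING_SECONDS:
        return "maestro"  # no opt-in, or an unknown runtime
    if tenant_policy_class not in DELEGATION_ALLOWED_CLASSES:
        return "maestro"
    if tick.expected_session_seconds > SESSION_CEILING_SECONDS[requested]:
        return "maestro"
    return requested


def model_call_event(tick: Tick, tenant_policy_class: str, model: str) -> dict:
    # Every model_call event records which runtime actually served the tick.
    return {
        "type": "model_call",
        "model": model,
        "runtime": choose_runtime(tick, tenant_policy_class),
    }
```

Falling back to maestro on every failed gate keeps delegation strictly opt-in: a missing or invalid request never leaves the default worker pool.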

Working with Claude Code

The plan is designed to be consumed by Claude Code. Pick the lowest-numbered task whose dependencies are met, read its deliverables and acceptance criteria, implement, verify, ship. One task per branch. The PR title is the task id and title.
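The selection rule can be sketched as follows. The Task fields, the integer ids, and the "id: title" separator in the PR title are assumptions; the plan only states that the title combines the task id and title.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Task:
    id: int
    title: str
    depends_on: list = field(default_factory=list)  # ids of prerequisite tasks
    done: bool = False


def next_task(tasks: list) -> Optional[Task]:
    """Return the lowest-numbered unfinished task whose dependencies are all done."""
    done_ids = {t.id for t in tasks if t.done}
    ready = [
        t for t in tasks
        if not t.done and all(dep in done_ids for dep in t.depends_on)
    ]
    return min(ready, key=lambda t: t.id) if ready else None


def pr_title(task: Task) -> str:
    # Assumed format; only "task id and title" is specified.
    return f"{task.id}: {task.title}"
```

If no task is ready, next_task returns None, which signals either that the plan is finished or that the remaining tasks are blocked.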

Three rules that are not negotiable:

No LLM calls on the policy kernel critical path.
No cloud SDK imports in core/.
Every state change is an event in the event log. No exceptions.
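The second rule lends itself to an automated check. The sketch below uses Python's standard ast module to flag cloud SDK imports in a source string. It assumes core/ is Python code, and the FORBIDDEN deny-list is a hypothetical starting point, not a list from the plan.

```python
import ast

# Hypothetical deny-list of cloud SDK top-level packages.
FORBIDDEN = {"boto3", "botocore", "azure"}


def forbidden_imports(source: str) -> list:
    """Return the forbidden module names imported by the given source text."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0:
            # level == 0 means an absolute import; relative imports are skipped.
            names = [node.module or ""]
        else:
            continue
        for name in names:
            if name.split(".")[0] in FORBIDDEN:
                found.append(name)
    return found
```

A CI step could run this over every file under core/ and fail the build when the returned list is non-empty.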