Primitive 02

Context Field

A continuously maintained, ranked projection of all memory sources, scored against a continuation's goal frame. Not a database the agent queries — a field it reads.

Why fields, not queries

Today's RAG pipelines put the agent in a pull model: the agent decides what to retrieve, fires a query, gets results, formats them. Human researchers don't work that way — relevant context surfaces to them based on what they're doing. A context field is the streaming materialized view that makes that surfacing possible.

The continuation does not query memory. It reads the top of its field, which is always fresh, ranked, and within a token budget. The big model never sees the memory hierarchy directly — it sees the field.

Channels feeding the field

| Channel    | Underlying store                      | Latency  | Candidate type               |
|------------|---------------------------------------|----------|------------------------------|
| semantic   | Qdrant / OpenSearch / Azure AI Search | < 50 ms  | document chunks              |
| structured | Postgres / Cosmos DB                  | < 20 ms  | facts, rows, entities        |
| episodic   | Continuation event log                | < 30 ms  | prior decisions, findings    |
| graph      | Neo4j / Neptune                       | < 100 ms | entity-relation paths        |
| stream     | NATS / Kafka / Event Grid             | push     | newly arrived items          |
| human      | Notebook annotations                  | push     | researcher notes, corrections |

The ranking cascade

Stage 1 — cheap recall

For each channel, fetch top-K candidates by channel-native similarity. Roughly 200–500 candidates total. Cheap and fast.
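A minimal sketch of the Stage-1 fan-out, assuming each channel object exposes a hypothetical `top_k(goal_frame, k)` method returning candidates ranked by its native similarity:

```python
from concurrent.futures import ThreadPoolExecutor


def recall(channels, goal_frame, top_k=64):
    """Stage 1: query every channel in parallel, union the candidates.

    Each channel fetcher is assumed to return up to top_k candidates by
    channel-native similarity (vector distance, SQL match, graph walk, ...).
    """
    with ThreadPoolExecutor(max_workers=max(len(channels), 1)) as pool:
        futures = [pool.submit(ch.top_k, goal_frame, top_k) for ch in channels]
        # Union, no dedup yet — Stage 2's diversity penalty handles overlap.
        return [c for f in futures for c in f.result()]
```

With six channels and K around 50–80, this yields the 200–500 candidates the cascade expects.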

Stage 2 — cross-encoder reranking

A small, fast model (Haiku-class or a fine-tuned cross-encoder) scores each candidate against the goal frame:

score(c, g) =
      w_rel  * relevance(c, g)              # cross-encoder output, [0,1]
    + w_rec  * recency_decay(c.ts, now)     # exp(-Δt/τ), τ per source
    + w_auth * source_authority(c.source)   # learned per-source prior
    + w_use  * prior_utility(c, g.tenant)   # was this useful before?
    + w_div  * diversity_penalty(c, picked) # MMR-style anti-redundancy
    - w_cost * inclusion_cost(c)            # tokens to include

Weights are per-tenant learnable parameters, initialized to sensible defaults and refined from the warm store. Cost discipline: cap reranks per continuation per N seconds, cache reranker outputs keyed on (goal_frame_hash, candidate_id).
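The scoring formula above can be sketched directly. This is a minimal illustration, not the production reranker: candidate fields (`relevance`, `ts`, `authority`, `prior_utility`, `tokens`) and the similarity callback are assumptions, and `relevance` stands in for the cross-encoder output — the term you would cache on `(goal_frame_hash, candidate_id)`:

```python
import math


def diversity_penalty(c, picked, sim=lambda a, b: 0.0):
    """MMR-style: minus the max similarity to anything already selected."""
    return -max((sim(c, p) for p in picked), default=0.0)


def score(c, picked, w, now, tau):
    """Stage-2 score; c['relevance'] is assumed precomputed (and cacheable)
    by the cross-encoder for the current goal frame."""
    return (  w["rel"]  * c["relevance"]                    # cross-encoder output, [0,1]
            + w["rec"]  * math.exp(-(now - c["ts"]) / tau)  # recency decay, tau per source
            + w["auth"] * c["authority"]                    # learned per-source prior
            + w["use"]  * c["prior_utility"]                # was this useful before?
            + w["div"]  * diversity_penalty(c, picked)      # anti-redundancy, <= 0
            - w["cost"] * c["tokens"] / 1000.0)             # tokens to include
```

Because the diversity term depends on what is already picked, only the cross-encoder relevance is safe to cache; the rest is recomputed per materialization.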

Stage 3 — field materialization

The top-N items are assembled into a manifest with provenance and an evicted log for explainability.

{
  "field_id": "...",
  "continuation_id": "...",
  "computed_at": "...",
  "ttl_seconds": 300,
  "token_budget": 12000,
  "items": [
    {"rank": 1, "source": "semantic:pubmed", "tokens": 380,
     "score": 0.91, "provenance": "...", "content_ref": "..."},
    ...
  ],
  "evicted": [...]   # what was dropped, and why
}
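Materialization is essentially greedy packing under the token budget, with every drop recorded for explainability. A sketch, assuming candidates carry `id`, `score`, and `tokens` fields:

```python
def materialize(scored, token_budget=12000, top_n=50):
    """Stage 3: pack the highest-scoring items that fit the budget;
    everything else lands in the evicted log with a reason."""
    items, evicted, used = [], [], 0
    for c in sorted(scored, key=lambda c: c["score"], reverse=True):
        if len(items) >= top_n:
            evicted.append({"id": c["id"], "why": "rank > top_n"})
        elif used + c["tokens"] > token_budget:
            evicted.append({"id": c["id"], "why": "over token budget"})
        else:
            items.append(c)
            used += c["tokens"]
    return {"items": items, "evicted": evicted, "token_budget": token_budget}
```

The `evicted` list is what lets a researcher ask "why didn't the field show me X" and get a concrete answer.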

Eviction and decay

The field is not a database — it forgets. Each item has a half-life that depends on its source and the continuation's pace. When a continuation publishes a finding that uses an item, its prior_utility is reinforced, raising its score in future recomputations for similar goal frames in the same tenant.
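The two mechanisms — half-life decay and utility reinforcement — can each be written in a line. A sketch, where the half-life per source and the learning rate are illustrative parameters:

```python
def decayed_score(base, age_s, half_life_s):
    """Half-life decay: after one half-life the item's weight halves."""
    return base * 0.5 ** (age_s / half_life_s)


def reinforce(prior_utility, lr=0.1):
    """Nudge utility toward 1.0 when a published finding uses the item."""
    return prior_utility + lr * (1.0 - prior_utility)
```

Decay forgets; reinforcement remembers what earned its keep — together they keep the field fresh without discarding proven sources.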

This gives Maestro organizational learning as an emergent property: items that prove useful for one consultant's dairy nutrition work get surfaced faster for the next.

Implementation: streaming materialized views

The naive implementation recomputes the field on every tick. The production implementation uses a streaming materialized view — Materialize or RisingWave on open source, or a Flink job on managed platforms. Channels publish change events; the view incrementally updates the field; the continuation reads the latest snapshot. The reranker is the dominant cost; everything else is cheap.
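The incremental shape of that view can be illustrated independently of the engine. A toy sketch, with an assumed event format (`op`, `candidate`/`id`): each change event patches the cached snapshot and re-sorts, rather than recomputing the whole cascade.

```python
def apply_event(field, event, score_fn):
    """Apply one channel change event to the current field snapshot.

    Incremental analogue of the materialized view: only the touched
    candidate is rescored; the rest of the snapshot is reused.
    """
    items = {c["id"]: c for c in field["items"]}
    if event["op"] == "upsert":
        c = dict(event["candidate"], score=score_fn(event["candidate"]))
        items[c["id"]] = c
    elif event["op"] == "delete":
        items.pop(event["id"], None)
    ranked = sorted(items.values(), key=lambda c: c["score"], reverse=True)
    return {**field, "items": ranked}
```

Materialize or Flink do the same thing at scale, with the reranker call as the one expensive operator in the dataflow.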