matrix-agent-validated

What this mode does

Runs the scrum master pipeline. It walks the target source files and compares each one to the PRD via cloud LLMs (Grok 4.1 / DeepSeek V4 / Qwen3-235B, with same-model retry). An observer hand-reviews each candidate, and the applier lands surgical patches only when they pass three gates: cargo-green, warning-stable, and rationale-aligned. Designed for finding code drift against the PRD and committing only verified fixes.
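The three gates can be pictured as a single accept/reject check per candidate patch. The following is a minimal sketch, not the applier's actual code: the `PatchCandidate` fields and the `evaluateGates` function are hypothetical names, and it assumes "warning-stable" means the compiler warning count does not increase.

```typescript
// Hypothetical shape of one candidate patch; every field name is an assumption.
interface PatchCandidate {
  file: string;
  cargoCheckPassed: boolean; // did the build pass with the patch applied?
  warningsBefore: number;    // compiler warnings before the patch
  warningsAfter: number;     // compiler warnings after the patch
  observerAccepted: boolean; // did the observer find the rationale aligned?
}

type GateName = "cargo-green" | "warning-stable" | "rationale-aligned";

interface GateResult {
  accepted: boolean;
  failed: GateName[];
}

// Runs all three gates and reports which ones failed, in a fixed order.
function evaluateGates(c: PatchCandidate): GateResult {
  const failed: GateName[] = [];
  if (!c.cargoCheckPassed) failed.push("cargo-green");
  // Assumption: "stable" means the warning count did not go up.
  if (c.warningsAfter > c.warningsBefore) failed.push("warning-stable");
  if (!c.observerAccepted) failed.push("rationale-aligned");
  return { accepted: failed.length === 0, failed };
}
```

Collecting every failed gate, rather than stopping at the first, keeps the outcome record useful when a patch is rejected for more than one reason.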

Run a scrum loop

Target files (comma-sep):

Max iters:

no scrum loop running

Recent scrum reviews (per file)

…

Recent applier outcomes

…

Recent loop iterations

…

What this mode does

Checks system health and matrix integrity: probes each service, audits the pathway memory state file, lists vector indexes, and shows the raw bucket inventory. Use it when something feels wrong or you want to confirm the state of the data plane.

Pathway memory size

…

growing = traces being added; shrinking = retire/revise consolidation

Vector indexes

…

corpora the matrix retrieves from

Pathway traces

…

total / retired / replays

Service detail

…

Raw bucket (s3://raw)

…

What this mode does

Runs Chicago contract analysis as a benchmark: picks N high-cost permits, queries the matrix across 6 corpora (chicago_permits + entity_brief + sec_tickers + llm_team_runs + llm_team_response_cache + distilled_procedural), and produces structured staffing recommendations. Each result gets an observer hand-review (cloud or heuristic). Use it to compare retrieval quality across corpus configurations.

Run benchmark

Permits to analyze:

no benchmark running

Recent observer verdicts

…

Recent contract analyses

…

Notes

The contract analyzer is a general-purpose harness for any task class with vectorized corpora. Currently wired to Chicago permits, but the same pipeline can target a different domain by:

Adding the corpus to the raw/ bucket and vectorizing it
Updating CONTRACT_CORPORA in analyze_chicago_contracts.ts
Editing the prompt template for the new domain
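The second step above might look roughly like the sketch below. It assumes CONTRACT_CORPORA is a plain list of corpus names; the real definition in analyze_chicago_contracts.ts may be shaped differently. The `withCorpus` helper and the `my_domain_docs` name are hypothetical placeholders.

```typescript
// Assumption: CONTRACT_CORPORA is a flat list of corpus names.
// The six names are the ones listed in the mode description.
const CONTRACT_CORPORA: readonly string[] = [
  "chicago_permits",
  "entity_brief",
  "sec_tickers",
  "llm_team_runs",
  "llm_team_response_cache",
  "distilled_procedural",
];

// Hypothetical helper: returns a new list with one corpus added,
// leaving the input untouched and skipping duplicates.
function withCorpus(corpora: readonly string[], name: string): string[] {
  return corpora.indexOf(name) >= 0 ? corpora.slice() : corpora.concat([name]);
}

// "my_domain_docs" is a placeholder, not a real corpus.
const extendedCorpora = withCorpus(CONTRACT_CORPORA, "my_domain_docs");
```

Returning a new list instead of mutating the original makes it easier to run the same benchmark against several corpus configurations side by side.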

What this mode does

Tracks the L4 deployment pipeline: taking the proven L1-L3 codebase and standing up a fresh production instance (S3 + database + hybrid search + agent runtime) on a different server via Ansible. Each stage shows its status and evidence, so you know what is done and what is pending.
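One way to model "status + evidence" per stage is sketched below. The `DeploymentStage` type and both helper functions are hypothetical, not taken from the playbook. The sketch assumes a stage only counts as verified when it is marked done and has non-empty evidence attached.

```typescript
type StageStatus = "done" | "pending";

// Hypothetical record for one deployment stage.
interface DeploymentStage {
  name: string;
  status: StageStatus;
  evidence: string | null; // e.g. a log excerpt or check result; null if none yet
}

// A stage is treated as verified only if it is done AND carries evidence.
function isVerified(stage: DeploymentStage): boolean {
  return stage.status === "done" && stage.evidence !== null && stage.evidence.length > 0;
}

// Names of stages that still need work or proof, in their original order.
function unverifiedStages(stages: DeploymentStage[]): string[] {
  return stages.filter((s) => !isVerified(s)).map((s) => s.name);
}
```

Treating "done without evidence" as unverified keeps the view honest: a stage cannot be reported as complete on status alone.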

Deployment stages

…

Next: telephony autonomous setup

Per your project plan: Asterisk + Pipecat + gpt-realtime + pgvector memory in a 5-layer telephony architecture. This is not in the current checkpoint; it would extend the playbook with new roles for the Asterisk install, the Pipecat container, and the voice routing config. It is tracked separately under L5 (production telephony agent).

What this mode does

Active agent runtime: fire the agent harness, watch it execute step by step, and see what gets sealed to the matrix. The harness uses the proven cross-machine path (container as agent, source-box as data plane). Each successful run is UPSERTed to pathway_memory; failed runs can be RETIRED.
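The UPSERT and RETIRE behavior can be illustrated with an in-memory stand-in. This is a sketch under stated assumptions, not the real pathway_memory store: the `PathwayMemorySketch` class and its fields are hypothetical, and it assumes that upserting an existing id counts as a replay and marks the trace active again.

```typescript
type TraceStatus = "active" | "retired";

// Hypothetical trace record; field names are assumptions.
interface PathwayTrace {
  id: string;
  status: TraceStatus;
  replays: number;
  output: string;
}

class PathwayMemorySketch {
  private traces: { [id: string]: PathwayTrace } = {};

  // Insert a new trace, or update an existing one and count it as a replay.
  upsert(id: string, output: string): PathwayTrace {
    const existing = this.traces[id];
    if (existing) {
      existing.output = output;
      existing.status = "active"; // assumption: a successful rerun reactivates
      existing.replays += 1;
      return existing;
    }
    const created: PathwayTrace = { id, status: "active", replays: 0, output };
    this.traces[id] = created;
    return created;
  }

  // Mark a trace as retired; returns false if the id is unknown.
  retire(id: string): boolean {
    const trace = this.traces[id];
    if (!trace) return false;
    trace.status = "retired";
    return true;
  }

  // Aggregate counts in the "total / retired / replays" form.
  counts(): { total: number; retired: number; replays: number } {
    let total = 0;
    let retired = 0;
    let replays = 0;
    for (const id of Object.keys(this.traces)) {
      const trace = this.traces[id];
      if (!trace) continue;
      total += 1;
      if (trace.status === "retired") retired += 1;
      replays += trace.replays;
    }
    return { total, retired, replays };
  }
}
```

Retiring flips a status flag instead of deleting the record, so a retired trace still counts toward the total.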

Run agent harness

no agent running

Archived sessions

…

Latest trace (last run)

…

Latest final output

…
