Steve Hutchinson
Big Pines

Arbitration

The process that compares proposals from multiple agents, scores candidates across four dimensions (coherence, predicted reward, memory alignment, risk), and selects a final decision for execution. Memory alignment is capped at five memories to prevent quantity from substituting for quality.

Arbitration is the decision layer that sits between multi-agent deliberation and actual execution. After the planner, executor, critic, and world-model agents have each contributed their outputs, arbitration scores every candidate proposal across four dimensions: coherence (is the proposal internally consistent and aligned with current goals?), predicted reward (what reinforcement outcome does the world model anticipate?), memory alignment (how many retrieved memories support this proposal, counting at most five?), and risk (what is the probability and severity of negative outcomes?). The cap on memory alignment is significant: it prevents a proposal backed by many low-quality memories from outscoring one backed by a few high-quality ones. Arbitration weights are themselves part of the policy state and can drift over time through reinforcement: if the current weights consistently select proposals that lead to poor outcomes, the policy engine adjusts them. The full arbitration decision, including each candidate's score on every dimension and the winning proposal, is recorded in the audit stream.
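To make the scoring step concrete, here is a minimal sketch of how four-dimension arbitration with a capped memory term might look. Everything beyond the four dimension names and the cap of five is an assumption made for illustration: the `Proposal` fields, the 0-to-1 score ranges, the linear weighted sum with risk subtracted, the default weight values, and the shape of the audit record are all hypothetical, not the system's actual implementation.

```python
from dataclasses import dataclass

MEMORY_CAP = 5  # from the text: memory alignment counts at most five memories


@dataclass(frozen=True)
class Proposal:
    """Hypothetical candidate proposal; field names and ranges are assumptions."""
    name: str
    coherence: float          # assumed 0..1, higher is better
    predicted_reward: float   # assumed 0..1, higher is better
    supporting_memories: int  # number of retrieved memories that support it
    risk: float               # assumed 0..1, higher is worse


# Illustrative weights only. In the text these are part of the policy state
# and are adjusted by the policy engine over time.
DEFAULT_WEIGHTS = {
    "coherence": 0.3,
    "predicted_reward": 0.3,
    "memory_alignment": 0.2,
    "risk": 0.2,
}


def dimension_scores(p: Proposal) -> dict:
    """Score one proposal on each dimension, applying the memory cap."""
    capped = min(max(p.supporting_memories, 0), MEMORY_CAP)
    return {
        "coherence": p.coherence,
        "predicted_reward": p.predicted_reward,
        "memory_alignment": capped / MEMORY_CAP,  # normalised to 0..1
        "risk": p.risk,
    }


def total_score(scores: dict, weights: dict) -> float:
    """Assumed combination rule: weighted sum, with risk counted as a penalty."""
    return (
        weights["coherence"] * scores["coherence"]
        + weights["predicted_reward"] * scores["predicted_reward"]
        + weights["memory_alignment"] * scores["memory_alignment"]
        - weights["risk"] * scores["risk"]
    )


def arbitrate(proposals, weights=DEFAULT_WEIGHTS):
    """Return the winning proposal and a record suitable for an audit stream."""
    if not proposals:
        raise ValueError("arbitration needs at least one proposal")
    candidates = []
    for p in proposals:
        scores = dimension_scores(p)
        candidates.append((p, scores, total_score(scores, weights)))
    winner = max(candidates, key=lambda c: c[2])[0]
    record = {
        "winner": winner.name,
        "weights": dict(weights),
        "candidates": [
            {"name": p.name, "scores": s, "total": t} for p, s, t in candidates
        ],
    }
    return winner, record
```

In this sketch, a proposal backed by forty memories gets the same memory-alignment score as one backed by five, so the remaining dimensions decide the outcome. Note that the cap limits quantity only; weighing the quality of individual memories would need a separate per-memory score, which this sketch does not model.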
