Fast mode
A budget engine operating mode that reduces retrieval depth, reasoning complexity, and the number of agents dispatched to meet a latency target. Activated when computational resources are constrained or when the current operation's utility score falls below the threshold warranting full slow-mode processing.
The budget engine chooses between fast mode and slow mode for each operation by comparing the operation's utility score against a threshold. Fast mode trades reasoning quality for speed: it retrieves fewer memories, dispatches fewer agents (or skips some roles entirely), reduces world-model prediction depth, and may skip cross-encoder reranking. The goal is graceful degradation: the system remains functional under resource pressure rather than failing or timing out. The decision to enter fast mode is recorded in the audit stream and reflected in the cognitive session's telemetry. Operations processed in fast mode may produce lower-quality proposals that the critic flags more often, and their reinforcement signals are adjusted to account for the reduced deliberation depth. Persistent fast-mode triggering signals that the system needs more resources or that event volume has exceeded provisioned capacity.
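The selection rule described above can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration, not the engine's actual interface: the names `Mode`, `ModeProfile`, `select_mode`, and `plan_operation`, the default threshold of 0.5, the numeric limits in each profile, and the shape of the audit record are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    FAST = "fast"
    SLOW = "slow"


@dataclass(frozen=True)
class ModeProfile:
    """Processing limits applied to one operation (hypothetical fields)."""
    max_memories: int       # retrieval depth
    max_agents: int         # number of agents dispatched
    prediction_depth: int   # world-model prediction depth
    rerank: bool            # whether cross-encoder reranking runs


# Illustrative values only; real limits would come from configuration.
PROFILES = {
    Mode.SLOW: ModeProfile(max_memories=50, max_agents=5, prediction_depth=3, rerank=True),
    Mode.FAST: ModeProfile(max_memories=10, max_agents=2, prediction_depth=1, rerank=False),
}


def select_mode(utility: float, resources_constrained: bool,
                utility_threshold: float = 0.5) -> Mode:
    """Pick fast mode under resource pressure or when utility is below threshold."""
    if resources_constrained or utility < utility_threshold:
        return Mode.FAST
    return Mode.SLOW


def plan_operation(utility: float, resources_constrained: bool,
                   audit_log: list, utility_threshold: float = 0.5) -> ModeProfile:
    """Choose a mode, record a fast-mode decision, and return the matching limits."""
    mode = select_mode(utility, resources_constrained, utility_threshold)
    if mode is Mode.FAST:
        # Stand-in for writing to the audit stream; the record shape is assumed.
        audit_log.append({
            "event": "fast_mode_entered",
            "utility": utility,
            "resources_constrained": resources_constrained,
        })
    return PROFILES[mode]
```

In this sketch, a caller would pass the returned profile to the retrieval and dispatch steps, and the share of operations that append to `audit_log` gives a rough view of how often fast mode is triggered.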