March 7, 2026·5 min read·Stage 12·Cognitive Substrate

Long-Horizon Goals

This article describes the goal system that organizes behavior across multiple time horizons and feeds goal relevance back into reinforcement and retrieval.

agents attention memory-systems policy-engine reinforcement typescript

← Previous: World Model

12 / 48

Next → Multi-Agent Society

Why goals need hierarchy

Goal hierarchy: meta goals decompose down through long, mid, short, and micro horizons to the next action, with progress events feeding revision back up the chain.

A reactive agent can complete local tasks without preserving direction. It handles each input, produces a response, and moves on. Nothing connects the current action to a longer arc of purpose. Without structure across time, the agent has no way to prefer an action that is slightly worse now but enables much better actions later, no mechanism to resist short-term reward when it conflicts with longer-term value, and no way to detect that a series of individually reasonable decisions is collectively moving in the wrong direction.

Hierarchical goals solve this by giving the system explicit representations of what it is trying to accomplish across multiple timescales simultaneously.

The five goal horizons

The goal system organizes intention into five nested levels:

Micro goals guide the next action. At this level, goals are specific and immediate: retrieve context, propose a query, check a constraint. Micro goals are generated automatically from the current task and do not require long-term planning.

Short goals guide the current session. They represent what the system is trying to accomplish before the session ends: answer a specific question, diagnose a specific problem, complete a workflow. Short goals persist across multiple micro-goal cycles.

Mid goals organize projects. They represent objectives that span multiple sessions and require sustained effort over days or weeks. A mid goal might be "build reliable incident detection for the production cluster" or "identify and retire contradictory memories in the knowledge base."

Long goals preserve durable direction. They represent values, priorities, and strategic aims that should survive many mid-goal completions. A long goal might be "maintain high calibration on retrieval quality" or "prefer cautious action when evidence is uncertain."

Meta goals regulate how goals themselves are selected, revised, and balanced. A meta goal might say "when short and long goals conflict, prefer long goals unless the short-term cost is severe" or "review mid goals every fifty sessions for continued relevance."
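The five levels can be sketched as a small data model. The article does not give a schema, so every name here (Horizon, Goal, parentId, isValidParent, ancestry) is an assumption chosen for illustration. The sketch shows only the nesting: each goal may point at a longer-horizon parent, and walking those links connects a micro goal back to the long goal it serves.

```typescript
// Hypothetical schema: the article names the five horizons but defines no types.
type Horizon = "micro" | "short" | "mid" | "long" | "meta";

interface Goal {
  id: string;
  horizon: Horizon;
  description: string;
  parentId?: string; // the longer-horizon goal this one serves, if any
}

// Ordered from shortest to longest horizon, with meta goals last.
const HORIZON_ORDER: readonly Horizon[] = ["micro", "short", "mid", "long", "meta"];

// A parent must sit at a strictly longer horizon than its child.
function isValidParent(child: Goal, parent: Goal): boolean {
  return HORIZON_ORDER.indexOf(parent.horizon) > HORIZON_ORDER.indexOf(child.horizon);
}

// Walk parent links from a goal up to its longest-horizon ancestor.
function ancestry(goal: Goal, byId: Map<string, Goal>): Goal[] {
  const chain: Goal[] = [goal];
  const seen = new Set<string>([goal.id]);
  let current = goal;
  while (current.parentId !== undefined) {
    const parent = byId.get(current.parentId);
    // Stop on a missing link or a cycle instead of looping forever.
    if (parent === undefined || seen.has(parent.id)) break;
    chain.push(parent);
    seen.add(parent.id);
    current = parent;
  }
  return chain;
}
```

Keeping the parent link on the child rather than a child list on the parent is a design choice for this sketch: it makes the question "which longer goal does this action serve?" a simple upward walk.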

Goal decomposition and upward revision

Longer goals are decomposed into shorter subgoals. Decomposition lets the agent connect a distant objective to the next executable step. Without it, a long goal like "improve retrieval calibration" cannot influence which memories to retrieve, which queries to try, or which experiments to run.

Decomposition is not one-way. Progress or failure at lower levels can revise higher-level expectations. If a mid goal turns out to be based on a false assumption (the system discovers that its retrieval calibration is fine but its consolidation quality is the actual problem), the mid goal should be revised. If repeated attempts to achieve a short goal all fail, the short goal may need to be replaced with a different decomposition of the same mid goal.

This bidirectional revision prevents the goal system from becoming a rigid plan that the system is committed to regardless of what it learns. Goals are hypotheses about how to achieve higher-level objectives, and they should update when the evidence warrants it.
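One way to picture upward revision is a failure counter on each subgoal. The sketch below is an assumption, not the series' implementation: SubgoalState, Revision, recordFailure, and the threshold of three consecutive failures are all hypothetical. It shows a repeatedly failing short goal being marked as replaced, with a signal that its parent needs a different decomposition.

```typescript
// Hypothetical state for one subgoal; field names are illustrative only.
interface SubgoalState {
  id: string;
  parentId: string;
  consecutiveFailures: number;
  status: "active" | "replaced";
}

// What the goal system should do after a failed attempt.
type Revision =
  | { kind: "keep"; goalId: string }
  | { kind: "redecompose"; parentId: string; replacedGoalId: string };

// Assumed threshold; the article only says "repeated attempts ... all fail".
const MAX_CONSECUTIVE_FAILURES = 3;

// Record one failure and decide whether the parent needs a new decomposition.
function recordFailure(goal: SubgoalState): { goal: SubgoalState; revision: Revision } {
  const failures = goal.consecutiveFailures + 1;
  if (failures >= MAX_CONSECUTIVE_FAILURES) {
    return {
      goal: { ...goal, consecutiveFailures: failures, status: "replaced" },
      revision: { kind: "redecompose", parentId: goal.parentId, replacedGoalId: goal.id },
    };
  }
  return {
    goal: { ...goal, consecutiveFailures: failures },
    revision: { kind: "keep", goalId: goal.id },
  };
}
```

The function returns a new state rather than mutating its input, so the revision decision can be logged or replayed alongside the state it was made from.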

Goals as active context

Selected goals become part of the context hydrated into the agent loop. This makes goals operational rather than decorative. When the memory agent retrieves relevant experience, it can weight memories that relate to active goals more heavily. When the planner proposes a strategy, it can reason explicitly about whether the strategy advances the current goal. When arbitration scores candidate actions, goal alignment is a factor.
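As a rough illustration of goal-weighted retrieval, the sketch below boosts a memory's relevance for each active goal it relates to. The article does not specify a scoring formula, so MemoryCandidate, GOAL_BOOST, and the multiplicative form are assumptions made for this example.

```typescript
// Hypothetical retrieval candidate; `goalIds` lists goals the memory relates to.
interface MemoryCandidate {
  id: string;
  relevance: number; // base retrieval relevance, before goal weighting
  goalIds: string[];
}

// Assumed boost per matching active goal; not a value from the article.
const GOAL_BOOST = 0.25;

// Scale base relevance by the number of active goals the memory touches.
function goalWeightedScore(memory: MemoryCandidate, activeGoalIds: Set<string>): number {
  const matches = memory.goalIds.filter((id) => activeGoalIds.has(id)).length;
  return memory.relevance * (1 + GOAL_BOOST * matches);
}

// Return a new array sorted by goal-weighted score, highest first.
function rankByGoalRelevance(
  memories: MemoryCandidate[],
  activeGoalIds: Set<string>,
): MemoryCandidate[] {
  return [...memories].sort(
    (a, b) => goalWeightedScore(b, activeGoalIds) - goalWeightedScore(a, activeGoalIds),
  );
}
```

With this weighting, a memory with slightly lower base relevance can outrank a more relevant one if it relates to an active goal, which is the behavior the paragraph above describes.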

This integration creates the long-horizon continuity that distinguishes a cognitive agent from a reactive one. Each decision is made with awareness of what the system is trying to accomplish, not just what the current input is asking for.

Goal progress as a reinforcement signal

Goal progress is tracked through goal.progress events. These events record movement toward a goal, blockers that prevented progress, goal completion, and regression (when earlier progress is lost due to new information or a bad action).
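The four kinds of record described above map naturally onto a discriminated union. Only the event name goal.progress comes from the article; the field names, the kind values, and the 0-to-1 progress scale below are assumptions for illustration.

```typescript
// Hypothetical payload shapes for the `goal.progress` event.
type GoalProgressEvent =
  | { type: "goal.progress"; goalId: string; kind: "advanced"; delta: number }
  | { type: "goal.progress"; goalId: string; kind: "blocked"; reason: string }
  | { type: "goal.progress"; goalId: string; kind: "completed" }
  | { type: "goal.progress"; goalId: string; kind: "regressed"; delta: number };

// Assumed running state for one goal, with progress on a 0..1 scale.
interface ProgressState {
  progress: number;
  blockers: string[];
  done: boolean;
}

// Fold one event into the running state without mutating the input.
function applyProgressEvent(state: ProgressState, event: GoalProgressEvent): ProgressState {
  switch (event.kind) {
    case "advanced":
      return { ...state, progress: Math.min(1, state.progress + event.delta) };
    case "blocked":
      return { ...state, blockers: [...state.blockers, event.reason] };
    case "completed":
      return { ...state, progress: 1, done: true };
    case "regressed":
      // Regression undoes earlier progress and reopens a completed goal.
      return { ...state, progress: Math.max(0, state.progress - event.delta), done: false };
  }
}
```

Modeling the kinds as a union lets the compiler check that every kind is handled; adding a fifth kind would surface as a type error in the switch.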

Progress signals feed directly into reinforcement scoring. An experience that advances an important long goal becomes more valuable for future retrieval and policy learning than one that only addresses an immediate micro goal. This means the reinforcement system naturally allocates higher priority to memories that contributed to sustained goal progress, creating a preference for strategies that were useful not just now but across many sessions.
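A minimal sketch of horizon-weighted reinforcement follows. The article states that progress on longer goals should count for more but gives no formula, so the additive form, the HORIZON_WEIGHT values, and the names GoalContribution and reinforcementScore are all assumptions.

```typescript
// Horizons that can receive progress in this sketch (meta goals are left out).
type ProgressHorizon = "micro" | "short" | "mid" | "long";

// Assumed weights: longer horizons count for more. Values are illustrative.
const HORIZON_WEIGHT: Record<ProgressHorizon, number> = {
  micro: 1,
  short: 1.5,
  mid: 2,
  long: 3,
};

// Hypothetical record of how much an experience moved one goal.
interface GoalContribution {
  horizon: ProgressHorizon;
  delta: number; // positive for progress, negative for regression
}

// Add horizon-weighted goal progress on top of a base reward.
function reinforcementScore(baseReward: number, contributions: GoalContribution[]): number {
  const goalTerm = contributions.reduce(
    (sum, c) => sum + HORIZON_WEIGHT[c.horizon] * c.delta,
    0,
  );
  return baseReward + goalTerm;
}
```

Under these assumed weights, the same amount of progress is worth three times as much on a long goal as on a micro goal, and a negative delta lowers the score in the same proportion.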

Regression signals are equally important. A regression event indicates that the system made a decision that undid previous progress toward a goal. Frequent regressions on a specific goal type are evidence that the system's strategy for that goal class is unreliable, which can trigger additional retrieval, reflection, or world-model simulation before future attempts.
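The regression check can be sketched as a rate over recent attempts for one goal class. The threshold, the minimum sample size, and the names AttemptOutcome, Safeguard, and safeguardsFor are assumptions; the three safeguards are the ones the paragraph above lists.

```typescript
// Hypothetical outcome labels for past attempts on one goal class.
type AttemptOutcome = "advanced" | "blocked" | "regressed";

// Extra steps the article says frequent regressions can trigger.
type Safeguard = "additional-retrieval" | "reflection" | "world-model-simulation";

// Assumed tuning values; the article does not specify any.
const REGRESSION_RATE_THRESHOLD = 0.3;
const MIN_ATTEMPTS = 5;

// Fraction of attempts that ended in regression (0 for an empty history).
function regressionRate(outcomes: AttemptOutcome[]): number {
  if (outcomes.length === 0) return 0;
  const regressions = outcomes.filter((o) => o === "regressed").length;
  return regressions / outcomes.length;
}

// Decide which safeguards to run before the next attempt on this goal class.
function safeguardsFor(outcomes: AttemptOutcome[]): Safeguard[] {
  // With too little history the rate is noise, so do not react yet.
  if (outcomes.length < MIN_ATTEMPTS) return [];
  return regressionRate(outcomes) > REGRESSION_RATE_THRESHOLD
    ? ["additional-retrieval", "reflection", "world-model-simulation"]
    : [];
}
```

This sketch returns all three safeguards once the threshold is crossed; a real system would more likely choose among them based on cost and on which kind of failure is recurring.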

Why goals arrive at this point in the series

The goal system is introduced after multi-agent decomposition and arbitration but before the full self-regulation stack. This placement is not arbitrary.

Goals require memory (to track progress across sessions), policy (to encode goal-related behavioral preferences), and arbitration (to weight goal-relevant candidates over goal-irrelevant ones). All three are established by this point.

Goals also create the motivation structure that makes attention, temporal cognition, and cognitive economics meaningful. Attention without goals is just salience; with goals, it is directed priority allocation. Temporal cognition without goals is just urgency; with goals, it is deadline management relative to what the system is trying to achieve. Each later stage in the self-regulation arc makes more sense once the goal structure that motivates it is in place.

The next article describes the multi-agent society: how specialized agent roles are organized into a runtime society with an orchestrator, typed interfaces, and horizontal scalability, marking the first fully deployable stage of the architecture.
