April 9, 2026 · 6 min read · Stage 26 · Cognitive Substrate

Curiosity Engine

This article describes the curiosity engine that rewards information gain, uncertainty reduction, novelty, and autonomous experimentation.

agents attention memory-systems policy-engine reinforcement typescript

← Previous: Causal Intelligence

26 / 48

Next →: Dreaming System

Intrinsic motivation

Curiosity engine: uncertainty and novelty signals combine into information gain estimates and experiment proposals that pass through a safety and budget gate before execution.

External reward is not enough for a system that must operate in changing environments and handle situations it has never encountered. A system optimized purely for explicit task rewards will exploit what it knows and ignore unknowns, even when those unknowns might matter later. This is the exploration-exploitation tradeoff, and systems that resolve it entirely toward exploitation become brittle when conditions change.

Curiosity is the mechanism for intrinsic motivation: a drive to seek information, reduce uncertainty, and explore novel states that is not contingent on external reward signals. It drives exploration not because the system has been instructed to explore, but because unexplored states with potential information gain are themselves valuable.

Information gain as the core signal

The curiosity engine's primary signal is expected information gain: the reduction in uncertainty that a proposed action, observation, or memory retrieval would produce.

Experiment 25 demonstrated the engine's behavior across a four-phase incident lifecycle. In the normal phase (low novelty, low uncertainty, high visitedCount for all states), curiosity priorities were near 0.19 for all candidates. In the outage phase (novelty near 0.85, uncertainty near 0.85, visitedCount = 0 for outage states), the top curiosity priority reached 0.90, and five experiment proposals were generated.

The curiosity formula combines four factors:

$curiosity = infoGain \times 0.4 + novelty \times 0.25 + uncertainty \times 0.25 + \frac{1}{1 + visitedCount} \times 0.1$

Outage states scored roughly $4.7\times$ higher than normal states ($0.90$ vs $0.19$) because all four factors were simultaneously high: information gain was high (the system had no model of outage dynamics), novelty was high (outage states had not been seen before), uncertainty was high (predictions for outage outcomes were unreliable), and $visitedCount$ was zero.

The weights in the formula encode a specific priority ordering. Information gain ($40\%$) is the largest single factor; the system prioritizes learning that reduces important uncertainties. Novelty and uncertainty are secondary ($25\%$ each); each contributes less than information gain, although together they account for half of the score. Exploration drive from low visit count is a small bonus ($10\%$) that prevents the system from completely ignoring rarely visited states.
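As a minimal sketch, the formula can be written as a pure TypeScript function. The interface and function names here are hypothetical, and the assumption that each signal is normalized to the range 0 to 1 is mine rather than something the experiment states. The `infoGain` values in the sample calls are illustrative only.

```typescript
// Hypothetical input shape; the field names mirror the terms of the formula.
interface CuriositySignals {
  infoGain: number;     // expected information gain, assumed normalized to [0, 1]
  novelty: number;      // assumed normalized to [0, 1]
  uncertainty: number;  // assumed normalized to [0, 1]
  visitedCount: number; // number of prior visits, must be >= 0
}

// Weights as given in the formula above.
const WEIGHTS = { infoGain: 0.4, novelty: 0.25, uncertainty: 0.25, visit: 0.1 } as const;

function curiosityPriority(s: CuriositySignals): number {
  if (s.visitedCount < 0) {
    throw new RangeError("visitedCount must be non-negative");
  }
  // The visit bonus is 1 for an unvisited state and decays toward 0 with repeat visits.
  const visitBonus = 1 / (1 + s.visitedCount);
  return (
    s.infoGain * WEIGHTS.infoGain +
    s.novelty * WEIGHTS.novelty +
    s.uncertainty * WEIGHTS.uncertainty +
    visitBonus * WEIGHTS.visit
  );
}

// Illustrative inputs: novelty, uncertainty, and visitedCount follow the outage
// phase described above; the infoGain values are assumed for the example.
const outageLike = curiosityPriority({ infoGain: 0.9, novelty: 0.85, uncertainty: 0.85, visitedCount: 0 });
const wellVisited = curiosityPriority({ infoGain: 0.1, novelty: 0.1, uncertainty: 0.1, visitedCount: 50 });
```

Because the four weights sum to 1, the score stays in the range 0 to 1 whenever the three signals do, which keeps priorities comparable across candidates.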

The difference between novelty and distractibility

A curiosity signal based only on novelty would produce a system that chases anything unexpected regardless of whether it is relevant. A red flag on a traffic sign and an incident alert from a production service are both novel; they are not equally worth investigating.

The information gain component addresses this. Information gain is not just "is this new?" but "does understanding this reduce uncertainty that matters?" Novelty is necessary but not sufficient for high curiosity priority. A novel signal that does not reduce any important uncertainty (it is just unusual, not informative about the things the system cares about) will score high on novelty but low on information gain.

This is the difference between curiosity and distractibility. Distractibility is driven by novelty alone. Curiosity is driven by expected learning value. The engine can exhibit either depending on the weight assigned to the novelty term, which is why the relative weights matter and why they were chosen carefully rather than set by intuition.
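The effect of the novelty weight can be illustrated with a small comparison. The sketch below scores two hypothetical signals under the article's weights and under a novelty-heavy alternative; the alternative weights and both sets of signal values are invented for illustration and do not come from any experiment.

```typescript
interface Signals { infoGain: number; novelty: number; uncertainty: number; visitedCount: number; }
interface Weights { infoGain: number; novelty: number; uncertainty: number; visit: number; }

// Same structure as the curiosity formula, with the weights passed in.
function score(s: Signals, w: Weights): number {
  return (
    s.infoGain * w.infoGain +
    s.novelty * w.novelty +
    s.uncertainty * w.uncertainty +
    w.visit / (1 + s.visitedCount)
  );
}

// Weights from the article's formula.
const articleWeights: Weights = { infoGain: 0.4, novelty: 0.25, uncertainty: 0.25, visit: 0.1 };
// Hypothetical weights that let novelty dominate (the "distractible" setting).
const noveltyHeavy: Weights = { infoGain: 0.1, novelty: 0.7, uncertainty: 0.1, visit: 0.1 };

// Hypothetical signal that is very unusual but teaches the system little.
const unusualButIrrelevant: Signals = { infoGain: 0.1, novelty: 0.95, uncertainty: 0.3, visitedCount: 0 };
// Hypothetical signal that is only moderately novel but highly informative.
const informative: Signals = { infoGain: 0.9, novelty: 0.5, uncertainty: 0.3, visitedCount: 2 };

// Returns the name of the signal that the given weights would rank first.
function preferred(w: Weights): "informative" | "unusualButIrrelevant" {
  return score(informative, w) > score(unusualButIrrelevant, w)
    ? "informative"
    : "unusualButIrrelevant";
}
```

With these inputs, `preferred(articleWeights)` selects the informative signal, while `preferred(noveltyHeavy)` selects the unusual one: the same engine shifts from curious to distractible purely through its weights.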

Autonomous experiment proposals

When curiosity priority is high, the engine proposes experiments: bounded actions designed to gather information rather than to achieve a goal. An experiment might be an active inference probe (described in the grounding article), an additional retrieval query, an alternative plan tested in simulation (described in the dreaming article), or a controlled action with a small blast radius.

Experiment 25 generated five proposals during the outage phase. Each proposal included the proposed action, the expected information gain, the uncertainty it was targeting, and the resources it would consume.

Experiment proposals are subject to the same governance as any other proposed action. They go through the budget gate (is the expected information gain worth the resource cost?), the constitutional check (does the experiment violate any invariants?), and the attention system (is the timing appropriate?). The curiosity engine proposes; it does not execute unilaterally.

This constraint prevents curiosity from becoming disruptive. A production system with unlimited curiosity-driven exploration would generate constant noise and potentially harmful interventions. Constrained curiosity, operating within the same resource and safety limits as all other operations, generates exploration that is proportional to its expected value.
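The propose-then-gate flow might be sketched as follows. The proposal fields follow the four items listed above; every type name, gate name, and threshold is a hypothetical placeholder, and the real budget, constitutional, and attention checks are described in their own articles rather than here.

```typescript
// Hypothetical proposal shape, based on the four fields each proposal included.
interface ExperimentProposal {
  action: string;
  expectedInfoGain: number;
  targetUncertainty: string;
  resourceCost: number;
}

// Each gate is a caller-supplied predicate standing in for the real subsystem.
interface Gates {
  withinBudget: (p: ExperimentProposal) => boolean;
  satisfiesInvariants: (p: ExperimentProposal) => boolean;
  timingAppropriate: (p: ExperimentProposal) => boolean;
}

type Decision =
  | { approved: true }
  | { approved: false; rejectedBy: keyof Gates };

// The curiosity engine only proposes; this review step decides whether to run.
function review(p: ExperimentProposal, gates: Gates): Decision {
  const order: (keyof Gates)[] = ["withinBudget", "satisfiesInvariants", "timingAppropriate"];
  for (const name of order) {
    if (!gates[name](p)) {
      return { approved: false, rejectedBy: name };
    }
  }
  return { approved: true };
}

// Placeholder gates with assumed rules, for illustration only.
const exampleGates: Gates = {
  // Assumed rule: require at least 0.5 units of expected gain per unit of cost.
  withinBudget: (p) => p.resourceCost > 0 && p.expectedInfoGain / p.resourceCost >= 0.5,
  // Assumed invariant: experiments never write to production.
  satisfiesInvariants: (p) => !p.action.startsWith("prod-write:"),
  // Assumed: the attention system always allows experiments in this sketch.
  timingAppropriate: () => true,
};
```

Returning the name of the rejecting gate, rather than a bare boolean, keeps rejected proposals auditable. That is a design choice of this sketch, not a documented behavior of the engine.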

Exploration priority in attention and retrieval

Curiosity influences both the attention system and the memory retrieval system, not just the action proposal system.

In attention: high-curiosity items receive elevated salience. An unknown but potentially informative signal receives more attention than a familiar but expected one. This is how the system notices novel situations before they escalate into crises.

In retrieval: the session-relative novelty mechanism described in the memory retrieval article is a form of curiosity applied to memory. Memories not recently accessed are novel relative to the current session, and novelty weight adds retrieval priority to them. This is curiosity over the memory landscape: the system is motivated to revisit neglected areas of its knowledge base.
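One simple way to picture session-relative novelty in retrieval is as an additive bonus for memories not yet accessed in the current session. The sketch below is an assumption-laden illustration: the binary notion of "accessed this session", the additive form, and the default weight of 0.2 are all hypothetical, and the actual mechanism is the one described in the memory retrieval article.

```typescript
// Hypothetical candidate shape: a memory id plus a base relevance score.
interface MemoryCandidate {
  id: string;
  relevance: number;
}

// Ranks candidates by relevance plus an assumed novelty bonus for memories
// that have not been accessed in the current session.
function rankWithSessionNovelty(
  candidates: MemoryCandidate[],
  accessedThisSession: ReadonlySet<string>,
  noveltyWeight = 0.2, // assumed value, for illustration only
): MemoryCandidate[] {
  const priority = (m: MemoryCandidate): number =>
    m.relevance + (accessedThisSession.has(m.id) ? 0 : noveltyWeight);
  // Copy before sorting so the caller's array is left untouched.
  return [...candidates].sort((a, b) => priority(b) - priority(a));
}
```

With a weight of zero the ranking reduces to plain relevance; raising the weight lets neglected memories overtake slightly more relevant ones that the session has already seen.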

These two channels together create a system with both active curiosity (proposing experiments, seeking novel observations) and passive curiosity (remaining open to unexpected signals and neglected memories). The combination is more robust than either alone.

The curiosity-consolidation relationship

Curiosity targets are natural candidates for consolidation. When the system explores an unknown state and reduces its uncertainty about it, the resulting experience is high-value new information. The consolidation system should prioritize consolidating these high-information-gain experiences into semantic memories that future retrieval can access.

The feedback loop is: curiosity identifies high-uncertainty states, exploration generates experience in those states, consolidation transforms the experience into semantic memories, future retrieval produces those memories when similar states are encountered, and the previously high-uncertainty state becomes known. The curiosity priority for that state drops as visitedCount rises and uncertainty falls.

This feedback loop is how the system learns. Curiosity drives exploration of the unknown. Experience in unknown states produces high-value memories. Consolidation of those memories reduces future uncertainty. Reduced uncertainty decreases curiosity for those states and frees curiosity resources for the next unknown frontier.
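The loop can be simulated in a few lines to show curiosity for a state falling as it is visited and consolidated. The scoring function reuses the article's formula; the update rule (each consolidated visit multiplies the three signals by a fixed retention factor of 0.6) is an assumption made only to keep the sketch concrete.

```typescript
interface StateEstimate {
  infoGain: number;
  novelty: number;
  uncertainty: number;
  visitedCount: number;
}

// Curiosity formula from the article.
function curiosity(s: StateEstimate): number {
  return (
    s.infoGain * 0.4 +
    s.novelty * 0.25 +
    s.uncertainty * 0.25 +
    0.1 / (1 + s.visitedCount)
  );
}

// Assumed update rule: one explored-and-consolidated visit shrinks each signal
// by a fixed retention factor and increments the visit count.
function afterConsolidatedVisit(s: StateEstimate, retention = 0.6): StateEstimate {
  return {
    infoGain: s.infoGain * retention,
    novelty: s.novelty * retention,
    uncertainty: s.uncertainty * retention,
    visitedCount: s.visitedCount + 1,
  };
}

// Returns the curiosity score before any visit and after each successive visit.
function curiosityTrajectory(initial: StateEstimate, visits: number): number[] {
  const scores = [curiosity(initial)];
  let state = initial;
  for (let i = 0; i < visits; i++) {
    state = afterConsolidatedVisit(state);
    scores.push(curiosity(state));
  }
  return scores;
}
```

For any starting state with positive signals, the trajectory decreases with each visit, which is the mechanism by which a once-unknown state stops competing for curiosity resources.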
