April 25, 2026 · 4 min read · Stage 33 · Cognitive Substrate

Telemetry Ingestion Worker

The telemetry ingestion worker: how raw infrastructure metrics are persisted to ClickHouse and translated into operational primitive events, with intentional discard semantics and dual Kafka output streams.

Tags: attention, clickhouse, kafka, opensearch, reinforcement, telemetry

← Previous: ClickHouse Telemetry Layer

33 / 48

Next: Pattern Detection Worker →

From metrics to cognitive events

Telemetry ingestion pipeline: raw metrics are persisted, grouped by inferred system, mapped through a system mapping to primitive events, then written to ClickHouse and published on two Kafka streams.

The telemetry ingestion worker is the first executable bridge between observability data and the cognitive architecture. Raw metric messages arrive on telemetry.metrics.raw. Each message names a service, service type, metric name, value, timestamp, environment, and optional labels, baseline, and previous value.
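As a rough illustration of that message shape, the sketch below types the payload and validates a decoded message. The camelCase field names, the ISO 8601 timestamp, and the parseRawMetric helper are assumptions made for the example; the text lists the fields but not their exact spelling or encoding.

```typescript
// Hypothetical shape of one message on telemetry.metrics.raw.
interface RawMetricMessage {
  serviceId: string;
  serviceType: string;
  metricName: string;
  value: number;
  timestamp: string; // assumed to be ISO 8601
  environment: string;
  labels?: Record<string, string>;
  baseline?: number;
  previousValue?: number;
}

const REQUIRED_STRINGS = ["serviceId", "serviceType", "metricName", "timestamp", "environment"];
const OPTIONAL_NUMBERS = ["baseline", "previousValue"];

function isFiniteNumber(x: unknown): x is number {
  return typeof x === "number" && isFinite(x);
}

// Returns the typed message, or null when a field is missing or malformed.
function parseRawMetric(payload: unknown): RawMetricMessage | null {
  if (typeof payload !== "object" || payload === null) return null;
  const p = payload as Record<string, unknown>;
  for (const key of REQUIRED_STRINGS) {
    if (typeof p[key] !== "string") return null;
  }
  if (!isFiniteNumber(p["value"])) return null;
  for (const key of OPTIONAL_NUMBERS) {
    if (p[key] !== undefined && !isFiniteNumber(p[key])) return null;
  }
  // Only checks that labels is a plain object; label values are not inspected.
  const labels = p["labels"];
  if (labels !== undefined) {
    if (typeof labels !== "object" || labels === null || Array.isArray(labels)) return null;
  }
  return p as unknown as RawMetricMessage;
}
```

Rejecting malformed payloads before the raw write is a choice of this sketch, not something the worker is documented to do.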

The worker performs two acts on every batch. It first preserves the raw signal in ClickHouse, then translates the signal into the operational primitive vocabulary introduced in Stage 30. The first act protects replay and auditability. The second act makes the signal useful to pattern detection.

Batch processing path

The worker processes each batch in four stages.

First, every input message is written to metrics_raw. The row preserves service_id, service_type, metric_name, numeric value, labels, timestamp, and environment. This write happens before mapping so discarded or unmapped signals remain available for mapping analysis.

Second, messages are grouped by inferred system identifier. Kafka, OpenSearch, PostgreSQL, and ClickHouse services resolve to built-in Aiven mappings. Additional mappings can be supplied through extraMappings, which lets a deployment add new system types without rebuilding the worker package.
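The grouping step might look something like the sketch below. The SystemMapping shape shown here, the system identifiers, and the service type strings are all assumptions for illustration; only the extraMappings name and the four built-in systems come from the text.

```typescript
// Hypothetical mapping entry: which service_type values belong to which system.
interface SystemMapping {
  systemId: string;
  serviceTypes: string[];
}

interface MetricMessage {
  serviceType: string;
  metricName: string;
  value: number;
}

// Illustrative stand-ins for the built-in mappings; identifiers are assumed.
const BUILT_IN_MAPPINGS: SystemMapping[] = [
  { systemId: "aiven-kafka", serviceTypes: ["kafka"] },
  { systemId: "aiven-opensearch", serviceTypes: ["opensearch"] },
  { systemId: "aiven-postgresql", serviceTypes: ["pg", "postgresql"] },
  { systemId: "aiven-clickhouse", serviceTypes: ["clickhouse"] },
];

interface GroupedBatch {
  groups: Map<string, MetricMessage[]>;
  unresolved: MetricMessage[];
}

function groupBySystem(
  messages: MetricMessage[],
  extraMappings: SystemMapping[] = [],
): GroupedBatch {
  // Later entries win, so in this sketch an extra mapping can also
  // override a built-in one for the same service type.
  const systemByType = new Map<string, string>();
  for (const mapping of [...BUILT_IN_MAPPINGS, ...extraMappings]) {
    for (const serviceType of mapping.serviceTypes) {
      systemByType.set(serviceType, mapping.systemId);
    }
  }

  const groups = new Map<string, MetricMessage[]>();
  const unresolved: MetricMessage[] = [];
  for (const message of messages) {
    const systemId = systemByType.get(message.serviceType);
    if (systemId === undefined) {
      unresolved.push(message); // no system known for this service type
      continue;
    }
    const group = groups.get(systemId);
    if (group) group.push(message);
    else groups.set(systemId, [message]);
  }
  return { groups, unresolved };
}
```

Passing mappings as data rather than code is what would let a deployment add a system type without rebuilding the package.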

Third, each group is passed through normaliseSignals(). The normaliser resolves metric names to primitive identifiers, computes intensity, derives trend from the previous value, infers scope from labels, and assigns confidence based on whether resolution was exact or wildcard-based.
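To make the four derived fields concrete, here is a minimal sketch of how one resolved signal could be turned into them. It is not the real normaliseSignals(): the intensity formula, the 5% trend tolerance, the "node" and "host" label keys, and the confidence weights are all assumptions chosen for the example.

```typescript
type Trend = "rising" | "falling" | "stable" | "unknown";
type MatchKind = "exact" | "wildcard";

interface Signal {
  value: number;
  baseline?: number;
  previousValue?: number;
  labels?: Record<string, string>;
}

interface DerivedFields {
  intensity: number;
  trend: Trend;
  scope: string;
  confidence: number;
}

// Assumed formula: relative deviation from baseline, clamped to [0, 1].
// Without a usable baseline the sketch reports zero intensity.
function computeIntensity(value: number, baseline?: number): number {
  if (baseline === undefined || baseline === 0) return 0;
  return Math.min(1, Math.abs(value - baseline) / Math.abs(baseline));
}

// Assumed rule: changes within the tolerance band count as stable.
function deriveTrend(value: number, previousValue?: number, tolerance = 0.05): Trend {
  if (previousValue === undefined) return "unknown";
  const delta = value - previousValue;
  const band = Math.abs(previousValue) * tolerance;
  if (delta > band) return "rising";
  if (delta < -band) return "falling";
  return "stable";
}

// Assumed label keys: a "node" or "host" label narrows scope to one node.
function inferScope(labels: Record<string, string> = {}): string {
  return "node" in labels || "host" in labels ? "node" : "service";
}

// Assumed weights: exact resolution is trusted more than wildcard resolution.
function assignConfidence(match: MatchKind): number {
  return match === "exact" ? 1.0 : 0.7;
}

function normaliseSignalSketch(signal: Signal, match: MatchKind): DerivedFields {
  return {
    intensity: computeIntensity(signal.value, signal.baseline),
    trend: deriveTrend(signal.value, signal.previousValue),
    scope: inferScope(signal.labels),
    confidence: assignConfidence(match),
  };
}
```

Keeping each derivation in its own small function makes the individual rules easy to test and to replace once the real formulas are known.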

Fourth, the worker writes cognitive_events to ClickHouse and publishes two Kafka streams: telemetry.events.normalized for resolved raw telemetry and cognition.primitives for downstream cognitive consumers.
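The four stages above can be sketched as a single function with its dependencies injected. The Sinks interface, the resolver and normaliser signatures, and the synchronous calls are assumptions made to keep the example small; a real worker would use asynchronous ClickHouse and Kafka clients. The table and topic names are the ones given in the text.

```typescript
interface RawMetric {
  serviceType: string;
  metricName: string;
  value: number;
}

interface NormalizedEvent {
  systemId: string;
  metricName: string;
  value: number;
  primitiveId: string;
}

// Hypothetical stand-in for the ClickHouse and Kafka clients.
interface Sinks {
  insertRows(table: string, rows: object[]): void;
  publish(topic: string, events: object[]): void;
}

type SystemResolver = (metric: RawMetric) => string | undefined;
type Normaliser = (systemId: string, group: RawMetric[]) => NormalizedEvent[];

function processBatch(
  batch: RawMetric[],
  resolveSystem: SystemResolver,
  normalise: Normaliser,
  sinks: Sinks,
): void {
  // Stage 1: persist every raw message before any mapping happens.
  sinks.insertRows("metrics_raw", batch);

  // Stage 2: group by inferred system identifier.
  const groups = new Map<string, RawMetric[]>();
  for (const metric of batch) {
    const systemId = resolveSystem(metric);
    if (systemId === undefined) continue; // stays in metrics_raw only
    const group = groups.get(systemId);
    if (group) group.push(metric);
    else groups.set(systemId, [metric]);
  }

  // Stage 3: normalise each group into resolved events.
  const events: NormalizedEvent[] = [];
  groups.forEach((group, systemId) => {
    events.push(...normalise(systemId, group));
  });
  if (events.length === 0) return; // assumption: skip empty writes

  // Stage 4: store the events and publish both streams.
  sinks.insertRows("cognitive_events", events);
  sinks.publish("telemetry.events.normalized", events);
  sinks.publish(
    "cognition.primitives",
    events.map((e) => ({ primitiveId: e.primitiveId, systemId: e.systemId })),
  );
}
```

Injecting the sinks means the ordering guarantee (raw write first) can be checked with an in-memory recorder instead of live infrastructure.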

Intentional discard semantics

Signals with no mapping do not become primitive events. This is an intentional boundary, not an error. Pattern detection relies on a stable vocabulary, and unmapped metrics would introduce noise if they were forced into weak categories.

The raw row is still stored. That creates a feedback path for mapping maintenance: unmapped metric names can be analyzed later, added to a SystemMapping, and replayed through the updated pipeline when needed.
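One way to drive that feedback path is to rank unmapped metric names by how often they appear. The sketch below assumes the rows have already been read from metrics_raw and that an isMapped predicate is available; both the RawRow shape and the helper name are hypothetical.

```typescript
// Hypothetical minimal view of a metrics_raw row.
interface RawRow {
  metricName: string;
}

// Count occurrences of each metric name that has no mapping,
// most frequent first, with ties broken alphabetically.
function unmappedMetricCounts(
  rows: RawRow[],
  isMapped: (metricName: string) => boolean,
): Array<[string, number]> {
  const counts = new Map<string, number>();
  for (const row of rows) {
    if (isMapped(row.metricName)) continue;
    counts.set(row.metricName, (counts.get(row.metricName) ?? 0) + 1);
  }
  return Array.from(counts.entries()).sort(
    (a, b) => b[1] - a[1] || a[0].localeCompare(b[0]),
  );
}
```

The most frequent unmapped names are natural first candidates for a new SystemMapping entry.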

Why the worker publishes two streams

telemetry.events.normalized retains the relationship between a raw metric and the primitives it resolved to. It is useful for dashboards, mapping validation, and debugging.

cognition.primitives is smaller and more abstract. It carries only the primitive event fields required by the pattern worker: primitive identifier, intensity, trend, scope, confidence, source system, correlated signal identifiers, and timestamp.

Separating the streams keeps the pattern detector independent from raw metric vocabulary. It also lets operational dashboards show the translation step without coupling the detector to presentation concerns.
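The relationship between the two payloads can be sketched as a projection. The field names and types below are assumptions; the text names the eight primitive fields but not their spelling, and the raw-side fields on NormalizedEvent are illustrative.

```typescript
// Hypothetical payload for cognition.primitives: only the fields
// the pattern worker is described as needing.
interface PrimitiveEvent {
  primitiveId: string;
  intensity: number;
  trend: string;
  scope: string;
  confidence: number;
  sourceSystem: string;
  correlatedSignalIds: string[];
  timestamp: string;
}

// Hypothetical payload for telemetry.events.normalized: the primitive
// fields plus the raw metric they were resolved from.
interface NormalizedEvent extends PrimitiveEvent {
  metricName: string;
  rawValue: number;
  labels: Record<string, string>;
}

// Drop the raw metric vocabulary so the detector never sees it.
function toPrimitiveEvent(event: NormalizedEvent): PrimitiveEvent {
  return {
    primitiveId: event.primitiveId,
    intensity: event.intensity,
    trend: event.trend,
    scope: event.scope,
    confidence: event.confidence,
    sourceSystem: event.sourceSystem,
    correlatedSignalIds: [...event.correlatedSignalIds],
    timestamp: event.timestamp,
  };
}
```

Copying fields explicitly, rather than spreading the whole event, means a new raw-side field cannot leak into the primitive stream by accident.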

Mapping extensibility

The worker uses built-in mappings for Aiven Kafka, OpenSearch, PostgreSQL, and ClickHouse. The same interface accepts deployment-specific mappings. Exact metric names take priority over wildcard patterns, allowing precise overrides without losing broad coverage.
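The exact-before-wildcard priority can be illustrated with a small resolver. The wildcard syntax is not specified in the text, so this sketch assumes a single trailing `*` meaning "prefix match" and, as a further assumption, prefers the longest matching prefix. The rule shape, metric names, and primitive identifiers are illustrative.

```typescript
// Hypothetical rule: a metric name or trailing-"*" pattern and its primitive.
interface MetricRule {
  pattern: string;
  primitiveId: string;
}

interface Resolved {
  primitiveId: string;
  match: "exact" | "wildcard";
}

function resolveMetric(metricName: string, rules: MetricRule[]): Resolved | null {
  // Pass 1: exact names always win, regardless of rule order.
  for (const rule of rules) {
    if (!rule.pattern.includes("*") && rule.pattern === metricName) {
      return { primitiveId: rule.primitiveId, match: "exact" };
    }
  }

  // Pass 2: trailing-"*" patterns; the longest matching prefix wins.
  let best: MetricRule | null = null;
  for (const rule of rules) {
    if (!rule.pattern.endsWith("*")) continue;
    const prefix = rule.pattern.slice(0, -1);
    if (!metricName.startsWith(prefix)) continue;
    if (best === null || rule.pattern.length > best.pattern.length) {
      best = rule;
    }
  }
  if (best === null) return null; // unmapped: no primitive event is produced
  return { primitiveId: best.primitiveId, match: "wildcard" };
}
```

Returning the match kind alongside the primitive gives the normaliser what it needs to assign a lower confidence to wildcard resolutions.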

This extension point is the operational form of the transfer mechanism. Onboarding a new system requires a mapping, not a new detector. Once the mapping produces primitive events, existing pattern knowledge can operate immediately.

The value of intentional discard

The decision to silently exclude unmapped metrics from the primitive streams, rather than forcing them into weak categories, has an important consequence for trust. The pattern worker can rely on the guarantee that every primitive event in cognition.primitives was resolved through a deliberate mapping decision, not through a best-guess fallback. Pattern confidence scores mean something only when the underlying signals are reliable.

The preserved raw row provides the complementary guarantee: no signal is lost permanently. When the mapping layer is later extended to cover a metric that was previously discarded, the new mapping can be validated by examining the historical raw rows before it is deployed to production. This is the same replay capability that the ClickHouse design enables at the incident level, applied here at the mapping validation level.
