May 24, 2026·3 min read·Kafka and the Registry

Why I Chose Kafka for the Cognitive Substrate

The Cognitive Substrate needs to ingest a constant stream of events from many different sources - telemetry, tickets, Slack, logs. Here is why Kafka is the right foundation for that, and what I quickly learned it was not enough on its own.

kafka architecture cognitive-substrate event-streaming infrastructure

1 / 7

Next →The Hidden Problem with Raw Kafka Messages

When building the memory layer for the Cognitive Substrate, I needed something that could reliably handle a constant stream of events from many different sources - telemetry, logs, tickets, Slack messages, and more - without any of those sources needing to coordinate directly with each other.

I chose Kafka for three main reasons.

Why Kafka

High-throughput event streaming - Kafka is built for this. It handles thousands of events per second comfortably without special tuning, and scales well beyond that when needed.

Producer-consumer decoupling - the telemetry system publishes events without needing to know which workers will process them or when. The ingestion worker processes events without needing to know about the telemetry system. They communicate only through the Kafka topic. This decoupling makes the architecture easy to extend: adding a new event source means writing a producer, not changing any existing consumers.

Durability and replay - once data is written to Kafka, it is retained for a configurable period and can be replayed from any offset. This is valuable for the Cognitive Substrate specifically: when I add a new processing worker, I can replay historical events to populate its state rather than starting from scratch.

Kafka as the Central Nervous System

Kafka acts as the central event bus for the entire Cognitive Substrate. Every experience - whether it originated as a telemetry spike, a support ticket, or a Slack thread - first lands in a Kafka topic. Workers consume from those topics at their own pace, process the events, and emit outputs to other topics.

This design gives resilience (workers can crash and restart without losing events), scalability (add workers to increase throughput), and auditability (the full event history is available for replay and debugging).

Here is the CognitiveProducer.publish method. Every message is automatically mirrored to the immutable audit topic - a pattern that would be impossible to enforce consistently with ad-hoc producers:

// packages/kafka-bus/src/producer.ts

export class CognitiveProducer {
  async publish<T>(topic: TopicName, payload: T, options: PublishOptions = {}): Promise<void> {
    const headers = options.traceContext ? injectTraceContext(options.traceContext) : {}

    const messages: Array<{ topic: string; messages: Message[] }> = [
      {
        topic,
        messages: [
          {
            key: options.key ? Buffer.from(options.key) : null,
            value: Buffer.from(JSON.stringify(payload)),
            headers,
          },
        ],
      },
    ]

    // Every message is automatically mirrored to the immutable audit topic,
    // except for audit messages themselves (which would cause infinite recursion).
    if (this.enableAuditMirror && topic !== Topics.AUDIT_EVENTS) {
      const auditPayload = { originalTopic: topic, payload, timestamp: new Date().toISOString() }
      messages.push({
        topic: Topics.AUDIT_EVENTS,
        messages: [
          {
            key: options.key ? Buffer.from(options.key) : null,
            value: Buffer.from(JSON.stringify(auditPayload)),
            headers,
          },
        ],
      })
    }

    await this.producer.sendBatch({ topicMessages: messages, compression: CompressionTypes.GZIP })
  }
}

The audit mirror is opt-in per producer instance but enabled by default. Any message published to any Cognitive Substrate topic creates a corresponding audit record with the original topic name, payload, and timestamp - providing an immutable event history without any per-call boilerplate.

The Problem I Did Not See Coming

Using Kafka this way works well until you have more than one producer writing events. Then you discover the problem: without a formal contract about what the events look like, one innocent change to a message structure can silently break every downstream consumer.

That is the problem the Schema Registry solves. The next article covers how I hit it and what it cost.

1 / 7

Next →The Hidden Problem with Raw Kafka Messages

Next up from memory

Ranked from series and tags, warmed by what the substrate is keeping salient across readers.

May 25, 2026Kafka and the RegistryHow Schema Evolution Actually WorksThe real power of a Schema Registry shows up when you need to change your message formats over time. Backward, forward, and full compatibility modes - and how to evolve schemas safely without coordinating simultaneous deployments.May 25, 2026Kafka and the RegistryWhat Is a Schema Registry and Why You Need OneA Schema Registry is a central repository for the structure of every message type flowing through Kafka. It enforces contracts, enables safe schema evolution, and turns your event stream into living documentation.Jul 11, 2026Kafka and the RegistryThe Registry in Practice: My ImplementationHow I actually use the Schema Registry in the Cognitive Substrate - Confluent Registry with Avro schemas, CI-enforced compatibility checks, schema-ID-based consumption, and the specific practices that made the pipeline reliable.