Event Streams as Memory Formation

There is a way to read Kafka that most engineers do not use.

The standard reading is operational: Kafka is a durable, distributed message queue. Producers write. Consumers read. Brokers hold data until the retention period expires. Throughput is the goal.

The alternative reading is cognitive: a Kafka topic is a log of events ordered in time, each event a snapshot of some system's state at a particular moment. The log never changes. The past is immutable. The present is the position of the consumer's offset.

This second reading is not metaphor. It is a precise structural description that turns out to be useful for building systems that learn.

The Log as Episode

In cognitive neuroscience, episodic memory is memory of specific events in time and place. You do not remember "dogs in general." You remember a specific dog, on a specific afternoon, at a specific location.

A Kafka topic has this structure. Each message is an event - timestamped, keyed, placed at a specific offset. The topic is the episode: a time-ordered record of everything that happened in a particular domain.

The consumer, by choosing where to start reading, is choosing which period of the past to replay. fromBeginning: true is total recall. A specific offset is a surgical re-entry into history.

Producing Events

The producer in this system sends typed telemetry events - page views, scroll depths, article completions - each with a session ID and timestamp:

File not found: examples/kafka/producer.ts

The sessionId is the key. Events from the same reading session are co-located in the same partition, which means they can be consumed in order without cross-partition coordination.

Consuming and Processing

The consumer reads the same topic from the other side:

File not found: examples/kafka/consumer.ts

The switch on event.type is where the discriminated union earns its keep. The compiler ensures all cases are handled. Adding a new event type without handling it is a type error, not a runtime surprise.

Retention as Forgetting

Kafka's retention period is a form of forgetting. Events older than the retention window are deleted. This is not a limitation - it is a design parameter.

A system that retains everything forever is not a memory system. It is an archive. Memory is selective. It keeps what is relevant and lets the rest decay.

The appropriate retention for blog telemetry is probably 90 days. Long enough to observe patterns across multiple reading sessions, short enough that the storage stays bounded. The consolidation layer - ClickHouse for columnar analytics, OpenSearch for semantic indexing - extracts the signal before the raw events expire.

The Consumer Group as Parallel Processing

Kafka consumer groups allow multiple consumers to read from the same topic simultaneously, each handling a different subset of partitions. This is not just a throughput optimization - it is a model of parallel cognition.

Different consumers can process the same event stream for different purposes: one materializes to ClickHouse, one indexes to OpenSearch, one feeds a real-time attention model. The same episode, observed through multiple lenses simultaneously.

The same Kafka topic feeds three consumers simultaneously: columnar storage, semantic indexing, and real-time attention. Each observes the full event stream independently.

💡On anthropomorphism

Using words like "memory" and "forgetting" for infrastructure is not a claim about consciousness. It is a claim about structure: Kafka logs have the same temporal and selectivity properties as episodic memory systems. The analogy is computationally precise, not poetic.