There is a way to read Kafka that most engineers do not use.
The standard reading is operational: Kafka is a durable, distributed message queue. Producers write. Consumers read. Brokers hold data until the retention period expires. Throughput is the goal.
The alternative reading is cognitive: a Kafka topic is a log of events ordered in time, each event a snapshot of some system's state at a particular moment. The log never changes. The past is immutable. The present is the position of the consumer's offset.
This second reading is not metaphor. It is a precise structural description that turns out to be useful for building systems that learn.
The Log as Episode
In cognitive neuroscience, episodic memory is memory of specific events in time and place. You do not remember "dogs in general." You remember a specific dog, on a specific afternoon, at a specific location.
A Kafka topic has this structure. Each message is an event - timestamped, keyed, placed at a specific offset. The topic is the episode: a time-ordered record of everything that happened in a particular domain.
The consumer, by choosing where to start reading, is choosing which period of the past to replay. fromBeginning: true is total recall. A specific offset is a surgical re-entry into history.
Producing Events
The producer in this system sends typed telemetry events - page views, scroll depths, article completions - each with a session ID and timestamp:
import { Kafka, type Producer } from 'kafkajs'
import type { TelemetryEvent } from '@bigpines/telemetry'
const kafka = new Kafka({
clientId: 'blog-telemetry-producer',
brokers: (process.env['KAFKA_BROKERS'] ?? 'localhost:9092').split(','),
})
let producer: Producer | null = null
async function getProducer(): Promise<Producer> {
if (!producer) {
producer = kafka.producer({
allowAutoTopicCreation: false,
transactionTimeout: 30_000,
})
await producer.connect()
}
return producer
}
export async function publishEvent(event: TelemetryEvent): Promise<void> {
const p = await getProducer()
await p.send({
topic: process.env['KAFKA_TOPIC'] ?? 'blog-events',
messages: [
{
key: event.sessionId,
value: JSON.stringify(event),
headers: {
type: event.type,
version: '1',
timestamp: event.timestamp,
},
},
],
})
}
export async function publishBatch(events: TelemetryEvent[]): Promise<void> {
if (events.length === 0) return
const p = await getProducer()
await p.send({
topic: process.env['KAFKA_TOPIC'] ?? 'blog-events',
messages: events.map((event) => ({
key: event.sessionId,
value: JSON.stringify(event),
headers: { type: event.type, version: '1' },
})),
})
}
export async function disconnectProducer(): Promise<void> {
if (producer) {
await producer.disconnect()
producer = null
}
}
The sessionId is the key. Events from the same reading session are co-located in the same partition, which means they can be consumed in order without cross-partition coordination.
Consuming and Processing
The consumer reads the same topic from the other side:
import { Kafka } from 'kafkajs'
import type { TelemetryEvent } from '@bigpines/telemetry'
const kafka = new Kafka({
clientId: 'blog-telemetry-consumer',
brokers: (process.env['KAFKA_BROKERS'] ?? 'localhost:9092').split(','),
})
const consumer = kafka.consumer({ groupId: 'blog-analytics-group' })
async function processEvent(event: TelemetryEvent): Promise<void> {
switch (event.type) {
case 'page_view':
console.log(`[page_view] ${event.payload.path} — session: ${event.sessionId}`)
break
case 'article_complete':
console.log(`[article_complete] ${event.payload.slug} — ${event.payload.readingTimeMs}ms`)
break
case 'scroll_depth':
console.log(`[scroll_depth] ${event.articleSlug ?? 'unknown'} — ${event.payload.depth}%`)
break
default:
console.log(`[${event.type}]`, JSON.stringify(event.payload))
}
}
export async function startConsumer(): Promise<void> {
await consumer.connect()
await consumer.subscribe({
topic: process.env['KAFKA_TOPIC'] ?? 'blog-events',
fromBeginning: false,
})
await consumer.run({
eachMessage: async ({ message }) => {
if (!message.value) return
try {
const event = JSON.parse(message.value.toString()) as TelemetryEvent
await processEvent(event)
} catch (err) {
console.error('[consumer] Failed to process message:', err)
}
},
})
}
process.on('SIGTERM', async () => {
await consumer.disconnect()
process.exit(0)
})
void startConsumer()
The switch on event.type is where the discriminated union earns its keep. The compiler ensures all cases are handled. Adding a new event type without handling it is a type error, not a runtime surprise.
Retention as Forgetting
Kafka's retention period is a form of forgetting. Events older than the retention window are deleted. This is not a limitation - it is a design parameter.
A system that retains everything forever is not a memory system. It is an archive. Memory is selective. It keeps what is relevant and lets the rest decay.
The appropriate retention for blog telemetry is probably 90 days. Long enough to observe patterns across multiple reading sessions, short enough that the storage stays bounded. The consolidation layer - ClickHouse for columnar analytics, OpenSearch for semantic indexing - extracts the signal before the raw events expire.
The Consumer Group as Parallel Processing
Kafka consumer groups allow multiple consumers to read from the same topic simultaneously, each handling a different subset of partitions. This is not just a throughput optimization - it is a model of parallel cognition.
Different consumers can process the same event stream for different purposes: one materializes to ClickHouse, one indexes to OpenSearch, one feeds a real-time attention model. The same episode, observed through multiple lenses simultaneously.
💡On anthropomorphism
Using words like "memory" and "forgetting" for infrastructure is not a claim about consciousness. It is a claim about structure: Kafka logs have the same temporal and selectivity properties as episodic memory systems. The analogy is computationally precise, not poetic.