Steve HutchinsonBig Pines
·3 min read·Kafka and the Registry

The Hidden Problem with Raw Kafka Messages

Using Kafka without a schema registry sounds fine in theory - just send JSON. In practice, it becomes a debugging nightmare. How one renamed field broke the ingestion pipeline and why I stopped trusting raw messages.

Using Kafka without a schema registry sounds fine in theory - just send JSON messages, right?

In practice, it quickly becomes a nightmare.

The Problem with Raw Messages

Without any contract between producers and consumers, one small change to a message format can break everything downstream. A field gets renamed. A type changes from string to number. Someone adds a new required field without updating every consumer. Suddenly your ingestion pipeline starts failing - or worse, silently producing wrong results.

What I Experienced

I learned this the hard way. Different parts of the system were producing ExperienceEvents with slightly different structures. This crept in gradually as the system evolved and independent changes that seemed locally reasonable accumulated.

Debugging became painful because I could not trust what shape the data would be in at any given point. Every parsing failure required checking both the consumer code and the producer code, and the answer was often "both are correct for different versions of the schema."

The worst cases were not the outright failures - those at least produced errors I could see. The worst cases were the silent corruptions: events that parsed without error but produced subtly wrong memories because a field had changed its meaning without the consumer being updated. These took days to diagnose.

Here is what the topic registry looks like in a well-governed system. All topic names are declared as typed constants in one place - no raw string literals anywhere in the codebase:

// packages/kafka-bus/src/topics.ts

export const Topics = {
  // Cognitive interaction pipeline (standard Kafka)
  EXPERIENCE_RAW: 'experience.raw',
  EXPERIENCE_ENRICHED: 'experience.enriched',
  MEMORY_INDEXED: 'memory.indexed',
  AGENT_REASONING_REQUEST: 'agent.reasoning.request',
  AGENT_REASONING_RESPONSE: 'agent.reasoning.response',
  CONSOLIDATION_REQUEST: 'consolidation.request',
  MEMORY_SEMANTIC_UPDATED: 'memory.semantic.updated',
  POLICY_EVALUATION: 'policy.evaluation',
  AUDIT_EVENTS: 'audit.events',

  // Telemetry tier: Diskless topics (KIP-1150)
  // High-volume append-only; broker latency of 500ms-5s is acceptable.
  TELEMETRY_METRICS_RAW: 'telemetry.metrics.raw',
  TELEMETRY_LOGS_RAW: 'telemetry.logs.raw',
  TELEMETRY_TRACES_RAW: 'telemetry.traces.raw',

  // Cognition tier: Diskless topics (KIP-1150)
  // Processed operational intelligence; minutes-level latency acceptable.
  COGNITION_PRIMITIVES: 'cognition.primitives',
  COGNITION_PATTERNS: 'cognition.patterns',
  COGNITION_ANOMALIES: 'cognition.anomalies',
  // ...
} as const

export type TopicName = (typeof Topics)[keyof typeof Topics]

This organization - a typed constant registry imported by every producer and consumer - is what makes the schema registry actually enforceable. Without it, you cannot even reliably enumerate which topics exist, let alone validate what messages each one accepts.

The Forensics Problem

Tracking down when and where a breaking change was introduced in a raw message stream is forensic work. You have to compare message samples across time, correlate them with deployment timestamps, manually trace through producer code to find where the change happened, and then figure out which memories in the store were affected.

I wasted days on problems that a formal schema contract would have caught at publish time - before any bad data ever reached the pipeline.

Raw Kafka Messages Give You Speed, Not Safety

Decoupling producers from consumers is one of Kafka's greatest strengths. But that same decoupling is exactly what makes raw uncontracted messages dangerous: each side evolves independently, and there is nothing to catch when they drift apart.

That is what the Schema Registry exists to fix.

Related Articles

This site collects anonymous usage data to understand how people read and navigate the blog. Accepting enables persistent reader preferences across visits.