Steve Hutchinson · Big Pines
3 min read · Kafka and the Registry

What Is a Schema Registry and Why You Need One

A Schema Registry is a central repository for the structure of every message type flowing through Kafka. It enforces contracts, enables safe schema evolution, and turns your event stream into living documentation.

Instead of sending raw JSON where the shape is implicit and undocumented, producers register a formal schema that defines exactly what fields a message should contain and what types they are. Every published message then includes a reference to its schema version rather than repeating the full structure.
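To make the register-then-reference flow concrete, here is a minimal sketch using an in-memory stand-in for the registry. Everything in it (`InMemoryRegistry`, `FieldSpec`, the `experience-event` subject name) is hypothetical and exists only for illustration; a real registry is a separate service, and unlike this sketch it would typically deduplicate identical schemas.

```typescript
// Hypothetical in-memory stand-in for a schema registry (illustration only).
interface FieldSpec {
  readonly name: string
  readonly type: 'string' | 'number' | 'boolean'
  readonly optional?: boolean
}

interface RegisteredSchema {
  readonly id: number // globally unique schema ID
  readonly subject: string // logical name of the message type
  readonly version: number // 1-based version within the subject
  readonly fields: ReadonlyArray<FieldSpec>
}

class InMemoryRegistry {
  private readonly byId = new Map<number, RegisteredSchema>()
  private readonly bySubject = new Map<string, RegisteredSchema[]>()
  private nextId = 1

  // Stores a new version under the subject and returns its ID and version.
  register(subject: string, fields: ReadonlyArray<FieldSpec>): RegisteredSchema {
    const versions = this.bySubject.get(subject) ?? []
    const schema: RegisteredSchema = {
      id: this.nextId++,
      subject,
      version: versions.length + 1,
      fields,
    }
    versions.push(schema)
    this.bySubject.set(subject, versions)
    this.byId.set(schema.id, schema)
    return schema
  }

  getById(id: number): RegisteredSchema | undefined {
    return this.byId.get(id)
  }

  // Full version history for one message type.
  versions(subject: string): ReadonlyArray<RegisteredSchema> {
    return this.bySubject.get(subject) ?? []
  }
}

const registry = new InMemoryRegistry()
const v1 = registry.register('experience-event', [
  { name: 'eventId', type: 'string' },
  { name: 'timestamp', type: 'string' },
  { name: 'importanceScore', type: 'number' },
])

// The message carries only the schema ID, not the field definitions.
const message = {
  schemaId: v1.id,
  payload: { eventId: 'evt-1', timestamp: '2024-01-01T00:00:00Z', importanceScore: 0.7 },
}
```

A consumer holding `message` can call `registry.getById(message.schemaId)` to recover the exact field list the producer wrote against.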

What the Registry Provides

Contract enforcement - consumers retrieve the schema and know with certainty what shape the incoming data will have. There is no guessing, no defensive parsing, no "check whether this field exists before reading it."

Compatibility guarantees - the registry validates proposed schema changes against configured compatibility rules before allowing them to be registered. A change that would break existing consumers is rejected at registration time, before it reaches production.

Living documentation - the registry is always the authoritative current description of your event contracts. It maintains full version history, so you can see exactly how each message type has evolved over time.
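The compatibility check described above can be sketched as a function that compares a proposed schema with the previous one. This is a deliberately simplified rule set of my own, not the algorithm any particular registry uses: it asks whether existing consumers, still holding the previous schema, could read messages written with the proposed one (roughly the direction registries call forward compatibility). `FieldSpec` and `findBreakingChanges` are hypothetical names.

```typescript
// Simplified, hypothetical field model; real schema formats are richer.
interface FieldSpec {
  readonly name: string
  readonly type: 'string' | 'number' | 'boolean'
  readonly optional?: boolean
}

// Returns a list of reasons why consumers built against `previous` would
// fail on data written with `proposed`. An empty list means "accept".
function findBreakingChanges(
  previous: ReadonlyArray<FieldSpec>,
  proposed: ReadonlyArray<FieldSpec>,
): string[] {
  const next = new Map<string, FieldSpec>()
  for (const field of proposed) next.set(field.name, field)

  const problems: string[] = []
  for (const old of previous) {
    const field = next.get(old.name)
    if (field === undefined) {
      // Dropping a field is only safe if consumers never relied on it.
      if (!old.optional) problems.push(`removes required field "${old.name}"`)
    } else if (field.type !== old.type) {
      problems.push(`changes "${old.name}" from ${old.type} to ${field.type}`)
    } else if (!old.optional && field.optional) {
      problems.push(`makes required field "${old.name}" optional`)
    }
  }
  // Fields that exist only in `proposed` are ignored by old consumers.
  return problems
}

const current: FieldSpec[] = [
  { name: 'eventId', type: 'string' },
  { name: 'importanceScore', type: 'number' },
]
const addsOptionalField: FieldSpec[] = [
  ...current,
  { name: 'traceId', type: 'string', optional: true },
]
const dropsScore: FieldSpec[] = [{ name: 'eventId', type: 'string' }]

const accepted = findBreakingChanges(current, addsOptionalField)
const rejected = findBreakingChanges(current, dropsScore)
```

Here `accepted` is empty, so the change would be registered, while `rejected` contains one reason and the registration would be refused.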

The ExperienceEvent is the canonical example - the contract that every producer of cognitive events must honor:

// packages/core-types/src/experience.ts
//
// Every perception, action, and observed outcome is captured as an
// ExperienceEvent before being routed into the memory pipeline.

export type EventType =
  | 'user_input'
  | 'tool_result'
  | 'system_event'
  | 'agent_action'
  | 'environmental_observation'
  | 'consolidation_output'

export interface EventContext {
  readonly sessionId: string
  readonly userId?: string
  readonly goalId?: string
  readonly policyVersion?: string
  readonly agentId?: string
  readonly traceId?: string
}

export interface ExperienceEvent {
  readonly eventId: string
  readonly timestamp: string
  readonly type: EventType
  readonly context: EventContext
  readonly input: EventInput // { text, embedding, structured? }
  readonly internalState?: InternalState
  readonly action?: EventAction
  readonly result?: EventResult
  readonly evaluation?: EventEvaluation
  readonly importanceScore: number
  readonly tags: ReadonlyArray<string>
}

This TypeScript interface is the source of truth for the ExperienceEvent schema. A Schema Registry formalizes this contract at the Kafka layer - so that any producer changing this shape must go through a compatibility check before the change reaches any consumer.
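As an illustration of what "formalizing at the Kafka layer" might look like, here is a sketch of the `EventContext` part of the interface expressed as an Avro record schema. The article does not say which schema format is used, so Avro is an assumption here, as is the `example.cognitive` namespace. Optional TypeScript properties are mapped to nullable unions with a `null` default, which is the usual Avro idiom for optional fields.

```typescript
// Minimal typing for the subset of Avro used below (illustration only).
interface AvroField {
  readonly name: string
  readonly type: string | ReadonlyArray<string>
  readonly default?: null
}

interface AvroRecord {
  readonly type: 'record'
  readonly name: string
  readonly namespace: string
  readonly fields: ReadonlyArray<AvroField>
}

// EventContext as an Avro record. The namespace is hypothetical.
const eventContextSchema: AvroRecord = {
  type: 'record',
  name: 'EventContext',
  namespace: 'example.cognitive',
  fields: [
    { name: 'sessionId', type: 'string' },
    { name: 'userId', type: ['null', 'string'], default: null },
    { name: 'goalId', type: ['null', 'string'], default: null },
    { name: 'policyVersion', type: ['null', 'string'], default: null },
    { name: 'agentId', type: ['null', 'string'], default: null },
    { name: 'traceId', type: ['null', 'string'], default: null },
  ],
}

// Fields without a default must be present in every message.
const requiredFields = eventContextSchema.fields
  .filter((field) => field.default === undefined)
  .map((field) => field.name)

// The JSON text is what would be submitted to a registry.
const schemaJson = JSON.stringify(eventContextSchema)
```

Only `sessionId` ends up in `requiredFields`, mirroring the single non-optional property on the TypeScript interface.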

How It Integrates

The registry integrates directly with Kafka producer and consumer clients. When a producer publishes a message, the client serializer validates it against the registered schema and attaches the schema ID to the message. Confluent's serializers, for example, prepend the ID to the payload bytes, while some other clients carry it in a record header. If validation fails, the message is rejected before it is published - not discovered hours later in a consumer error log.

On the consumer side, the client reads the schema ID from the message, fetches the schema definition (cached after the first retrieval), and uses it to deserialize the message. This happens transparently at the client library level.
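The producer and consumer steps can be sketched as follows. The framing assumes the Confluent-style layout (one zero "magic" byte, then the schema ID as a 4-byte big-endian integer, then the serialized payload); other registries and clients may carry the ID differently. `frame`, `unframe`, and `CachingSchemaClient` are hypothetical helpers, and the schema fetch is synchronous here only to keep the example short, whereas a real client would make a network call.

```typescript
const MAGIC_BYTE = 0
const HEADER_LENGTH = 5 // 1 magic byte + 4-byte schema ID

// Producer side: prefix the serialized payload with the schema ID.
function frame(schemaId: number, payload: Uint8Array): Uint8Array {
  const out = new Uint8Array(HEADER_LENGTH + payload.length)
  const view = new DataView(out.buffer)
  view.setUint8(0, MAGIC_BYTE)
  view.setUint32(1, schemaId) // DataView writes big-endian by default
  out.set(payload, HEADER_LENGTH)
  return out
}

// Consumer side: split a framed message back into ID and payload.
function unframe(message: Uint8Array): { schemaId: number; payload: Uint8Array } {
  if (message.length < HEADER_LENGTH || message[0] !== MAGIC_BYTE) {
    throw new Error('not a schema-framed message')
  }
  const view = new DataView(message.buffer, message.byteOffset, message.byteLength)
  return { schemaId: view.getUint32(1), payload: message.subarray(HEADER_LENGTH) }
}

// Fetches each schema once and serves later lookups from memory.
class CachingSchemaClient {
  private readonly cache = new Map<number, string>()
  private readonly fetchSchema: (id: number) => string
  fetchCount = 0

  constructor(fetchSchema: (id: number) => string) {
    this.fetchSchema = fetchSchema
  }

  get(id: number): string {
    const cached = this.cache.get(id)
    if (cached !== undefined) return cached
    this.fetchCount++
    const schema = this.fetchSchema(id)
    this.cache.set(id, schema)
    return schema
  }
}

const framedMessage = frame(42, new Uint8Array([1, 2, 3]))
const decoded = unframe(framedMessage)

const schemaClient = new CachingSchemaClient((id) => `schema-for-${id}`)
const firstLookup = schemaClient.get(decoded.schemaId)
const secondLookup = schemaClient.get(decoded.schemaId) // served from cache
```

After both lookups `schemaClient.fetchCount` is still 1, which is the behavior that keeps the registry off the hot path for every consumed message.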

The Impact

This single component dramatically increased the stability of the Cognitive Substrate pipeline. When something breaks now, it is almost never due to unexpected message formats - because the registry prevents those problems from reaching production in the first place.

It also changed how I approach development. Schema changes now go through a review process, because the registry makes the contract explicit and visible. That visibility creates a kind of discipline that was impossible with raw, schemaless messages.
