Vector search

Retrieval based on embedding similarity rather than keyword matching. Finds memories that are semantically related to a query even when they share no exact terms. Implemented in OpenSearch using HNSW indexes for approximate nearest neighbor lookup. The k-NN component of the memory gateway's hybrid retrieval pipeline.

Vector search lets the memory gateway surface contextually relevant memories that share no keywords with the query. A query about 'latency spikes under load' will surface memories about 'response time degradation during peak traffic' because their embedding vectors lie close together in the high-dimensional space in which the embedding model represents semantic relationships.

The core infrastructure is OpenSearch's k-NN plugin, which uses HNSW indexing to keep approximate nearest neighbor search fast at scale. The memory substrate maintains three embedding lanes simultaneously: quality (1024-dimensional Qwen embeddings), efficient (768-dimensional Nomic embeddings), and hybrid (1024-dimensional BGE-M3 embeddings). This allows the memory gateway to select the retrieval profile that best matches the current budget constraints. The quality lane provides the fullest semantic similarity; the efficient lane provides fast retrieval under budget pressure.
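The lane selection and k-NN lookup described above can be sketched as plain OpenSearch request bodies. This is a minimal illustration, not the gateway's actual implementation: the field names (`embedding_quality` and so on), the `select_lane` helper, the cosine space type, the `lucene` engine, and the HNSW parameters `m` and `ef_construction` are all assumptions chosen for the example. Only the lane names and dimensions come from the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lane:
    name: str
    field: str       # hypothetical index field holding this lane's vectors
    dimension: int   # embedding size, taken from the lane descriptions


# Lane names and dimensions follow the text; field names are assumptions.
LANES = {
    "quality": Lane("quality", "embedding_quality", 1024),
    "efficient": Lane("efficient", "embedding_efficient", 768),
    "hybrid": Lane("hybrid", "embedding_hybrid", 1024),
}


def knn_field_mapping(lane, m=16, ef_construction=128):
    """Mapping for one knn_vector field backed by an HNSW graph.

    Space type, engine, and HNSW parameters are assumed values.
    """
    return {
        "type": "knn_vector",
        "dimension": lane.dimension,
        "method": {
            "name": "hnsw",
            "space_type": "cosinesimil",
            "engine": "lucene",
            "parameters": {"m": m, "ef_construction": ef_construction},
        },
    }


def index_body():
    """Index definition with one vector field per embedding lane."""
    return {
        "settings": {"index": {"knn": True}},
        "mappings": {
            "properties": {
                lane.field: knn_field_mapping(lane) for lane in LANES.values()
            }
        },
    }


def select_lane(under_budget_pressure):
    """Hypothetical policy: efficient lane under pressure, else quality."""
    return LANES["efficient"] if under_budget_pressure else LANES["quality"]


def knn_query(lane, query_vector, k=10):
    """Approximate k-NN query body against the chosen lane's field."""
    if len(query_vector) != lane.dimension:
        raise ValueError(
            f"{lane.name} lane expects {lane.dimension} dimensions, "
            f"got {len(query_vector)}"
        )
    return {
        "size": k,
        "query": {"knn": {lane.field: {"vector": list(query_vector), "k": k}}},
    }
```

The query vector must come from the same embedding model as the lane it is searched against, which is why the sketch rejects a vector whose dimension does not match the lane. The resulting dictionaries could be passed as the request body to an OpenSearch client's index-creation and search calls.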
