BM25
A keyword-based relevance ranking algorithm used in the memory gateway's hybrid retrieval pipeline. BM25 scores documents by term frequency and inverse document frequency, normalized for document length. It complements k-NN vector similarity so that lexically distinctive terms remain retrievable even when embedding similarity is low.
BM25 (Best Matching 25) is the keyword retrieval component in the memory gateway's hybrid search. Where k-NN finds memories that are semantically similar (capturing meaning, context, and related concepts even when the exact words differ), BM25 rewards exact term matches and down-weights terms that appear in most memories. The two signals are complementary: a memory that uses the query's exact terminology may score highly on BM25 while being a semantic outlier in embedding space, and vice versa. Combining both with a weighted fusion step (and optionally re-ranking the merged list with a cross-encoder) produces retrieval results that are both semantically relevant and lexically precise. This matters most for domain-specific terminology, identifiers, and proper nouns, which embeddings may not distinguish well.
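The scoring and fusion described above can be sketched roughly as follows. This is a minimal illustration, not the gateway's actual implementation: the function names (tokenize, bm25_scores, fuse), the whitespace tokenizer, the parameter values k1=1.2 and b=0.75, the min-max normalization, and the fusion weight alpha are all assumptions chosen for the example. The k-NN scores are assumed to come from a separate vector search.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization; a real pipeline
    # would likely use a proper analyzer.
    return text.lower().split()


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each document against the query with Okapi BM25.

    k1 controls term-frequency saturation, b controls length
    normalization. Both defaults are common choices, assumed here.
    """
    if not docs:
        return []
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    avgdl = sum(len(toks) for toks in tokenized) / n

    # Document frequency: number of documents containing each term.
    df = Counter()
    for toks in tokenized:
        df.update(set(toks))

    query_terms = set(tokenize(query))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            # Rare terms get a high IDF; ubiquitous terms approach zero.
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturates, and long documents are discounted.
            denom = tf[term] + k1 * (1 - b + b * len(toks) / avgdl)
            score += idf * tf[term] * (k1 + 1) / denom
        scores.append(score)
    return scores


def min_max(values):
    # Rescale to [0, 1] so BM25 and k-NN scores are comparable.
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def fuse(bm25, knn, alpha=0.5):
    """Weighted fusion: alpha weights BM25, (1 - alpha) weights k-NN."""
    return [
        alpha * x + (1 - alpha) * y
        for x, y in zip(min_max(bm25), min_max(knn))
    ]


if __name__ == "__main__":
    memories = [
        "deploy failed with error code ERR_4471 on staging",
        "the release to the test environment did not succeed",
        "lunch menu for the team offsite",
    ]
    # Hypothetical k-NN similarities for the same three memories.
    knn = [0.40, 0.55, 0.10]
    lexical = bm25_scores("ERR_4471 deploy", memories)
    for text, score in zip(memories, fuse(lexical, knn)):
        print(f"{score:.3f}  {text}")
```

In this example the first memory contains the identifier from the query, so BM25 lifts it above the second memory even though the hypothetical k-NN score favors the second. Raising alpha shifts the balance toward exact matches; lowering it favors semantic similarity.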