Composes the existing LexicalRetriever (P2-2) with a new VectorRetriever
wrapper around LanceVectorStore (P3-3) into a single Retriever that
dispatches by SearchMode. For SearchMode::Hybrid, fuses lexical and
vector candidates via Reciprocal Rank Fusion and populates the full
RetrievalDetail per SearchHit so kb search --explain can attribute
scores back to each side.
Public surface (kb-search crate):
- pub struct VectorRetriever — Arc<dyn VectorStore + Send + Sync>,
Arc<dyn Embedder>, Arc<SqliteStore>, IndexVersion at construction.
- pub struct HybridRetriever { lexical, vector, fusion, k }.
- pub enum FusionPolicy { Rrf { k_rrf: u32 } }.
VectorRetriever:
- Embeds query.text as EmbeddingKind::Query before delegating to
VectorStore::search(query_vec, query.k * 2, &query.filters). Over-
fetches by ×2 for filter losses; LanceVectorStore applies the
filters internally so they propagate naturally.
- Hydrates each VectorHit into a full SearchHit by joining on
chunk_id in a single IN-clause batch (no N+1): doc_path,
section_label, chunker_version, source_spans for citation, plus
embedding_model from embedder.model_id().
- Snippet trimmed to config.search.snippet_chars (vector mode lacks
FTS5 highlighting; chunk text prefix is the next-best signal).
- Citation built from the chunk's first source span via the shared
citation_helper module — extracted from lexical.rs so both
retrievers compute citations identically (Byte/empty fallback to
Line{1,1} preserved with tracing::warn).
- RetrievalDetail.method = Vector for standalone calls; both
fusion_score and vector_score set to the LanceVectorStore-shifted
cosine score; lexical_* None.
HybridRetriever:
- Lexical / Vector modes delegate 1:1 — no rebuild of RetrievalDetail.
- Hybrid mode runs both retrievers with k * 2 fanout, fuses with
RRF (score(c) = Σ 1/(k_rrf + rank_m(c))), sorts fused-score DESC
with deterministic tiebreaker (lex_rank ASC then chunk_id ASC),
takes top query.k. Fusion math runs in f64 throughout; cast to
f32 only at the SearchHit boundary where bounded magnitude (≤
~0.033 at k_rrf=60) makes f32 precision sufficient for ranking.
- Per-hit lexical preferred for snippet/citation/heading_path/
chunker_version/embedding_model when the chunk appears in both
retrievers — FTS5 highlighting is more user-relevant than vector's
truncated text. Vector-only chunks fall through to vector hit data.
- index_version returns format!("hybrid:{}+{}", lex_iv, vec_iv) at
construction; mismatched lex/vec versions trigger a tracing::warn
so users notice stale indexes (spec line 143).
kb-search additions:
- citation_helper.rs — pub(crate) citation_from_first_span shared
between lexical and vector retrievers. Extracted from lexical.rs;
no behavior drift.
Tests (38 default + 3 ignored):
- 12 unit tests in hybrid.rs covering RRF math (1/61 + 1/62 within
f32 epsilon × 10 tolerance), lexical/vector mode delegation, hybrid
preserves single-side hits with the missing side's RetrievalDetail
None, deterministic tiebreaker on identical fused scores, composite
index_version, mismatched-version warn at construction.
- 2 unit tests in vector.rs covering the snippet-prefix and citation
fallback paths.
- 11 unit tests in lexical.rs (unchanged from P2-2).
- 13 lexical integration tests (unchanged).
- 3 #[ignore] AVX-gated hybrid integration tests: disjoint-corpus
recall (lex returns A,B; vec returns C,D; hybrid returns all 4),
determinism over two queries, snapshot stability against
tests/fixtures/search/hybrid/run-1.json. Snapshot fixture was
regenerated against this branch on an AVX-enabled VM and contains
4 real chunks (c1/c2 lex+vec, c3/c4 vec-only).
- KB_UPDATE_SNAPSHOTS=1 path now panics after writing instead of
silently passing — matches the P3-2/P3-3 fail-loud-instead-of-
silent-pass philosophy.
Allowed deps respected (kb-core, kb-config, kb-store-sqlite,
kb-store-vector, kb-embed, tracing, thiserror) plus pre-existing
kb-search deps from P2-2 (rusqlite, globset, serde_json, anyhow).
kb-embed-local does NOT appear — VectorRetriever takes Arc<dyn Embedder>
trait object; the concrete adapter is runtime-injected by kb-app.
Out of scope: reranker (P+), score calibration across modes (RRF is
rank-comparable so absolute calibration is P+), multimodal retrieval
(P6+).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>