kebab/tasks/p3/p3-4-hybrid-fusion.md
altair823 f9714aa5cb docs(rename): kb → kebab — README, tasks/, docs/, design doc, report
Final commit. Bulk-updates every occurrence of the word `kb` in all .md files.

- 19 crate names (`kb-core`, `kb-app`, …) → `kebab-*` (including the
  Rust module path form `kb_*` → `kebab_*`).
- Future components (`kb-tui`, `kb-desktop`, `kb-asr-whisper`, `kb-ocr`,
  `kb-mcp`, `kb-vlm`, `kb-rerank`, `kb-vision-ocr`, `kb-index`,
  `kb-smoke`, `kb-architecture`) → `kebab-*` (the same prefix will be
  used once P6+ starts).
- CLI command examples: `kb ingest` / `kb search` / `kb ask` / `kb init` /
  `kb doctor` / `kb inspect` / `kb list` / `kb eval` →
  `kebab <verb>`. Applies to both fenced code blocks and inline backticks.
- XDG paths + env vars + binary path (`target/release/kb` →
  `target/release/kebab`) brought in sync.
- All references unified across design doc / initial report / SMOKE /
  HOTFIXES / phase epics / task specs.
- The `git -c user.name=kb` in task-decomposition.md is author
  information recorded from past git history, so it is kept as-is (the
  author in the actual git history cannot be changed).
- `tasks/phase-5-evaluation.md`: `status: planned` → `completed` is
  included as well (it had not been updated after the P5-1 + P5-2 PRs
  were merged).

## Verification

- `grep -rEn "\bkb-[a-z]|\bkb_[a-z]|\.config/kb\b|kb\.sqlite|\bKB_[A-Z]"
   --include="*.md"` 0 hits (excluding the git author in
  task-decomposition.md).
- All file path references resolve (every renamed file is updated to
  its new path).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 04:01:55 +00:00


---
phase: P3
component: kebab-search (hybrid)
task_id: p3-4
title: "Hybrid Retriever (RRF) over lexical + vector"
status: completed
depends_on: [p2-2, p3-3]
unblocks: [p4-3]
contract_source: ../../docs/superpowers/specs/2026-04-27-kebab-final-form-design.md
contract_sections: [§3.7 RetrievalDetail, §0 Q3, §1.6 search --explain, §6.4 [search] rrf settings]
---
# p3-4 — Hybrid Retriever (RRF)
## Goal
Compose `LexicalRetriever` (p2-2) and a vector retriever wrapper around `LanceVectorStore` (p3-3) into a single `Retriever` that dispatches by `SearchMode`. For `Hybrid`, fuse via Reciprocal Rank Fusion (RRF) and populate full `RetrievalDetail` per `SearchHit`.
## Why now / why this size
Single mediator. Keeps the lexical and vector retrievers focused; only this task knows how to fuse. RAG (p4-3) consumes hybrid output without caring about the underlying retrievers.
## Allowed dependencies
- `kebab-core`
- `kebab-config`
- `kebab-store-sqlite` (for `LexicalRetriever`)
- `kebab-store-vector` (for `LanceVectorStore`)
- `kebab-embed` (trait only — for query embedding via `Embedder`)
- `tracing`
- `thiserror`
## Forbidden dependencies
- `kebab-source-fs`, `kebab-parse-md`, `kebab-normalize`, `kebab-chunk`, `kebab-llm*`, `kebab-rag`, `kebab-tui`, `kebab-desktop`. (`kebab-embed-local` is a runtime-injected `dyn Embedder`; this crate must not depend on the concrete adapter directly.)
## Inputs
| input | type | source |
|-------|------|--------|
| `LexicalRetriever` | trait object | constructed elsewhere |
| `LanceVectorStore` | trait object | constructed elsewhere |
| `Box<dyn Embedder>` | for query embedding | runtime-injected |
| `kebab-config::Config.search` | `default_k`, `hybrid_fusion`, `rrf_k` | runtime |
| `SearchQuery` | `kebab_core::SearchQuery` | `kebab-app::search` |
## Outputs
| output | type | downstream |
|--------|------|------------|
| `Vec<SearchHit>` (with full `RetrievalDetail`) | `kebab_core::SearchHit` | `kebab-cli` printer, `kebab-rag` packer |
## Public surface (signatures only — no new types)
```rust
pub struct HybridRetriever {
    lexical: std::sync::Arc<dyn kebab_core::Retriever>,
    vector: std::sync::Arc<dyn kebab_core::Retriever>, // wrapper over LanceVectorStore + Embedder
    fusion: FusionPolicy,
    k: usize,
}

pub enum FusionPolicy { Rrf { k_rrf: u32 } }

impl HybridRetriever {
    pub fn new(
        config: &kebab_config::Config,
        lexical: std::sync::Arc<dyn kebab_core::Retriever>,
        vector: std::sync::Arc<dyn kebab_core::Retriever>,
    ) -> Self;
}

impl kebab_core::Retriever for HybridRetriever {
    fn search(&self, query: &kebab_core::SearchQuery) -> anyhow::Result<Vec<kebab_core::SearchHit>>;
    fn index_version(&self) -> kebab_core::IndexVersion;
}

/// Wrapper that turns a VectorStore + Embedder into a Retriever.
pub struct VectorRetriever {
    store: std::sync::Arc<dyn kebab_core::VectorStore>,
    embed: std::sync::Arc<dyn kebab_core::Embedder>,
    /* heading_path/snippet enrichment hits SQLite via kebab-store-sqlite read accessor */
}

impl VectorRetriever {
    pub fn new(
        store: std::sync::Arc<dyn kebab_core::VectorStore>,
        embed: std::sync::Arc<dyn kebab_core::Embedder>,
        sqlite: std::sync::Arc<kebab_store_sqlite::SqliteStore>,
    ) -> Self;
}

impl kebab_core::Retriever for VectorRetriever { /* per §7.2 */ }
```
## Behavior contract
- `SearchMode::Lexical` dispatches solely to `lexical`. `RetrievalDetail.method = Lexical`, `vector_*` fields are `None`.
- `SearchMode::Vector` dispatches solely to `vector`. `RetrievalDetail.method = Vector`, `lexical_*` fields are `None`.
- `SearchMode::Hybrid`:
  - run `lexical.search(query)` and `vector.search(query)` in sequence (fan-out is fine; not required).
  - fuse with RRF: `raw(c) = Σ_{m ∈ {lex, vec}} 1 / (k_rrf + rank_m(c))` where `k_rrf` from config (default 60). `rank_m` is 1-based; chunks not appearing in retriever `m` contribute 0.
  - **normalize fusion_score to [0, 1]** (post-merge fix, 2026-05): divide by `num_retrievers / (k_rrf + 1)` so the top-1-everywhere case maps to `1.0` and single-retriever chunks cap around `0.5`. Without this, raw RRF tops out at `≈ 0.033` and is incomparable with the `[0, 1]` lexical / vector `fusion_score` (and incompatible with the `config.rag.score_gate` default `0.05` — every hybrid query refused). RRF's rank ordering is preserved (we divide every score by the same positive constant). See [HOTFIXES.md](../HOTFIXES.md).
  - sort by fused score DESC, take top `query.k`.
  - populate every `SearchHit.retrieval`: `method = Hybrid`, `lexical_score` / `lexical_rank` / `vector_score` / `vector_rank` from each retriever's hit (or `None` if absent), `fusion_score` = normalized fused score.
  - if a chunk appears in only one retriever, its `RetrievalDetail` still gets populated with `Some(...)` from that side and `None` for the other.
  - tie-break by `lexical_rank` ascending, then `chunk_id` ascending (deterministic).
- `VectorRetriever`:
  - embeds the query via `embed.embed(&[EmbeddingInput { text: query.text, kind: Query }])`.
  - calls `VectorStore::search(query_vec, query.k * 2, query.filters)` (over-fetch for filter losses), trims to `k`.
  - hydrates `doc_path` / `heading_path` / `section_label` / `chunker_version` / `embedding_model` from SQLite by joining on `chunk_id`.
  - builds `Citation` from chunk's first source span (same logic as p2-2).
- `index_version()` returns the lexical index version in pure lexical mode, the vector index version in pure vector mode, and `"hybrid:<lex_iv>+<vec_iv>"` otherwise.
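
As a rough illustration of the hybrid bullets above, the following sketch fuses two ranked lists of chunk ids with normalized RRF. It is not the crate's implementation: `Fused` and `rrf_fuse` are hypothetical names, the hit type is reduced to ranks plus a score, and sorting a missing `lexical_rank` last in the tie-break is an assumption the contract does not spell out.

```rust
use std::collections::BTreeMap;

/// Simplified stand-in for a hybrid hit (hypothetical; not `kebab_core::SearchHit`).
#[derive(Debug, Clone, PartialEq)]
pub struct Fused {
    pub chunk_id: String,
    pub lexical_rank: Option<u32>,
    pub vector_rank: Option<u32>,
    pub fusion_score: f64,
}

/// Fuse two ranked lists of chunk ids (best first) with normalized RRF.
pub fn rrf_fuse(lexical: &[&str], vector: &[&str], k_rrf: u32, k: usize) -> Vec<Fused> {
    // chunk_id -> (lexical_rank, vector_rank); ranks are 1-based.
    let mut ranks: BTreeMap<String, (Option<u32>, Option<u32>)> = BTreeMap::new();
    for (i, id) in lexical.iter().enumerate() {
        let entry = ranks.entry(id.to_string()).or_default();
        if entry.0.is_none() {
            entry.0 = Some(i as u32 + 1);
        }
    }
    for (i, id) in vector.iter().enumerate() {
        let entry = ranks.entry(id.to_string()).or_default();
        if entry.1.is_none() {
            entry.1 = Some(i as u32 + 1);
        }
    }

    // One retriever's contribution: 1 / (k_rrf + rank), or 0 when the chunk is absent.
    let term = |rank: Option<u32>| rank.map_or(0.0, |r| 1.0 / f64::from(k_rrf + r));
    // Best possible raw score: rank 1 in both retrievers, i.e. 2 / (k_rrf + 1).
    let max_raw = 2.0 / f64::from(k_rrf + 1);

    let mut fused: Vec<Fused> = ranks
        .into_iter()
        .map(|(chunk_id, (lr, vr))| Fused {
            chunk_id,
            lexical_rank: lr,
            vector_rank: vr,
            fusion_score: (term(lr) + term(vr)) / max_raw,
        })
        .collect();

    // Score descending, then lexical_rank ascending, then chunk_id ascending.
    // Assumption: a chunk with no lexical rank sorts after any ranked chunk.
    fused.sort_by(|a, b| {
        b.fusion_score
            .total_cmp(&a.fusion_score)
            .then_with(|| {
                a.lexical_rank
                    .unwrap_or(u32::MAX)
                    .cmp(&b.lexical_rank.unwrap_or(u32::MAX))
            })
            .then_with(|| a.chunk_id.cmp(&b.chunk_id))
    });
    fused.truncate(k);
    fused
}
```

With `k_rrf = 60`, a chunk ranked first by both retrievers scores `1.0` and a chunk ranked first by only one retriever scores `0.5`, which is the behavior the normalization bullet describes.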
## Storage / wire effects
- Reads only. No mutations.
- Output JSON conforms to `search_hit.v1`.
## Test plan
| kind | description | fixture / data |
|------|-------------|----------------|
| unit | pure lexical mode delegates 1:1 to `lexical.search` | mock retrievers |
| unit | pure vector mode delegates 1:1 to `vector.search` | mock retrievers |
| unit | hybrid: chunk only in lexical receives `vector_*: None`, but still has a fused score | mock retrievers |
| unit | RRF formula matches expected with `k_rrf=60` | inline math test |
| unit | tie-break deterministic (same fused score → stable order) | inline |
| unit | hybrid recall ≥ max(lexical recall, vector recall) on a tiny corpus where each mode finds disjoint hits | tmp DB + Lance + MockEmbedder |
| determinism | identical query twice → byte-identical `Vec<SearchHit>` | tmp DB |
| snapshot | hybrid output JSON stable | `fixtures/search/hybrid/run-1.json` |
All tests under `cargo test -p kebab-search hybrid`.
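
The two delegation rows in the table could be exercised with a call-counting mock along the lines below. This is only a sketch under simplifying assumptions: the `Retriever` trait here is a cut-down stand-in (the real `kebab_core::Retriever` takes a `SearchQuery` and returns `anyhow::Result<Vec<SearchHit>>`), and `Mode`, `Mock`, `mock`, and `Dispatcher` are hypothetical names.

```rust
use std::cell::Cell;

/// Cut-down stand-in for `SearchMode` (hypothetical; hybrid omitted).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Mode {
    Lexical,
    Vector,
}

/// Cut-down stand-in for the retriever trait (hypothetical signature).
pub trait Retriever {
    fn search(&self, text: &str) -> Vec<String>;
}

/// Mock retriever that returns a fixed hit list and counts its calls.
pub struct Mock {
    pub hits: Vec<String>,
    pub calls: Cell<usize>,
}

pub fn mock(hits: &[&str]) -> Mock {
    Mock {
        hits: hits.iter().map(|h| h.to_string()).collect(),
        calls: Cell::new(0),
    }
}

impl Retriever for Mock {
    fn search(&self, _text: &str) -> Vec<String> {
        self.calls.set(self.calls.get() + 1);
        self.hits.clone()
    }
}

/// Dispatches a pure-mode query to exactly one underlying retriever.
pub struct Dispatcher<'a> {
    pub lexical: &'a dyn Retriever,
    pub vector: &'a dyn Retriever,
}

impl Dispatcher<'_> {
    pub fn search(&self, mode: Mode, text: &str) -> Vec<String> {
        match mode {
            Mode::Lexical => self.lexical.search(text),
            Mode::Vector => self.vector.search(text),
        }
    }
}
```

A test would then assert both that the returned hits equal the mock's fixed list and that the other mock's call counter stayed at zero, which is what "delegates 1:1" is taken to mean here.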
## Definition of Done
- [ ] `cargo check -p kebab-search` passes
- [ ] `cargo test -p kebab-search hybrid` passes
- [ ] No imports outside Allowed dependencies
- [ ] PR links design §3.7, §6.4 search, §0 Q3
## Out of scope
- Reranker (P+).
- Multimodal retrieval (image/audio) — P6+.
- Score calibration across modes (RRF makes scores rank-comparable; absolute calibration is P+).
## Risks / notes
- Mismatched `index_version` between lexical and vector should be flagged at construction so users notice stale indexes.
- The vector retriever over-fetches only `2 * k` candidates, which is a conservative margin; if filters reject most or all of them, the vector side returns fewer than `k` hits and the hybrid result may shrink as well. Document this in CLI `--explain`.
- RRF is rank-based, so absolute lexical bm25 normalization (p2-2) doesn't affect fused order; still keep normalization for `--explain` readability.
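
To make the over-fetch note concrete, here is a small sketch of "fetch `2 * k`, filter, trim to `k`". It assumes the filter is applied to candidates after the nearest-neighbor fetch, which this spec does not state (the contract passes `query.filters` into `VectorStore::search`); `overfetch_then_trim` is a hypothetical helper, not part of any kebab crate.

```rust
/// Look at the first `2 * k` ranked candidates, drop those the filter rejects,
/// and keep at most `k` survivors in their original order.
pub fn overfetch_then_trim<T, F>(ranked: Vec<T>, k: usize, keep: F) -> Vec<T>
where
    F: Fn(&T) -> bool,
{
    ranked
        .into_iter()
        .take(k.saturating_mul(2)) // over-fetch window
        .filter(|candidate| keep(candidate))
        .take(k) // trim to the requested size
        .collect()
}
```

When the filter rejects more than half of the window, fewer than `k` items survive, which is the shrinkage the note above asks `--explain` to surface.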