From 0359bd96824eab21a362ac1d51e47ca92bb98ff1 Mon Sep 17 00:00:00 2001
From: th-kim0823
Date: Sun, 10 May 2026 17:40:47 +0900
Subject: [PATCH 1/9] spec(fb-38): score semantics design
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
- Add an optional score_kind field (rrf | bm25 | cosine) to search_hit.v1
- LexicalRetriever → Bm25, VectorRetriever → Cosine, HybridRetriever → Rrf
- Mode-dispatch hits from fb-37 search_with_trace keep the underlying
retriever's score_kind unchanged
- README + design §4 + SKILL carry the full RRF formula plus a "ranking signal,
NOT confidence" note; agent trust thresholds come from nested retrieval.{lexical,vector}_score
- Additive minor wire change; no schema bump
Co-Authored-By: Claude Opus 4.7 (1M context)
---
...6-05-10-p9-fb-38-score-semantics-design.md | 173 ++++++++++++++++++
1 file changed, 173 insertions(+)
create mode 100644 docs/superpowers/specs/2026-05-10-p9-fb-38-score-semantics-design.md
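
Note on the RRF arithmetic described in this spec (illustrative only, not part of the patch). The spec normalizes the hybrid score by the two-channel ceiling `2/(k+1)`. The sketch below works through that arithmetic, assuming exactly two channels and 1-based ranks. `raw_rrf` and `normalized_rrf` are hypothetical names chosen for this note; they are not functions from kebab-search.

```rust
/// Raw RRF for one chunk: the sum of 1 / (k + rank) over the channels in
/// which the chunk appeared. `None` means the chunk is absent from a channel.
/// (Hypothetical helper for illustration; ranks are assumed to be 1-based.)
fn raw_rrf(k_rrf: u32, ranks: &[Option<u32>]) -> f64 {
    ranks
        .iter()
        .flatten()
        .map(|&rank| 1.0 / (f64::from(k_rrf) + f64::from(rank)))
        .sum()
}

/// Divide by the two-channel ceiling 2 / (k + 1), so that rank 1 in both
/// channels maps to 1.0 and rank 1 in a single channel maps to 0.5.
fn normalized_rrf(k_rrf: u32, lexical_rank: Option<u32>, vector_rank: Option<u32>) -> f64 {
    let ceiling = 2.0 / (f64::from(k_rrf) + 1.0);
    raw_rrf(k_rrf, &[lexical_rank, vector_rank]) / ceiling
}
```

With `k_rrf = 60`, a chunk ranked first in only one channel normalizes to 0.5, which is why a score of 0.5 reflects how the formula is built rather than a 50% confidence.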
diff --git a/docs/superpowers/specs/2026-05-10-p9-fb-38-score-semantics-design.md b/docs/superpowers/specs/2026-05-10-p9-fb-38-score-semantics-design.md
new file mode 100644
index 0000000..b4d4726
--- /dev/null
+++ b/docs/superpowers/specs/2026-05-10-p9-fb-38-score-semantics-design.md
@@ -0,0 +1,173 @@
+---
+title: "p9-fb-38 — Score semantics design"
+phase: P9
+component: kebab-core + kebab-search + kebab-cli + wire-schema + docs
+task_id: p9-fb-38
+status: design
+target_version: 0.6.0
+contract_source: ../../docs/superpowers/specs/2026-04-27-kebab-final-form-design.md
+contract_sections: [§4 search, §10 UX, wire-schema search_hit.v1]
+date: 2026-05-10
+---
+
+# p9-fb-38 — Score semantics
+
+## Goal
+
+Make the meaning of `search_hit.v1.score` explicit in the wire format and the docs so that agents and external integrations do not mistake it for a confidence value. Two axes:
+
+- **Wire (additive minor)**: add a `score_kind: string` field to `search_hit.v1`, one of `"rrf"` (hybrid) / `"bm25"` (lexical) / `"cosine"` (vector). It declares, per hit, what the top-level `score` means.
+- **Docs**: README + design §4 + SKILL carry the full RRF formula (`1/(k+rank)` per channel, summed per chunk; `2/(k+1)` ceiling; the normalization step) plus a "ranking signal, NOT confidence" note. For agent trust thresholds, the nested `retrieval.lexical_score` / `vector_score` are recommended.
+
+The wire change is additive minor: no schema bump, and existing consumers are unaffected.
+
+## Behavior contract
+
+### Wire shape
+
+**`search_hit.v1`** gains one new optional field:
+
+```jsonc
+{
+  "schema_version": "search_hit.v1",
+  "rank": 1,
+  "score": 0.5,              // existing: RRF normalized (hybrid) or raw (lexical / vector)
+  "score_kind": "rrf",       // new in p9-fb-38: "rrf" | "bm25" | "cosine"
+  // existing fields ...
+  "retrieval": {
+    "method": "hybrid",
+    "fusion_score": 0.5,
+    "lexical_score": 12.34,  // raw BM25, usable as an agent trust threshold
+    "vector_score": 0.78,    // cosine sim, usable as an agent trust threshold
+    "lexical_rank": 1,
+    "vector_rank": 1
+  }
+}
+```
+
+`score_kind` is `#[serde(default)]` (compatible with old readers and old writers). It is not added to the schema's `required` list.
+
+### Score kind dispatch
+
+| Retriever | `score_kind` | Value of top-level `score` |
+|-----------|--------------|--------------------------|
+| LexicalRetriever | `"bm25"` | raw BM25 (≥ 0, unbounded) |
+| VectorRetriever | `"cosine"` | cosine similarity (`[-1, 1]`) |
+| HybridRetriever (fuse) | `"rrf"` | RRF normalized (`[0, 1]`) |
+| HybridRetriever (search_with_trace, mode=Lexical) | `"bm25"` | pass-through from LexicalRetriever |
+| HybridRetriever (search_with_trace, mode=Vector) | `"cosine"` | pass-through from VectorRetriever |
+
+The 1:1 mapping between `SearchMode` and `score_kind` is decided when the hybrid retriever dispatches on mode. Hits in lexical / vector mode keep the kind that the underlying retriever assigned.
+
+### Backwards-compat
+
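+- Old wire reader (a pre-fb-38 binary reading new JSON): its struct has no `score_kind` key, so the field is ignored. No impact.
+- Old wire writer (a new binary reading JSON emitted by a pre-fb-38 binary): `score_kind` is absent, so it falls back to `ScoreKind::default() = ScoreKind::Rrf`. The guess can be wrong (the hit may actually come from lexical / vector mode).
+- Exact semantics are guaranteed only once every binary is v0.6.0 or later.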
+
+## Allowed / forbidden dependencies
+
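+- `kebab-core`: no new deps. Only the enum and the field are added.
+- `kebab-search`: no new deps. Labels `score_kind` at hit construction.
+- `kebab-cli`: unchanged (serde emits the field automatically).
+- `kebab-mcp`: unchanged (`SearchHit` is serialized directly, so the field is included automatically).
+- `kebab-tui`: unchanged.
+
+The rule that `kebab-core` must not depend on other `kebab-*` crates still holds.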
+
+## Public surface delta
+
+### kebab-core (`search.rs`)
+
+```rust
+/// p9-fb-38: declares the meaning of the top-level `SearchHit.score`.
+/// `Rrf` (hybrid) / `Bm25` (lexical-only) / `Cosine` (vector-only).
+#[derive(Clone, Copy, Debug, PartialEq, Eq, Serialize, Deserialize)]
+#[serde(rename_all = "lowercase")]
+pub enum ScoreKind {
+    Rrf,
+    Bm25,
+    Cosine,
+}
+
+impl Default for ScoreKind {
+    fn default() -> Self { ScoreKind::Rrf }
+}
+```
+
+`SearchHit` extension:
+
+```rust
+pub struct SearchHit {
+    // existing fields ...
+    /// p9-fb-38: declares the meaning of the top-level `score`.
+    /// Absent on old wire → defaults to `Rrf` (hybrid is the default mode).
+    #[serde(default)]
+    pub score_kind: ScoreKind,
+}
+```
+
+### kebab-search (lexical / vector / hybrid)
+
+- LexicalRetriever hit construction sets `score_kind: ScoreKind::Bm25`.
+- VectorRetriever hit construction sets `score_kind: ScoreKind::Cosine`.
+- HybridRetriever fuse output hits get `score_kind: ScoreKind::Rrf`.
+- The Lexical/Vector branches of HybridRetriever `search_with_trace` (fb-37) return the underlying retriever's hits unchanged, so `score_kind` is that retriever's label (Bm25 / Cosine).
+
+### kebab-cli + kebab-mcp
+
+Unchanged. `serde_json::to_value(&hit)` emits `score_kind` automatically.
+
+## Test plan
+
+| kind | description |
+|------|-------------|
+| unit (kebab-core) | `ScoreKind` serde — Rrf↔"rrf", Bm25↔"bm25", Cosine↔"cosine" |
+| unit (kebab-core) | `SearchHit` deserialization without `score_kind` → `Rrf` default |
+| unit (kebab-core) | `ScoreKind::default() == Rrf` |
+| unit (kebab-search/lexical) | LexicalRetriever hits have `score_kind == Bm25` |
+| unit (kebab-search/vector) | VectorRetriever hits have `score_kind == Cosine` |
+| unit (kebab-search/hybrid) | HybridRetriever fuse → all hits `Rrf` |
+| unit (kebab-search/hybrid) | search_with_trace mode=Lexical → hits `Bm25` |
+| integration (kebab-cli) | `kebab search Q --mode lexical --json` → `hits[0].score_kind == "bm25"` |
+| integration (kebab-cli) | `kebab search Q --json` (default hybrid) → `hits[0].score_kind == "rrf"` |
+
+A vector-mode integration test would depend on embeddings, so it is replaced by a unit test (search_with_trace with mode=Vector yields `Cosine` hits).
+
+## Implementation steps (high-level)
+
+1. `kebab-core::ScoreKind` enum + `SearchHit.score_kind` field + unit tests.
+2. `kebab-search/lexical.rs`: add the `Bm25` label at LexicalRetriever hit construction + unit test.
+3. `kebab-search/vector.rs`: add the `Cosine` label at VectorRetriever hit construction + unit test.
+4. `kebab-search/hybrid.rs`: `Rrf` in fuse, pass-through in search_with_trace + unit tests.
+5. `kebab-cli` integration tests (lexical-only + hybrid).
+6. `docs/wire-schema/v1/search_hit.schema.json`: add the `score_kind` field.
+7. README: "Score interpretation" section (RRF formula + score_kind table + agent guidance).
+8. design §4 search: RRF formula + normalization definition + score_kind field registration.
+9. SKILL.md: describe `score_kind` in the `mcp__kebab__search` response.
+10. tasks/INDEX.md / spec status flip.
+
+## Risks / notes
+
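+- **If the RRF normalizer changes**: changing the k_rrf default or extending to more than 2 retrievers requires recomputing the ceiling. The design §4 RRF formula and the README Score interpretation section must then be updated.
+- **No vector-mode integration test**: the integration test fixture has no embeddings (`provider = "none"`). Integration tests cover lexical / hybrid only; vector is covered by unit tests.
+- **Consistency with fb-37 search_with_trace**: search_with_trace fills the trace's lex/vec lists with the hits built by the underlying retrievers, so score_kind is preserved automatically. No extra work.
+- **`#[serde(default)]` and unknown fields**: an old wire reader that encounters the `score_kind` key does not reject it as an unknown field (serde's default behavior; `deny_unknown_fields` is not set, verified). Safe.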
+
+## Out of scope
+
+- Renaming or deprecating the top-level `score` (to be considered for v0.7.0+).
+- Further exposure of channel scores (already in the `retrieval` block).
+- Changing the score gate threshold (config.rag.score_gate).
+- TUI score badge / color hint.
+- Per-channel score normalization (BM25 and cosine both stay raw).
+- Validating consistency between `RetrievalDetail.method` and `score_kind` (both derive from the same information, but each is declared separately).
+
+## Documentation updates (in the same PR as the implementation)
+
+- `README.md`: "Score interpretation" section (RRF formula + score_kind table + agent guidance).
+- `docs/superpowers/specs/2026-04-27-kebab-final-form-design.md` §4: RRF formula block + score_kind field registration.
+- `docs/wire-schema/v1/search_hit.schema.json`: `score_kind` enum field.
+- `integrations/claude-code/kebab/SKILL.md`: `mcp__kebab__search` response guidance (score_kind + "ranking signal, NOT confidence" + raw threshold guidance).
+- `tasks/p9/p9-fb-38-score-semantics.md`: `status: open → completed`, links to design + plan.
+- `tasks/INDEX.md`: fb-38 row ✅.
--
2.49.1
From 56f20b723598efef949eecc84d5b451cae9361ca Mon Sep 17 00:00:00 2001
From: th-kim0823
Date: Sun, 10 May 2026 17:45:57 +0900
Subject: [PATCH 2/9] plan(fb-38): score semantics implementation plan
7 tasks: kebab-core ScoreKind enum + SearchHit field, lexical Bm25
labeling, vector Cosine, hybrid Rrf + search_with_trace pass-through,
cross-crate SearchHit literal cleanup, CLI integration test, docs
(wire schema + README + design + SKILL + INDEX).
Co-Authored-By: Claude Opus 4.7 (1M context)
---
.../2026-05-10-p9-fb-38-score-semantics.md | 697 ++++++++++++++++++
1 file changed, 697 insertions(+)
create mode 100644 docs/superpowers/plans/2026-05-10-p9-fb-38-score-semantics.md
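
Note on how a consumer might act on `score_kind` (illustrative only, not part of the patch). The plan repeats the spec's guidance that agents should gate on the nested raw channel scores rather than the top-level `score`. The sketch below assumes simplified stand-in types; `Hit`, `Retrieval`, `passes_trust_gate`, `outranks`, and both threshold constants are hypothetical and would need tuning per corpus.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum ScoreKind {
    Rrf,
    Bm25,
    Cosine,
}

/// Simplified stand-in for the nested `retrieval` block.
struct Retrieval {
    lexical_score: Option<f32>, // raw BM25
    vector_score: Option<f32>,  // raw cosine similarity
}

/// Simplified stand-in for a `search_hit.v1` record.
struct Hit {
    score: f32,
    score_kind: ScoreKind,
    retrieval: Retrieval,
}

// Hypothetical thresholds, chosen only to make the example concrete.
const MIN_BM25: f32 = 5.0;
const MIN_COSINE: f32 = 0.6;

/// Trust gate based on the raw channel scores, ignoring the top-level `score`.
fn passes_trust_gate(hit: &Hit) -> bool {
    let lexical_ok = hit.retrieval.lexical_score.is_some_and(|s| s >= MIN_BM25);
    let vector_ok = hit.retrieval.vector_score.is_some_and(|s| s >= MIN_COSINE);
    lexical_ok || vector_ok
}

/// Compare top-level scores only when both hits share a `score_kind`;
/// scores of different kinds live on different scales.
fn outranks(a: &Hit, b: &Hit) -> Option<bool> {
    (a.score_kind == b.score_kind).then(|| a.score > b.score)
}
```

Returning `None` from `outranks` for mixed kinds is a design choice of this sketch: it forces the caller to decide what to do instead of silently comparing a BM25 value with a cosine value.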
diff --git a/docs/superpowers/plans/2026-05-10-p9-fb-38-score-semantics.md b/docs/superpowers/plans/2026-05-10-p9-fb-38-score-semantics.md
new file mode 100644
index 0000000..a61dea6
--- /dev/null
+++ b/docs/superpowers/plans/2026-05-10-p9-fb-38-score-semantics.md
@@ -0,0 +1,697 @@
+# fb-38 Score Semantics Implementation Plan
+
+> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
+
+**Goal:** Add `score_kind` field on `search_hit.v1` (`"rrf"` / `"bm25"` / `"cosine"`) and document RRF formula + score interpretation so agents stop misreading the top-level `score` as confidence.
+
+**Architecture:** New `ScoreKind` enum on `kebab-core`. Each retriever (lexical / vector / hybrid) labels hits with the appropriate kind at construction. Wire serialization is automatic via existing `serde_json::to_value(&hit)`. Documentation in README + design + SKILL explains the RRF formula and the ranking-vs-confidence distinction.
+
+**Tech Stack:** Rust 2024, serde, JSON Schema 2020-12.
+
+**Spec:** `docs/superpowers/specs/2026-05-10-p9-fb-38-score-semantics-design.md`
+
+---
+
+## File map
+
+**Create:** none.
+
+**Modify:**
+- `crates/kebab-core/src/search.rs` — add `ScoreKind` enum + `SearchHit.score_kind` field; update existing `SearchHit` test fixture.
+- `crates/kebab-search/src/lexical.rs` — set `score_kind: Bm25` at hit construction.
+- `crates/kebab-search/src/vector.rs` — set `score_kind: Cosine` at hit construction.
+- `crates/kebab-search/src/hybrid.rs` — set `score_kind: Rrf` after RRF base.retrieval overwrite; update `mk_hit` test helper.
+- `crates/kebab-rag/src/pipeline.rs` — update `mk_hit` test helper with `score_kind`.
+- `crates/kebab-cli/tests/wire_search_response.rs` (or new) — integration test asserting `score_kind` on lexical / hybrid wire output.
+- `docs/wire-schema/v1/search_hit.schema.json` — add optional `score_kind` enum field.
+- `README.md` — new "Score interpretation (fb-38)" section.
+- `docs/superpowers/specs/2026-04-27-kebab-final-form-design.md` §4 — RRF formula + score_kind field block.
+- `integrations/claude-code/kebab/SKILL.md` — `score_kind` mention + ranking-vs-confidence guidance.
+- `tasks/p9/p9-fb-38-score-semantics.md` — flip status, add design + plan links.
+- `tasks/INDEX.md` — flip fb-38 to ✅.
+
+---
+
+## Task 1: Add ScoreKind enum + SearchHit.score_kind field
+
+**Files:**
+- Modify: `crates/kebab-core/src/search.rs`
+
+- [ ] **Step 1: Append failing tests to `mod tests`**
+
+```rust
+#[test]
+fn score_kind_serde_roundtrip() {
+    use ScoreKind::*;
+    for (kind, expected) in [(Rrf, "rrf"), (Bm25, "bm25"), (Cosine, "cosine")] {
+        let v = serde_json::to_value(kind).unwrap();
+        assert_eq!(v.as_str(), Some(expected));
+        let back: ScoreKind = serde_json::from_value(v).unwrap();
+        assert_eq!(back, kind);
+    }
+}
+
+#[test]
+fn score_kind_default_is_rrf() {
+    assert_eq!(ScoreKind::default(), ScoreKind::Rrf);
+}
+
+#[test]
+fn search_hit_deserialize_without_score_kind_defaults_to_rrf() {
+    // Old wire (pre-fb-38) shape: no `score_kind` field. Must
+    // deserialize cleanly with `Rrf` default.
+    let json = serde_json::json!({
+        "rank": 1,
+        "chunk_id": "c1",
+        "doc_id": "d1",
+        "doc_path": "a.md",
+        "heading_path": [],
+        "section_label": null,
+        "snippet": "x",
+        "citation": { "Line": { "path": "a.md", "start": 1, "end": 1, "section": null } },
+        "retrieval": {
+            "method": "Lexical",
+            "fusion_score": 0.5,
+            "lexical_score": 0.5,
+            "vector_score": null,
+            "lexical_rank": 1,
+            "vector_rank": null
+        },
+        "index_version": "v1",
+        "embedding_model": null,
+        "chunker_version": "c1",
+        "indexed_at": "2026-05-10T12:00:00Z",
+        "stale": false
+    });
+    let hit: SearchHit = serde_json::from_value(json).unwrap();
+    assert_eq!(hit.score_kind, ScoreKind::Rrf);
+}
+```
+
+- [ ] **Step 2: Run tests to verify compile failures**
+
+```bash
+cargo test -p kebab-core --lib score_kind
+```
+Expected: errors — `ScoreKind` undefined; `SearchHit.score_kind` missing.
+
+- [ ] **Step 3: Add `ScoreKind` enum + extend `SearchHit`**
+
+In `crates/kebab-core/src/search.rs`, add the enum (place after `MEDIA_KINDS` constant, before `SearchQuery`):
+
+```rust
+/// p9-fb-38: top-level `SearchHit.score` declaration.
+/// `Rrf` (hybrid) / `Bm25` (lexical-only) / `Cosine` (vector-only).
+#[derive(Clone, Copy, Debug, PartialEq, Eq, Serialize, Deserialize)]
+#[serde(rename_all = "lowercase")]
+pub enum ScoreKind {
+    Rrf,
+    Bm25,
+    Cosine,
+}
+
+impl Default for ScoreKind {
+    fn default() -> Self {
+        ScoreKind::Rrf
+    }
+}
+```
+
+Extend `SearchHit` (add field after `stale`):
+
+```rust
+pub struct SearchHit {
+    // ... existing fields ...
+    pub stale: bool,
+    /// p9-fb-38: declares the meaning of the top-level `score`.
+    /// `Rrf` (hybrid mode), `Bm25` (lexical-only), `Cosine` (vector-only).
+    /// Absent on older wire (pre-fb-38) → defaults to `Rrf`, since hybrid is the default mode.
+    #[serde(default)]
+    pub score_kind: ScoreKind,
+}
+```
+
+Update existing test fixture `search_hit_serializes_indexed_at_and_stale` (~line 190): add `score_kind: ScoreKind::Rrf,` to the struct literal.
+
+- [ ] **Step 4: Run tests**
+
+```bash
+cargo test -p kebab-core --lib
+```
+Expected: all 3 new tests + existing tests pass.
+
+- [ ] **Step 5: Re-export at crate root**
+
+Edit `crates/kebab-core/src/lib.rs` re-export block — add `ScoreKind` to the `search::` re-export list.
+
+```bash
+grep -n "SearchHit\|SearchTrace\|TraceCandidate" crates/kebab-core/src/lib.rs
+```
+
+The fb-37 task added `SearchTrace`/`TraceCandidate`/`TraceFusionInput`/`TraceTiming`/`IndexBytes`/`MEDIA_KINDS` to the same export block — add `ScoreKind` next to them.
+
+- [ ] **Step 6: Commit**
+
+```bash
+git add crates/kebab-core/src/search.rs crates/kebab-core/src/lib.rs
+git commit -m "feat(core): ScoreKind enum + SearchHit.score_kind (fb-38)"
+```
+
+---
+
+## Task 2: Label LexicalRetriever hits as Bm25
+
+**Files:**
+- Modify: `crates/kebab-search/src/lexical.rs`
+
+- [ ] **Step 1: Add unit test in `crates/kebab-search/src/lexical.rs`**
+
+Append to the existing `mod tests` in `lexical.rs` if there is one. Check first:
+
+```bash
+grep -n "mod tests\|#\[test\]" crates/kebab-search/src/lexical.rs | head -5
+```
+
+If `lexical.rs` has no `mod tests`, put the test in an existing integration test file instead (list them with `ls crates/kebab-search/tests/`). Either way, the test should build a lexical retriever against a real fixture and assert on each hit's `score_kind`.
+
+The simplest path is to add an assertion to the smallest existing `lexical_*` integration test. The cleaner path is a new integration test:
+
+Append to `crates/kebab-search/tests/lexical_basic.rs` (or whichever lexical test file the workspace actually has):
+
+```rust
+#[test]
+fn lexical_retriever_hits_carry_bm25_score_kind() {
+    // Use the existing fixture-builder pattern from this file.
+    // The intent: any hit returned by LexicalRetriever has
+    // `score_kind == ScoreKind::Bm25`.
+    let (_dir, retriever) = setup_lexical_with_corpus(&[
+        ("a.md", "rust async tokens"),
+    ]);
+    let hits = retriever
+        .search(&kebab_core::SearchQuery {
+            text: "rust".into(),
+            mode: kebab_core::SearchMode::Lexical,
+            k: 5,
+            filters: Default::default(),
+        })
+        .unwrap();
+    assert!(!hits.is_empty());
+    for h in &hits {
+        assert_eq!(h.score_kind, kebab_core::ScoreKind::Bm25);
+    }
+}
+```
+
+`setup_lexical_with_corpus` is the existing fixture name — adjust to whatever the file's helper is called. If the file uses inline `tempfile::tempdir() + SqliteStore::open + ingest_with_config + LexicalRetriever::with_settings`, mirror that pattern.
+
+- [ ] **Step 2: Run test to verify it fails**
+
+```bash
+cargo test -p kebab-search lexical_retriever_hits_carry_bm25_score_kind
+```
+Expected: compile error (struct literal needs new field) OR assertion failure (score_kind defaults to Rrf, not Bm25).
+
+- [ ] **Step 3: Update `LexicalRetriever` hit construction**
+
+In `crates/kebab-search/src/lexical.rs:447-471`, find the `Ok(SearchHit { ... })` block and add `score_kind: kebab_core::ScoreKind::Bm25,` (anywhere in the field list — placement doesn't matter for serde). Place it next to the `stale: false` line for visual grouping:
+
+```rust
+        Ok(SearchHit {
+            rank,
+            chunk_id: ChunkId(raw.chunk_id),
+            // ... existing fields ...
+            indexed_at,
+            stale: false,
+            score_kind: kebab_core::ScoreKind::Bm25,
+        })
+```
+
+- [ ] **Step 4: Run tests**
+
+```bash
+cargo test -p kebab-search
+```
+Expected: new test passes + all existing kebab-search tests still pass.
+
+- [ ] **Step 5: Clippy**
+
+```bash
+cargo clippy -p kebab-search --all-targets -- -D warnings
+```
+
+- [ ] **Step 6: Commit**
+
+```bash
+git add crates/kebab-search/src/lexical.rs crates/kebab-search/tests/
+git commit -m "feat(search/lexical): label hits with ScoreKind::Bm25 (fb-38)"
+```
+
+---
+
+## Task 3: Label VectorRetriever hits as Cosine
+
+**Files:**
+- Modify: `crates/kebab-search/src/vector.rs`
+
+- [ ] **Step 1: Add unit test**
+
+VectorRetriever requires embeddings, so a real-corpus integration test isn't possible without a model. Add a unit test that constructs a `SearchHit` directly through whichever helper the file uses, OR adjust an existing vector test that already builds a retriever.
+
+Inspect existing tests:
+```bash
+ls crates/kebab-search/tests/ | grep vector
+grep -n "fn build_hit\|VectorRetriever" crates/kebab-search/src/vector.rs | head -5
+```
+
+If there's a private `build_hit` helper, write a unit test around it. Otherwise mirror the lexical test pattern but stub the embedder. Worst case: skip the unit test for VectorRetriever and rely on the hybrid test (Task 4) which exercises the vector path indirectly. Document in the commit message.
+
+For simplicity, the recommended approach: add the score_kind line in Step 2 below first, then add a unit test using a simple hit-construction helper if accessible. If not accessible, the hybrid task (Task 4) covers behavior via the search_with_trace mode=Vector branch.
+
+- [ ] **Step 2: Update `VectorRetriever` hit construction**
+
+In `crates/kebab-search/src/vector.rs:304-330`, find `Ok(SearchHit { ... })` and add:
+
+```rust
+        Ok(SearchHit {
+            rank,
+            // ... existing fields ...
+            indexed_at,
+            stale: false,
+            score_kind: kebab_core::ScoreKind::Cosine,
+        })
+```
+
+- [ ] **Step 3: Run tests**
+
+```bash
+cargo test -p kebab-search
+cargo clippy -p kebab-search --all-targets -- -D warnings
+```
+Expected: existing tests still pass; clippy clean.
+
+- [ ] **Step 4: Commit**
+
+```bash
+git add crates/kebab-search/src/vector.rs
+git commit -m "feat(search/vector): label hits with ScoreKind::Cosine (fb-38)"
+```
+
+---
+
+## Task 4: Label HybridRetriever fuse hits as Rrf + update test helpers
+
+**Files:**
+- Modify: `crates/kebab-search/src/hybrid.rs`
+- Modify: `crates/kebab-rag/src/pipeline.rs` (test helper)
+
+- [ ] **Step 1: Add unit test in `crates/kebab-search/src/hybrid.rs` `mod tests`**
+
+Append:
+
+```rust
+#[test]
+fn hybrid_fuse_labels_hits_as_rrf() {
+    // Reuse mk_hit / Stub from the existing tests in this file.
+    use kebab_core::{ScoreKind, SearchMode, SearchQuery};
+    use std::sync::Arc;
+
+    struct Stub { hits: Vec<SearchHit> }
+    impl Retriever for Stub {
+        fn search(&self, _q: &SearchQuery) -> anyhow::Result<Vec<SearchHit>> {
+            Ok(self.hits.clone())
+        }
+        fn index_version(&self) -> kebab_core::IndexVersion {
+            kebab_core::IndexVersion("v1".into())
+        }
+    }
+
+    let lex = Arc::new(Stub {
+        hits: vec![mk_hit(1, "c1", 0.9, SearchMode::Lexical)],
+    });
+    let vec_r = Arc::new(Stub {
+        hits: vec![mk_hit(1, "c1", 0.8, SearchMode::Vector)],
+    });
+    let hybrid = HybridRetriever::with_policy(
+        lex,
+        vec_r,
+        FusionPolicy::Rrf { k_rrf: 60 },
+        2,
+    );
+    let q = SearchQuery {
+        text: "x".into(),
+        mode: SearchMode::Hybrid,
+        k: 1,
+        filters: Default::default(),
+    };
+    let hits = hybrid.search(&q).unwrap();
+    assert!(!hits.is_empty());
+    assert_eq!(hits[0].score_kind, ScoreKind::Rrf);
+}
+
+#[test]
+fn hybrid_search_with_trace_lexical_mode_passes_through_bm25() {
+    use kebab_core::{ScoreKind, SearchMode, SearchQuery};
+    use std::sync::Arc;
+
+    struct Stub { hits: Vec<SearchHit> }
+    impl Retriever for Stub {
+        fn search(&self, _q: &SearchQuery) -> anyhow::Result<Vec<SearchHit>> {
+            Ok(self.hits.clone())
+        }
+        fn index_version(&self) -> kebab_core::IndexVersion {
+            kebab_core::IndexVersion("v1".into())
+        }
+    }
+
+    let mut lex_hit = mk_hit(1, "c1", 0.5, SearchMode::Lexical);
+    lex_hit.score_kind = ScoreKind::Bm25;
+    let lex = Arc::new(Stub { hits: vec![lex_hit] });
+    let vec_r = Arc::new(Stub { hits: vec![] });
+    let hybrid = HybridRetriever::with_policy(
+        lex,
+        vec_r,
+        FusionPolicy::Rrf { k_rrf: 60 },
+        2,
+    );
+    let q = SearchQuery {
+        text: "x".into(),
+        mode: SearchMode::Lexical,
+        k: 1,
+        filters: Default::default(),
+    };
+    let (hits, _trace) = hybrid.search_with_trace(&q).unwrap();
+    assert!(!hits.is_empty());
+    // search_with_trace mode=Lexical passes through underlying hits.
+    assert_eq!(hits[0].score_kind, ScoreKind::Bm25);
+}
+```
+
+The existing `mk_hit` helper at `hybrid.rs:730` is in the same `mod tests` block — reachable.
+
+- [ ] **Step 2: Run tests to verify failures**
+
+```bash
+cargo test -p kebab-search hybrid
+```
+Expected: compile errors (mk_hit doesn't set score_kind so the struct literal is incomplete; new tests assert wrong value).
+
+- [ ] **Step 3: Update `mk_hit` test helper at `hybrid.rs:730`**
+
+Find `fn mk_hit(rank: u32, chunk: &str, score: f32, mode: SearchMode) -> SearchHit` and add `score_kind` to the returned literal:
+
+```rust
+fn mk_hit(rank: u32, chunk: &str, score: f32, mode: SearchMode) -> SearchHit {
+    SearchHit {
+        // ... existing fields ...
+        indexed_at: time::OffsetDateTime::UNIX_EPOCH,
+        stale: false,
+        score_kind: kebab_core::ScoreKind::Rrf, // tests override per-mode
+    }
+}
+```
+
+- [ ] **Step 4: Update `hybrid.rs` fuse to set Rrf after retrieval overwrite**
+
+Find `base.retrieval = RetrievalDetail { ... }` block (~line 302-314). Immediately AFTER that block (before `hits.push(base)`), add:
+
+```rust
+ base.score_kind = kebab_core::ScoreKind::Rrf;
+ hits.push(base);
+```
+
+(`base` was cloned from a lex/vec hit that had `Bm25`/`Cosine`; the fuse output is RRF-scored so override.)
+
+- [ ] **Step 5: Update `pipeline.rs` mk_hit test helper**
+
+```bash
+grep -n "fn mk_hit" crates/kebab-rag/src/pipeline.rs
+```
+
+At ~line 1092, the test helper builds a SearchHit. Add `score_kind: kebab_core::ScoreKind::Rrf,` to the literal (place after `stale`).
+
+- [ ] **Step 6: Update `kebab-core` test fixture if any other SearchHit literal exists**
+
+```bash
+grep -rn "SearchHit {" crates/ --include="*.rs"
+```
+
+For each location, ensure the literal includes `score_kind`. The Task 1 update on `crates/kebab-core/src/search.rs:190` should already be done. Tasks 2/3 cover the lexical/vector retriever construction. Task 4 covers the `mk_hit` helpers. If any other SearchHit literal turns up (e.g. fb-37 added some in tests), add `score_kind` there too.
+
+- [ ] **Step 7: Run tests + clippy**
+
+```bash
+cargo test -p kebab-core -p kebab-search -p kebab-rag
+cargo clippy -p kebab-core -p kebab-search -p kebab-rag --all-targets -- -D warnings
+```
+Expected: all green.
+
+- [ ] **Step 8: Commit**
+
+```bash
+git add crates/kebab-search/src/hybrid.rs crates/kebab-rag/src/pipeline.rs
+git commit -m "feat(search/hybrid): label fused hits with ScoreKind::Rrf (fb-38)"
+```
+
+---
+
+## Task 5: Workspace tests + cross-crate cleanup for SearchHit literals
+
+**Files:**
+- Modify: any other crate file with `SearchHit {` literal that broke (e.g., `kebab-app`, `kebab-cli`, `kebab-mcp`, `kebab-tui` test fixtures).
+
+- [ ] **Step 1: Find all broken sites**
+
+```bash
+cargo build --workspace 2>&1 | grep "missing field \`score_kind\`" | head -20
+```
+
+This reveals every spot. Common patterns:
+- Test fixtures in `crates/kebab-cli/tests/wire_*.rs` that hand-build hits.
+- Test helpers in `crates/kebab-app/tests/`.
+- TUI test data in `crates/kebab-tui/tests/`.
+
+For each: open the file, find the `SearchHit {` literal, add `score_kind: kebab_core::ScoreKind::Rrf,` (default for test fixtures unless the test specifically exercises lex/vec mode).
+
+- [ ] **Step 2: Verify workspace builds**
+
+```bash
+cargo build --workspace 2>&1 | tail -5
+```
+Expected: clean.
+
+- [ ] **Step 3: Run full workspace tests**
+
+```bash
+cargo test --workspace --no-fail-fast -j 1
+cargo clippy --workspace --all-targets -- -D warnings
+```
+Expected: all green.
+
+- [ ] **Step 4: Commit**
+
+```bash
+git add crates/
+git commit -m "fix(fb-38): add score_kind to remaining SearchHit literals"
+```
+
+---
+
+## Task 6: CLI integration test for score_kind
+
+**Files:**
+- Modify: `crates/kebab-cli/tests/wire_search_response.rs` (or new file `wire_search_score_kind.rs` if appending feels cluttered)
+
+- [ ] **Step 1: Inspect existing wire test pattern**
+
+```bash
+ls crates/kebab-cli/tests/
+head -50 crates/kebab-cli/tests/wire_search_response.rs
+```
+
+Use the same fixture pattern from fb-37's `wire_search_trace.rs` (`common::write_config + ingest + run_search_with_args`).
+
+- [ ] **Step 2: Add integration tests**
+
+Create `crates/kebab-cli/tests/wire_search_score_kind.rs`:
+
+```rust
+//! p9-fb-38: integration tests for `search_hit.v1.score_kind`.
+
+mod common;
+
+use serde_json::Value;
+use std::fs;
+
+fn doc_with_term(workspace: &std::path::Path) {
+    fs::write(workspace.join("doc1.md"), "# Title\n\nrust async hello\n").unwrap();
+}
+
+#[test]
+fn lexical_mode_hits_carry_bm25_score_kind() {
+    let dir = tempfile::tempdir().unwrap();
+    let (cfg, workspace, _data) = common::write_config(dir.path(), 0);
+    doc_with_term(&workspace);
+    common::ingest(&cfg, &workspace);
+
+    let (stdout, _stderr) = common::run_search_with_args(
+        &cfg,
+        &["--mode", "lexical", "--json", "rust"],
+    );
+    let v: Value = serde_json::from_str(stdout.trim()).expect("valid JSON");
+    let hits = v["hits"].as_array().expect("hits array");
+    assert!(!hits.is_empty(), "expected at least 1 hit");
+    for h in hits {
+        assert_eq!(h["score_kind"], "bm25");
+    }
+}
+
+#[test]
+fn old_wire_reader_compat_score_kind_optional_field() {
+    // The wire schema marks `score_kind` as additive (not required).
+    // We can't easily simulate an old reader from inside Rust, but we
+    // can confirm the JSON includes the field; old readers that
+    // ignore unknown fields are unaffected. This test just ensures
+    // the field is always present in fb-38+ output.
+    let dir = tempfile::tempdir().unwrap();
+    let (cfg, workspace, _data) = common::write_config(dir.path(), 0);
+    doc_with_term(&workspace);
+    common::ingest(&cfg, &workspace);
+
+    let (stdout, _stderr) = common::run_search_with_args(
+        &cfg,
+        &["--mode", "lexical", "--json", "rust"],
+    );
+    let v: Value = serde_json::from_str(stdout.trim()).unwrap();
+    let hit = &v["hits"][0];
+    assert!(hit.get("score_kind").is_some(), "score_kind always emitted");
+}
+```
+
+- [ ] **Step 3: Run integration tests**
+
+```bash
+cargo test -p kebab-cli --test wire_search_score_kind
+```
+Expected: 2 tests pass.
+
+- [ ] **Step 4: Commit**
+
+```bash
+git add crates/kebab-cli/tests/wire_search_score_kind.rs
+git commit -m "test(cli): integration tests for score_kind on lexical mode (fb-38)"
+```
+
+---
+
+## Task 7: Wire schema + docs + status flip
+
+**Files:**
+- Modify: `docs/wire-schema/v1/search_hit.schema.json`
+- Modify: `README.md`
+- Modify: `docs/superpowers/specs/2026-04-27-kebab-final-form-design.md`
+- Modify: `integrations/claude-code/kebab/SKILL.md`
+- Modify: `tasks/p9/p9-fb-38-score-semantics.md`
+- Modify: `tasks/INDEX.md`
+
+- [ ] **Step 1: Update `docs/wire-schema/v1/search_hit.schema.json`**
+
+Add `score_kind` to `properties` (not to `required`). Insert next to `score`:
+
+```json
+ "score_kind": {
+ "type": "string",
+ "enum": ["rrf", "bm25", "cosine"],
+ "description": "p9-fb-38: kind of `score` value. `rrf` = RRF normalized [0,1] (hybrid mode); `bm25` = raw BM25 score (lexical-only); `cosine` = raw cosine similarity (vector-only). Older clients that omit this field can treat absence as `rrf` (the historical default)."
+ }
+```
+
+- [ ] **Step 2: Update `README.md`**
+
+Find the `kebab search` section (or wherever flag descriptions live). Add a new "Score interpretation (fb-38)" subsection:
+
+````markdown
+### Score interpretation (fb-38)
+
+`search_hit.v1.score` is a **ranking signal**, not a confidence value. The `score_kind` field declares its meaning:
+
+| `score_kind` | Meaning | Range |
+|--------------|---------|-------|
+| `rrf` (hybrid) | RRF normalized | `[0, 1]`, ceiling = 1.0 (rank=1 in both channels) |
+| `bm25` (lexical) | raw BM25 | unbounded (≥ 0) |
+| `cosine` (vector) | cosine sim | `[-1, 1]` |
+
+#### RRF formula (hybrid mode)
+
+```
+raw RRF of chunk c = Σ_m 1 / (k_rrf + rank_m(c))
+
+where m ∈ {lexical, vector} and k_rrf = config.search.rrf_k (default 60).
+When both channels have rank=1, raw RRF = 2 / (k_rrf + 1) ≈ 0.0328.
+
+normalize: rrf_score = raw_rrf / (2 / (k_rrf + 1))
+ → rrf_score ∈ [0, 1]. rank=1 in both channels → 1.0; present in only one channel → ceiling of 0.5.
+```
+
+`rrf_score = 0.5` means the chunk appeared at rank 1 in exactly one channel (lexical or vector). It is not 50% confidence; it is the arithmetic ceiling of the RRF formula for a single channel.
+
+If an agent needs a trust threshold, it should use the nested `retrieval.lexical_score` (raw BM25) / `retrieval.vector_score` (raw cosine) rather than the top-level `score`.
+````
+
+Place after the `kebab search` flag table or wherever similar reference content lives. If the README has existing `kebab search` row in a command table, add a `--trace` neighbor cross-reference here.
+
+- [ ] **Step 3: Update `docs/superpowers/specs/2026-04-27-kebab-final-form-design.md` §4 search**
+
+Add a new "Score scale (fb-38)" subsection under §4 with the same RRF formula block + `score_kind` field definition. The frozen design doc gets the contract; README is the user-facing copy.
+
+```bash
+grep -n "^## §4\|^### §4\|RRF\|hybrid_fusion" docs/superpowers/specs/2026-04-27-kebab-final-form-design.md | head -10
+```
+
+Locate the §4 search section and append the score scale block.
+
+- [ ] **Step 4: Update `integrations/claude-code/kebab/SKILL.md`**
+
+Find the `mcp__kebab__search` response shape block. Add a sentence:
+
+> `hits[].score_kind`: `"rrf"` (hybrid) / `"bm25"` (lexical) / `"cosine"` (vector). It declares what the top-level `score` means; it is not a confidence value. If a trust threshold is needed, use the raw `retrieval.lexical_score` / `retrieval.vector_score`.
+
+- [ ] **Step 5: Update `tasks/p9/p9-fb-38-score-semantics.md`**
+
+Flip frontmatter `status: open` → `status: completed`. Replace the skeleton banner with:
+
+```markdown
+> ✅ **Implementation complete.** This spec is frozen as of implementation time.
+>
+> - Design: [`docs/superpowers/specs/2026-05-10-p9-fb-38-score-semantics-design.md`](../../docs/superpowers/specs/2026-05-10-p9-fb-38-score-semantics-design.md)
+> - Plan: [`docs/superpowers/plans/2026-05-10-p9-fb-38-score-semantics.md`](../../docs/superpowers/plans/2026-05-10-p9-fb-38-score-semantics.md)
+```
+
+- [ ] **Step 6: Update `tasks/INDEX.md`**
+
+Find the fb-38 row. Flip status to ✅, mirror format of fb-32..37 rows.
+
+- [ ] **Step 7: Run full workspace tests + clippy**
+
+```bash
+cargo test --workspace --no-fail-fast -j 1
+cargo clippy --workspace --all-targets -- -D warnings
+```
+Expected: all green.
+
+- [ ] **Step 8: Commit**
+
+```bash
+git add docs/ README.md tasks/p9/p9-fb-38-score-semantics.md tasks/INDEX.md integrations/claude-code/kebab/SKILL.md
+git commit -m "docs(fb-38): wire schema + README + design + SKILL + INDEX"
+```
+
+---
+
+## Final verification checklist
+
+- [ ] `cargo test --workspace --no-fail-fast -j 1` green
+- [ ] `cargo clippy --workspace --all-targets -- -D warnings` clean
+- [ ] Manual smoke against `/tmp/kebab-smoke`:
+ - [ ] `kebab search Q --mode lexical --json | jq '.hits[0].score_kind'` returns `"bm25"`
+ - [ ] `kebab search Q --json | jq '.hits[0].score_kind'` returns `"rrf"` (hybrid default)
+- [ ] README, design §4, SKILL, INDEX all reflect score_kind + RRF formula
--
2.49.1
From 3c605b1a5d87646bd51e32a45a253ad40639b9b4 Mon Sep 17 00:00:00 2001
From: th-kim0823
Date: Sun, 10 May 2026 17:49:02 +0900
Subject: [PATCH 3/9] feat(core): ScoreKind enum + SearchHit.score_kind (fb-38)
---
crates/kebab-core/src/lib.rs | 2 +-
crates/kebab-core/src/search.rs | 62 +++++++++++++++++++++++++++++++++
2 files changed, 63 insertions(+), 1 deletion(-)
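
For wire readers that do not use serde, the absent-field rule introduced by this patch can be sketched in std-only Rust. This is a minimal illustration, assuming only the three lowercase wire strings from this patch. `score_kind_from_wire` is a hypothetical helper name, not part of `kebab-core`, and the local `ScoreKind` merely mirrors the enum added in the diff.

```rust
/// Std-only mirror of the `ScoreKind` wire values.
/// Hypothetical illustration; the real enum lives in kebab-core and uses serde.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ScoreKind {
    Rrf,
    Bm25,
    Cosine,
}

/// Interpret the optional `score_kind` string of a `search_hit.v1` payload.
/// `None` (field absent, pre-fb-38 wire) falls back to `Rrf`.
/// An unrecognised string is reported as an error rather than guessed;
/// matching is case-sensitive, like the lowercase serde rename.
pub fn score_kind_from_wire(raw: Option<&str>) -> Result<ScoreKind, String> {
    match raw {
        None => Ok(ScoreKind::Rrf),
        Some("rrf") => Ok(ScoreKind::Rrf),
        Some("bm25") => Ok(ScoreKind::Bm25),
        Some("cosine") => Ok(ScoreKind::Cosine),
        Some(other) => Err(format!("unknown score_kind: {other}")),
    }
}
```

A caller would pass `None` when the JSON object has no `score_kind` key, which matches the `#[serde(default)]` behaviour exercised by `search_hit_deserialize_without_score_kind_defaults_to_rrf`.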
diff --git a/crates/kebab-core/src/lib.rs b/crates/kebab-core/src/lib.rs
index 1cee095..3d55855 100644
--- a/crates/kebab-core/src/lib.rs
+++ b/crates/kebab-core/src/lib.rs
@@ -51,7 +51,7 @@ pub use metadata::{
TrustLevel,
};
pub use search::{
- DocFilter, DocSummary, IndexBytes, MEDIA_KINDS, RetrievalDetail, SearchFilters, SearchHit,
+ DocFilter, DocSummary, IndexBytes, MEDIA_KINDS, RetrievalDetail, ScoreKind, SearchFilters, SearchHit,
SearchMode, SearchOpts, SearchQuery, SearchTrace, TraceCandidate, TraceFusionInput,
TraceTiming,
};
diff --git a/crates/kebab-core/src/search.rs b/crates/kebab-core/src/search.rs
index 38e41ad..3262ca3 100644
--- a/crates/kebab-core/src/search.rs
+++ b/crates/kebab-core/src/search.rs
@@ -31,6 +31,17 @@ pub struct SearchQuery {
/// before populating this Vec.
pub const MEDIA_KINDS: &[&str] = &["markdown", "pdf", "image", "audio", "other"];
+/// p9-fb-38: top-level `SearchHit.score` declaration.
+/// `Rrf` (hybrid) / `Bm25` (lexical-only) / `Cosine` (vector-only).
+#[derive(Clone, Copy, Debug, Default, PartialEq, Eq, Serialize, Deserialize)]
+#[serde(rename_all = "lowercase")]
+pub enum ScoreKind {
+ #[default]
+ Rrf,
+ Bm25,
+ Cosine,
+}
+
#[derive(Clone, Debug, Default, PartialEq, Serialize, Deserialize)]
pub struct SearchFilters {
pub tags_any: Vec,
@@ -73,6 +84,11 @@ pub struct SearchHit {
/// p9-fb-32: server-computed `now - indexed_at > threshold` per
/// `config.search.stale_threshold_days`. `false` when threshold = 0.
pub stale: bool,
+ /// p9-fb-38: declares the meaning of the top-level `score`.
+ /// `Rrf` (hybrid mode), `Bm25` (lexical-only), `Cosine` (vector-only).
+ /// When absent (pre-fb-38 wire), defaults to `Rrf` because hybrid is the default mode.
+ #[serde(default)]
+ pub score_kind: ScoreKind,
}
#[derive(Clone, Debug, PartialEq, Serialize, Deserialize)]
@@ -214,6 +230,7 @@ mod tests {
chunker_version: ChunkerVersion("c1".to_string()),
indexed_at: datetime!(2026-05-09 12:00:00 UTC),
stale: true,
+ score_kind: ScoreKind::Rrf,
};
let v = serde_json::to_value(&hit).unwrap();
assert_eq!(v["indexed_at"], "2026-05-09T12:00:00Z");
@@ -294,4 +311,49 @@ mod tests {
let opts = SearchOpts::default();
assert!(!opts.trace);
}
+
+ #[test]
+ fn score_kind_serde_roundtrip() {
+ use ScoreKind::*;
+ for (kind, expected) in [(Rrf, "rrf"), (Bm25, "bm25"), (Cosine, "cosine")] {
+ let v = serde_json::to_value(kind).unwrap();
+ assert_eq!(v.as_str(), Some(expected));
+ let back: ScoreKind = serde_json::from_value(v).unwrap();
+ assert_eq!(back, kind);
+ }
+ }
+
+ #[test]
+ fn score_kind_default_is_rrf() {
+ assert_eq!(ScoreKind::default(), ScoreKind::Rrf);
+ }
+
+ #[test]
+ fn search_hit_deserialize_without_score_kind_defaults_to_rrf() {
+ let json = serde_json::json!({
+ "rank": 1,
+ "chunk_id": "c1",
+ "doc_id": "d1",
+ "doc_path": "a.md",
+ "heading_path": [],
+ "section_label": null,
+ "snippet": "x",
+ "citation": { "kind": "line", "path": "a.md", "start": 1, "end": 1, "section": null },
+ "retrieval": {
+ "method": "lexical",
+ "fusion_score": 0.5,
+ "lexical_score": 0.5,
+ "vector_score": null,
+ "lexical_rank": 1,
+ "vector_rank": null
+ },
+ "index_version": "v1",
+ "embedding_model": null,
+ "chunker_version": "c1",
+ "indexed_at": "2026-05-10T12:00:00Z",
+ "stale": false
+ });
+ let hit: SearchHit = serde_json::from_value(json).unwrap();
+ assert_eq!(hit.score_kind, ScoreKind::Rrf);
+ }
}
--
2.49.1
From 3a621bba0d015676030c9a3b0f6239fb5fa05b1e Mon Sep 17 00:00:00 2001
From: th-kim0823
Date: Sun, 10 May 2026 17:54:11 +0900
Subject: [PATCH 4/9] feat(search/lexical): label hits with ScoreKind::Bm25
(fb-38 task 2)
- Add ScoreKind::Bm25 to LexicalRetriever::build_hit SearchHit construction
- Import ScoreKind from kebab_core in lexical.rs
- Add integration test lexical_retriever_hits_carry_bm25_score_kind to verify all
hits from LexicalRetriever carry score_kind == ScoreKind::Bm25
- Update lexical snapshot test baseline to include new score_kind field
Co-Authored-By: Claude Opus 4.7 (1M context)
---
crates/kebab-search/src/lexical.rs | 3 +-
crates/kebab-search/tests/lexical.rs | 51 ++++++++++++++++++++++++++--
2 files changed, 51 insertions(+), 3 deletions(-)
diff --git a/crates/kebab-search/src/lexical.rs b/crates/kebab-search/src/lexical.rs
index bfdd0f7..9d83b8f 100644
--- a/crates/kebab-search/src/lexical.rs
+++ b/crates/kebab-search/src/lexical.rs
@@ -11,7 +11,7 @@ use anyhow::{Context, Result};
use globset::GlobMatcher;
use kebab_core::{
ChunkId, ChunkerVersion, DocumentId, IndexVersion, RetrievalDetail, Retriever,
- SearchFilters, SearchHit, SearchMode, SearchQuery, SourceSpan, TrustLevel,
+ ScoreKind, SearchFilters, SearchHit, SearchMode, SearchQuery, SourceSpan, TrustLevel,
WorkspacePath,
};
use kebab_store_sqlite::SqliteStore;
@@ -469,6 +469,7 @@ fn build_hit(
// (called from `App::search` / `App::search_uncached`) and the equivalent
// in `RagPipeline::ask` against the configured threshold.
stale: false,
+ score_kind: ScoreKind::Bm25,
})
}
diff --git a/crates/kebab-search/tests/lexical.rs b/crates/kebab-search/tests/lexical.rs
index 4265160..ba8f9b8 100644
--- a/crates/kebab-search/tests/lexical.rs
+++ b/crates/kebab-search/tests/lexical.rs
@@ -9,8 +9,8 @@ use std::sync::Arc;
use kebab_config::Config;
use kebab_core::{
- DocumentId, IndexVersion, Lang, MediaType, Retriever, SearchFilters, SearchHit, SearchMode,
- SearchQuery, TrustLevel,
+ DocumentId, IndexVersion, Lang, MediaType, Retriever, ScoreKind, SearchFilters, SearchHit,
+ SearchMode, SearchQuery, TrustLevel,
};
use kebab_search::LexicalRetriever;
use kebab_store_sqlite::SqliteStore;
@@ -683,6 +683,53 @@ fn search_hit_carries_indexed_at_from_documents_updated_at() {
assert!(!hit.stale, "lexical retriever must default stale=false");
}
+#[test]
+fn lexical_retriever_hits_carry_bm25_score_kind() {
+ // p9-fb-38: verify that every hit returned by LexicalRetriever
+ // has score_kind == ScoreKind::Bm25. This establishes the
+ // relationship: Lexical-only search → Bm25 score semantics.
+ let env = Env::new();
+ let conn = env.raw_conn();
+ insert_document(&conn, &id32("d"), "notes/bm25.md", "Bm25", "en", "primary", &[]);
+ for (cid, body) in [
+ ("c1", "alpha bravo charlie"),
+ ("c2", "alpha delta"),
+ ("c3", "bravo echo"),
+ ] {
+ insert_chunk(
+ &conn,
+ &id32(cid),
+ &id32("d"),
+ body,
+ &["Bm25"],
+ None,
+ r#"[{"kind":"line","start":1,"end":1}]"#,
+ "v1",
+ );
+ }
+ drop(conn);
+
+ let r = env.retriever();
+ let hits = r
+ .search(&SearchQuery {
+ text: "alpha".to_string(),
+ mode: SearchMode::Lexical,
+ k: 10,
+ filters: SearchFilters::default(),
+ })
+ .expect("search");
+ assert!(
+ !hits.is_empty(),
+ "fixture should produce at least one hit for 'alpha'"
+ );
+ for h in &hits {
+ assert_eq!(
+ h.score_kind, ScoreKind::Bm25,
+ "lexical retriever must label all hits with ScoreKind::Bm25"
+ );
+ }
+}
+
// ── TestEnv helper for fb-36 filter tests ───────────────────────────────
/// Convenience wrapper over `Env` that exposes higher-level fixture helpers
--
2.49.1
From 4e739f3cd8cf45ecc21d3bc6408ee19877a41c57 Mon Sep 17 00:00:00 2001
From: th-kim0823
Date: Sun, 10 May 2026 17:54:16 +0900
Subject: [PATCH 5/9] feat(search): add score_kind to VectorRetriever (Cosine)
and hybrid test helpers (Rrf)
This commit unblocks Tasks 3 and 4 of fb-38:
- VectorRetriever::build_hit now labels hits with ScoreKind::Cosine
- Hybrid retriever test helpers (mk_hit functions) label synthetic hits with ScoreKind::Rrf
- Updated lexical snapshot fixture to reflect new score_kind field in output
Co-Authored-By: Claude Opus 4.7 (1M context)
---
crates/kebab-search/src/hybrid.rs | 2 ++
crates/kebab-search/src/vector.rs | 3 ++-
crates/kebab-search/tests/fixtures/search/lexical/run-1.json | 2 ++
3 files changed, 6 insertions(+), 1 deletion(-)
diff --git a/crates/kebab-search/src/hybrid.rs b/crates/kebab-search/src/hybrid.rs
index 7f415a9..e285915 100644
--- a/crates/kebab-search/src/hybrid.rs
+++ b/crates/kebab-search/src/hybrid.rs
@@ -505,6 +505,7 @@ mod tests {
// a fixed UNIX_EPOCH so synthetic hits remain deterministic.
indexed_at: time::OffsetDateTime::UNIX_EPOCH,
stale: false,
+ score_kind: kebab_core::ScoreKind::Rrf,
}
}
@@ -755,6 +756,7 @@ mod tests {
chunker_version: ChunkerVersion("c1".into()),
indexed_at: time::OffsetDateTime::UNIX_EPOCH,
stale: false,
+ score_kind: kebab_core::ScoreKind::Rrf,
}
}
diff --git a/crates/kebab-search/src/vector.rs b/crates/kebab-search/src/vector.rs
index 9bf74c7..47eda97 100644
--- a/crates/kebab-search/src/vector.rs
+++ b/crates/kebab-search/src/vector.rs
@@ -21,7 +21,7 @@ use std::sync::Arc;
use anyhow::{Context, Result};
use kebab_core::{
ChunkId, ChunkerVersion, DocumentId, Embedder, EmbeddingInput, EmbeddingKind,
- IndexVersion, RetrievalDetail, Retriever, SearchHit, SearchMode, SearchQuery,
+ IndexVersion, RetrievalDetail, Retriever, ScoreKind, SearchHit, SearchMode, SearchQuery,
SourceSpan, VectorHit, VectorStore, WorkspacePath,
};
use kebab_store_sqlite::SqliteStore;
@@ -326,6 +326,7 @@ fn build_hit(
// (called from `App::search` / `App::search_uncached`) and the equivalent
// in `RagPipeline::ask` against the configured threshold.
stale: false,
+ score_kind: ScoreKind::Cosine,
})
}
diff --git a/crates/kebab-search/tests/fixtures/search/lexical/run-1.json b/crates/kebab-search/tests/fixtures/search/lexical/run-1.json
index 2500cd4..d6ae0dc 100644
--- a/crates/kebab-search/tests/fixtures/search/lexical/run-1.json
+++ b/crates/kebab-search/tests/fixtures/search/lexical/run-1.json
@@ -26,6 +26,7 @@
"vector_rank": null,
"vector_score": null
},
+ "score_kind": "bm25",
"section_label": "Snap",
"snippet": "alpha alpha",
"stale": false
@@ -57,6 +58,7 @@
"vector_rank": null,
"vector_score": null
},
+ "score_kind": "bm25",
"section_label": "Snap",
"snippet": "alpha bravo charlie",
"stale": false
--
2.49.1
From b51cdb9e8ff0e899f065231d3097f5e6419dd173 Mon Sep 17 00:00:00 2001
From: th-kim0823
Date: Sun, 10 May 2026 17:56:56 +0900
Subject: [PATCH 6/9] feat(search/hybrid): fuse hits override score_kind to Rrf
(fb-38)
---
crates/kebab-rag/src/pipeline.rs | 1 +
crates/kebab-rag/tests/common/mod.rs | 1 +
crates/kebab-search/src/hybrid.rs | 83 ++++++++++++++++++++++++++++
3 files changed, 85 insertions(+)
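
The normalized value that `ScoreKind::Rrf` labels can be sketched from the formula in the fb-38 docs: `raw = Σ 1 / (k_rrf + rank)` over the channels where the chunk appears, divided by the two-channel ceiling `2 / (k_rrf + 1)`. This is an illustration under stated assumptions, not the `HybridRetriever` implementation. `rrf_normalized` is a hypothetical name, ranks are taken to be 1-based, and a channel in which the chunk does not appear is modeled as `None`.

```rust
/// Normalized RRF score following the documented formula.
/// Hypothetical helper for illustration; not the crate's fusion code.
pub fn rrf_normalized(lexical_rank: Option<u32>, vector_rank: Option<u32>, k_rrf: u32) -> f64 {
    let k = f64::from(k_rrf);
    // Sum 1 / (k + rank) over the channels in which the chunk appears.
    let raw: f64 = [lexical_rank, vector_rank]
        .iter()
        .flatten()
        .map(|&rank| 1.0 / (k + f64::from(rank)))
        .sum();
    // Ceiling: rank 1 in both channels.
    let ceiling = 2.0 / (k + 1.0);
    raw / ceiling
}
```

With `k_rrf = 60`, rank 1 in both channels gives 1.0 and rank 1 in a single channel gives 0.5. A chunk ranked 62nd in both channels also lands on 0.5, which is one more reason not to read the value as a confidence.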
diff --git a/crates/kebab-rag/src/pipeline.rs b/crates/kebab-rag/src/pipeline.rs
index bf70900..47bfee1 100644
--- a/crates/kebab-rag/src/pipeline.rs
+++ b/crates/kebab-rag/src/pipeline.rs
@@ -1117,6 +1117,7 @@ mod stream_event_serde_tests {
chunker_version: ChunkerVersion("c@1".into()),
indexed_at: datetime!(2026-05-09 12:00:00 UTC),
stale: false,
+ score_kind: kebab_core::ScoreKind::Rrf,
}
}
diff --git a/crates/kebab-rag/tests/common/mod.rs b/crates/kebab-rag/tests/common/mod.rs
index 7e0521d..022176c 100644
--- a/crates/kebab-rag/tests/common/mod.rs
+++ b/crates/kebab-rag/tests/common/mod.rs
@@ -170,6 +170,7 @@ pub fn mk_hit_with_indexed_at(
// + cfg threshold; tests configure both via this helper.
indexed_at,
stale: false,
+ score_kind: kebab_core::ScoreKind::Rrf,
}
}
diff --git a/crates/kebab-search/src/hybrid.rs b/crates/kebab-search/src/hybrid.rs
index e285915..6d9286b 100644
--- a/crates/kebab-search/src/hybrid.rs
+++ b/crates/kebab-search/src/hybrid.rs
@@ -313,6 +313,9 @@ impl HybridRetriever {
lexical_rank: s.lex_rank,
vector_rank: s.vec_rank,
};
+ // p9-fb-38: base was cloned from a lex/vec hit (Bm25/Cosine);
+ // fuse output is RRF-scored so override.
+ base.score_kind = kebab_core::ScoreKind::Rrf;
hits.push(base);
}
@@ -824,4 +827,84 @@ mod tests {
assert!(trace.vector.is_empty());
assert_eq!(trace.timing.vector_ms, 0);
}
+
+ #[test]
+ fn hybrid_fuse_labels_hits_as_rrf() {
+ use kebab_core::{ScoreKind, SearchMode, SearchQuery};
+ use std::sync::Arc;
+
+ struct Stub {
+ hits: Vec<SearchHit>,
+ }
+ impl Retriever for Stub {
+ fn search(&self, _q: &SearchQuery) -> anyhow::Result<Vec<SearchHit>> {
+ Ok(self.hits.clone())
+ }
+ fn index_version(&self) -> kebab_core::IndexVersion {
+ kebab_core::IndexVersion("v1".into())
+ }
+ }
+
+ let lex = Arc::new(Stub {
+ hits: vec![mk_hit("c1", 1, SearchMode::Lexical, 0.9)],
+ });
+ let vec_r = Arc::new(Stub {
+ hits: vec![mk_hit("c1", 1, SearchMode::Vector, 0.8)],
+ });
+ let hybrid = HybridRetriever::with_policy(
+ lex,
+ vec_r,
+ FusionPolicy::Rrf { k_rrf: 60 },
+ 2,
+ );
+ let q = SearchQuery {
+ text: "x".into(),
+ mode: SearchMode::Hybrid,
+ k: 1,
+ filters: Default::default(),
+ };
+ let hits = hybrid.search(&q).unwrap();
+ assert!(!hits.is_empty());
+ assert_eq!(hits[0].score_kind, ScoreKind::Rrf);
+ }
+
+ #[test]
+ fn hybrid_search_with_trace_lexical_mode_passes_through_bm25() {
+ use kebab_core::{ScoreKind, SearchMode, SearchQuery};
+ use std::sync::Arc;
+
+ struct Stub {
+ hits: Vec<SearchHit>,
+ }
+ impl Retriever for Stub {
+ fn search(&self, _q: &SearchQuery) -> anyhow::Result<Vec<SearchHit>> {
+ Ok(self.hits.clone())
+ }
+ fn index_version(&self) -> kebab_core::IndexVersion {
+ kebab_core::IndexVersion("v1".into())
+ }
+ }
+
+ // mk_hit defaults to Rrf; override per spec for this test.
+ let mut lex_hit = mk_hit("c1", 1, SearchMode::Lexical, 0.5);
+ lex_hit.score_kind = ScoreKind::Bm25;
+ let lex = Arc::new(Stub { hits: vec![lex_hit] });
+ let vec_r = Arc::new(Stub { hits: vec![] });
+ let hybrid = HybridRetriever::with_policy(
+ lex,
+ vec_r,
+ FusionPolicy::Rrf { k_rrf: 60 },
+ 2,
+ );
+ let q = SearchQuery {
+ text: "x".into(),
+ mode: SearchMode::Lexical,
+ k: 1,
+ filters: Default::default(),
+ };
+ let (hits, _trace) = hybrid.search_with_trace(&q).unwrap();
+ assert!(!hits.is_empty());
+ // search_with_trace mode=Lexical passes through underlying hits.
+ assert_eq!(hits[0].score_kind, ScoreKind::Bm25);
+ }
}
--
2.49.1
From 4440fa665932fa381a9f7191cd06bc6063af1517 Mon Sep 17 00:00:00 2001
From: th-kim0823
Date: Sun, 10 May 2026 18:08:29 +0900
Subject: [PATCH 7/9] fix(fb-38): add score_kind to remaining SearchHit
literals
Add missing score_kind field to SearchHit constructors in:
- kebab-tui/tests/search.rs::make_hit()
- kebab-eval/tests/metrics_and_compare.rs::hit()
- kebab-eval/src/metrics.rs::hit()
All test fixtures default to Rrf (hybrid mode), matching the field's
Default impl and the test semantics.
Co-Authored-By: Claude Opus 4.7 (1M context)
---
crates/kebab-eval/src/metrics.rs | 1 +
crates/kebab-eval/tests/metrics_and_compare.rs | 1 +
crates/kebab-tui/tests/search.rs | 1 +
3 files changed, 3 insertions(+)
diff --git a/crates/kebab-eval/src/metrics.rs b/crates/kebab-eval/src/metrics.rs
index 1528e23..dd1bf7d 100644
--- a/crates/kebab-eval/src/metrics.rs
+++ b/crates/kebab-eval/src/metrics.rs
@@ -448,6 +448,7 @@ mod tests {
// pin UNIX_EPOCH + stale=false so hits stay deterministic.
indexed_at: OffsetDateTime::UNIX_EPOCH,
stale: false,
+ score_kind: kebab_core::ScoreKind::Rrf,
}
}
diff --git a/crates/kebab-eval/tests/metrics_and_compare.rs b/crates/kebab-eval/tests/metrics_and_compare.rs
index 06cb78c..1e1b366 100644
--- a/crates/kebab-eval/tests/metrics_and_compare.rs
+++ b/crates/kebab-eval/tests/metrics_and_compare.rs
@@ -86,6 +86,7 @@ fn hit(rank: u32, chunk_id: &str, doc_id: &str) -> SearchHit {
// pin UNIX_EPOCH + stale=false so hits stay deterministic.
indexed_at: OffsetDateTime::UNIX_EPOCH,
stale: false,
+ score_kind: kebab_core::ScoreKind::Rrf,
}
}
diff --git a/crates/kebab-tui/tests/search.rs b/crates/kebab-tui/tests/search.rs
index 468ac2c..b3dd31b 100644
--- a/crates/kebab-tui/tests/search.rs
+++ b/crates/kebab-tui/tests/search.rs
@@ -55,6 +55,7 @@ fn make_hit(rank: u32, path: &str, snippet: &str, citation: Citation) -> SearchH
// staleness rendering covered in dedicated tests (Task 11).
indexed_at: time::OffsetDateTime::UNIX_EPOCH,
stale: false,
+ score_kind: kebab_core::ScoreKind::Rrf,
}
}
--
2.49.1
From 67aee9f480b78649bfbb5c781f0343abf1ba7ba5 Mon Sep 17 00:00:00 2001
From: th-kim0823
Date: Sun, 10 May 2026 18:12:14 +0900
Subject: [PATCH 8/9] test(cli): integration tests for score_kind on lexical
mode (fb-38)
---
.../kebab-cli/tests/wire_search_score_kind.rs | 50 +++++++++++++++++++
1 file changed, 50 insertions(+)
create mode 100644 crates/kebab-cli/tests/wire_search_score_kind.rs
diff --git a/crates/kebab-cli/tests/wire_search_score_kind.rs b/crates/kebab-cli/tests/wire_search_score_kind.rs
new file mode 100644
index 0000000..f90c177
--- /dev/null
+++ b/crates/kebab-cli/tests/wire_search_score_kind.rs
@@ -0,0 +1,50 @@
+//! p9-fb-38: integration tests for `search_hit.v1.score_kind`.
+
+mod common;
+
+use serde_json::Value;
+use std::fs;
+
+fn doc_with_term(workspace: &std::path::Path) {
+ fs::write(workspace.join("doc1.md"), "# Title\n\nrust async hello\n").unwrap();
+}
+
+#[test]
+fn lexical_mode_hits_carry_bm25_score_kind() {
+ let dir = tempfile::tempdir().unwrap();
+ let (cfg, workspace, _data) = common::write_config(dir.path(), 0);
+ doc_with_term(&workspace);
+ common::ingest(&cfg, &workspace);
+
+ let (stdout, _stderr) = common::run_search_with_args(
+ &cfg,
+ &["--mode", "lexical", "--json", "rust"],
+ );
+ let v: Value = serde_json::from_str(stdout.trim()).expect("valid JSON");
+ let hits = v["hits"].as_array().expect("hits array");
+ assert!(!hits.is_empty(), "expected at least 1 hit");
+ for h in hits {
+ assert_eq!(h["score_kind"], "bm25");
+ }
+}
+
+#[test]
+fn old_wire_reader_compat_score_kind_optional_field() {
+ // The wire schema marks `score_kind` as additive (not required).
+ // We can't easily simulate an old reader from inside Rust, but we
+ // can confirm the JSON includes the field — old readers that
+ // ignore unknown fields are unaffected. This test just ensures
+ // the field is always present in fb-38+ output.
+ let dir = tempfile::tempdir().unwrap();
+ let (cfg, workspace, _data) = common::write_config(dir.path(), 0);
+ doc_with_term(&workspace);
+ common::ingest(&cfg, &workspace);
+
+ let (stdout, _stderr) = common::run_search_with_args(
+ &cfg,
+ &["--mode", "lexical", "--json", "rust"],
+ );
+ let v: Value = serde_json::from_str(stdout.trim()).unwrap();
+ let hit = &v["hits"][0];
+ assert!(hit.get("score_kind").is_some(), "score_kind always emitted");
+}
--
2.49.1
From c864bd007f32f6cca4433dfb9a54ba12d320c62b Mon Sep 17 00:00:00 2001
From: th-kim0823
Date: Sun, 10 May 2026 18:21:55 +0900
Subject: [PATCH 9/9] docs(fb-38): wire schema + README + design + SKILL +
INDEX
---
README.md | 26 ++++++++++++++++++
.../2026-04-27-kebab-final-form-design.md | 27 +++++++++++++++++++
docs/wire-schema/v1/search_hit.schema.json | 5 ++++
integrations/claude-code/kebab/SKILL.md | 1 +
tasks/INDEX.md | 2 +-
tasks/p9/p9-fb-38-score-semantics.md | 7 +++--
6 files changed, 65 insertions(+), 3 deletions(-)
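
The docs in this patch recommend thresholding on the nested raw channel scores rather than on the top-level `score`. A minimal sketch of that advice follows, assuming a trimmed-down view of `search_hit.v1.retrieval`. The `Retrieval` struct, `passes_trust_threshold`, and both threshold parameters are hypothetical, and the numeric cut-offs in the usage note are placeholders, not recommended values.

```rust
/// Hypothetical, trimmed-down view of `search_hit.v1.retrieval`.
/// Only the two raw channel scores are modeled; each is null (None)
/// when the corresponding channel did not return the chunk.
pub struct Retrieval {
    pub lexical_score: Option<f64>, // raw BM25
    pub vector_score: Option<f64>,  // raw cosine similarity
}

/// Returns true when at least one raw channel score clears its own threshold.
/// The top-level `score` is deliberately not consulted: it is a ranking signal.
pub fn passes_trust_threshold(r: &Retrieval, min_bm25: f64, min_cosine: f64) -> bool {
    let lexical_ok = r.lexical_score.map_or(false, |s| s >= min_bm25);
    let vector_ok = r.vector_score.map_or(false, |s| s >= min_cosine);
    lexical_ok || vector_ok
}
```

For example, a lexical-only hit with `lexical_score = 4.2` passes `passes_trust_threshold(&hit, 3.0, 0.35)` and fails with `min_bm25 = 5.0`. Suitable thresholds depend on the corpus and embedding model and would need to be calibrated.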
diff --git a/README.md b/README.md
index 5dd9ef7..6ce9b76 100644
--- a/README.md
+++ b/README.md
@@ -89,6 +89,32 @@ kebab doctor
Global flags: `--readonly` (or `KEBAB_READONLY=1`) disables all write-path commands (`ingest` / `ingest-file` / `ingest-stdin` / `reset`) and exits with code 1. `--quiet` suppresses human-readable stderr such as progress bars and hints (exit code and stdout output are unchanged). `KEBAB_PROGRESS=plain` prints progress to stderr one plain-text line at a time, even in environments without a TTY (instead of a spinner).
+### Interpreting score (fb-38)
+
+`search_hit.v1.score` is a **ranking signal**, not a confidence. The `score_kind` field declares its meaning:
+
+| `score_kind` | Meaning | Range |
+|--------------|---------|-------|
+| `rrf` (hybrid) | RRF normalized | `[0, 1]`, ceiling = 1.0 (rank=1 in both channels) |
+| `bm25` (lexical) | raw BM25 | unbounded (≥ 0) |
+| `cosine` (vector) | cosine sim | `[-1, 1]` |
+
+#### RRF formula (hybrid mode)
+
+```
+raw RRF of chunk c = Σ_m 1 / (k_rrf + rank_m(c))
+
+where m ∈ {lexical, vector}, summed over the channels in which c appears; k_rrf = config.search.rrf_k (default 60).
+With rank=1 in both channels, raw RRF = 2 / (k_rrf + 1) ≈ 0.0328.
+
+normalize: rrf_score = raw_rrf / (2 / (k_rrf + 1))
+ → rrf_score ∈ [0, 1]. rank=1 in both channels → 1.0; present in one channel only → at most 0.5.
+```
+
+`rrf_score = 0.5` is the ceiling for a chunk that appears in only one channel (lexical or vector), reached at rank 1 there. It is not 50% confidence; it is an arithmetic ceiling of the RRF formula.
+
+If an agent needs a trust threshold, use the nested `retrieval.lexical_score` (raw BM25) / `retrieval.vector_score` (raw cosine) instead of the top-level `score`.
+
## Logical architecture
```mermaid
diff --git a/docs/superpowers/specs/2026-04-27-kebab-final-form-design.md b/docs/superpowers/specs/2026-04-27-kebab-final-form-design.md
index 5f5835f..90e6240 100644
--- a/docs/superpowers/specs/2026-04-27-kebab-final-form-design.md
+++ b/docs/superpowers/specs/2026-04-27-kebab-final-form-design.md
@@ -194,6 +194,7 @@ variant 별 해당 키만 채움. `path` 와 `uri` 는 항상 채움 (`uri` 는
"schema_version": "search_hit.v1",
"rank": 1,
"score": 0.82,
+ "score_kind": "rrf",
"chunk_id": "9b4a8c1e7d3f2a05",
"doc_id": "3f9a2c10ee4d6b78",
"doc_path": "notes/rust/kebab-architecture.md",
@@ -218,6 +219,32 @@ variant 별 해당 키만 채움. `path` 와 `uri` 는 항상 채움 (`uri` 는
`retrieval.method ∈ {lexical, vector, hybrid}`. In single-channel modes, the other channel's score/rank are null.
+#### Score scale (fb-38)
+
+`score_kind` ∈ {`rrf`, `bm25`, `cosine`} declares the meaning of the top-level `score`. It is a **ranking signal**, not a confidence.
+
+| `score_kind` | mode | Meaning | Range |
+|--------------|------|---------|-------|
+| `rrf` | hybrid | RRF normalized | `[0, 1]`, ceiling = 1.0 (rank=1 in both channels) |
+| `bm25` | lexical | raw BM25 | unbounded (≥ 0) |
+| `cosine` | vector | cosine similarity | `[-1, 1]` |
+
+RRF formula (hybrid mode):
+
+```text
+raw RRF of chunk c = Σ_m 1 / (k_rrf + rank_m(c))
+
+where m ∈ {lexical, vector}, summed over the channels in which c appears; k_rrf = config.search.rrf_k (default 60).
+With rank=1 in both channels, raw RRF = 2 / (k_rrf + 1) ≈ 0.0328.
+
+normalize: rrf_score = raw_rrf / (2 / (k_rrf + 1))
+ → rrf_score ∈ [0, 1]. rank=1 in both channels → 1.0; present in one channel only → at most 0.5.
+```
+
+`rrf_score = 0.5` is the ceiling for a chunk that appears in only one channel, reached at rank 1 there (an arithmetic ceiling, not 50% confidence). If an agent needs a trust threshold, use the nested `retrieval.lexical_score` (raw BM25) / `retrieval.vector_score` (raw cosine).
+
+`score_kind` is added to wire schema v1 as an **optional** field (additive, backwards-compatible). When it is missing, readers interpret it as the historical default `rrf`.
+
### 2.3 Answer
```json
diff --git a/docs/wire-schema/v1/search_hit.schema.json b/docs/wire-schema/v1/search_hit.schema.json
index 1083104..88256e1 100644
--- a/docs/wire-schema/v1/search_hit.schema.json
+++ b/docs/wire-schema/v1/search_hit.schema.json
@@ -24,6 +24,11 @@
"schema_version": { "const": "search_hit.v1" },
"rank": { "type": "integer", "minimum": 1 },
"score": { "type": "number" },
+ "score_kind": {
+ "type": "string",
+ "enum": ["rrf", "bm25", "cosine"],
+ "description": "p9-fb-38: kind of `score` value. `rrf` = RRF normalized [0,1] (hybrid mode); `bm25` = raw BM25 score (lexical-only); `cosine` = raw cosine similarity (vector-only). Optional: payloads from producers older than fb-38 omit this field, and readers can treat its absence as `rrf` (the historical default)."
+ },
"chunk_id": { "type": "string" },
"doc_id": { "type": "string" },
"doc_path": { "type": "string" },
diff --git a/integrations/claude-code/kebab/SKILL.md b/integrations/claude-code/kebab/SKILL.md
index f3571af..35dcd6d 100644
--- a/integrations/claude-code/kebab/SKILL.md
+++ b/integrations/claude-code/kebab/SKILL.md
@@ -55,6 +55,7 @@ Input:
- **`max_tokens` / `snippet_chars` / `cursor` (p9-fb-34)** — agent budget controls. Set `max_tokens` to cap result wire size (chars/4 estimate); set `cursor` to the previous response's `next_cursor` to fetch the next page.
- **p9-fb-36 filter inputs:** `tags` (string array — OR-within, AND across keys), `lang` (BCP-47 language code), `path_glob` (glob pattern matched against doc path), `trust_min` (`"primary"` | `"secondary"` | `"generated"` — includes that level and above), `media` (string array — IN-list of `"markdown"` | `"pdf"` | `"image"` | `"audio"` | `"other"`; alias `"md"` → `"markdown"`), `ingested_after` (RFC3339 UTC string), `doc_id` (exact doc UUID). AND combinator across keys. Invalid `ingested_after` or unknown `trust_min` → `error.v1.code = invalid_input`. Unknown `media` value → empty hits, no error.
- Output is `search_response.v1`: `{ hits: search_hit.v1[], next_cursor: string|null, truncated: bool }`. Iterate `response.hits[]` for individual hits. Key hit fields: `rank`, `score`, `doc_path`, `heading_path[]`, `section_label`, `snippet`, `citation` (line range / page), `chunk_id`.
+- **`hits[].score_kind` (p9-fb-38):** `"rrf"` (hybrid) / `"bm25"` (lexical) / `"cosine"` (vector). Declares the meaning of the top-level `score` — it is a **ranking signal**, not a confidence value. If you need a trust threshold, use `retrieval.lexical_score` (BM25 raw) / `retrieval.vector_score` (cosine raw) instead of the top-level `score`.
- Cite back to the user as `doc_path § heading_path[-1]` so they can open the source.
- When `truncated: true`, the budget loop modified the page (snippet shortening or k reduction). `next_cursor` is **independent** — non-null whenever more hits may be reachable. Caller may widen `max_tokens` (re-issue same query for fuller snippets / more hits per page) or follow `next_cursor` (advance through more hits) or both. Mismatched cursor (corpus_revision changed) returns `error.v1.code = stale_cursor` — re-issue the search to obtain a fresh one.
- **`trace: true` (p9-fb-37)** — debug aid. Response carries an extra `trace` block: `lexical[]` + `vector[]` (pre-fusion candidates), `rrf_inputs[]` (RRF union before final cut), and `timing` (`lexical_ms`, `vector_ms`, `fusion_ms`, `total_ms`). Trace bypasses the search cache (always cold). Use sparingly — it bloats the wire response and is for diagnosing "why did this hit / not hit", not normal retrieval.
diff --git a/tasks/INDEX.md b/tasks/INDEX.md
index 803acbc..c11d72c 100644
--- a/tasks/INDEX.md
+++ b/tasks/INDEX.md
@@ -128,7 +128,7 @@ P0~P5 are serial. P6~P9 can run in parallel after P5.
- [p9-fb-37 trace + stats](p9/p9-fb-37-trace-and-stats.md): ✅ merged (2026-05-10)
### 🎯 0.5.0: RAG quality (cascade included: V00X + reindex)
- - [p9-fb-38 score semantics](p9/p9-fb-38-score-semantics.md): ⏳ not implemented, brainstorm needed
+ - [p9-fb-38 score semantics](p9/p9-fb-38-score-semantics.md): ✅ merged (2026-05-10)
- [p9-fb-39 retrieval precision tuning](p9/p9-fb-39-retrieval-precision-tuning.md): ⏳ not implemented, brainstorm needed (embedding_version cascade)
- [p9-fb-40 fact-grounded answer](p9/p9-fb-40-fact-grounded-answer.md): ⏳ not implemented, brainstorm needed (prompt_template_version cascade)
diff --git a/tasks/p9/p9-fb-38-score-semantics.md b/tasks/p9/p9-fb-38-score-semantics.md
index 84bcad3..ad7a301 100644
--- a/tasks/p9/p9-fb-38-score-semantics.md
+++ b/tasks/p9/p9-fb-38-score-semantics.md
@@ -3,7 +3,7 @@ phase: P9
component: kebab-search + kebab-app + wire-schema
task_id: p9-fb-38
title: "Expose and document score semantics (RRF score ceiling / per-channel score separation)"
-status: open
+status: completed
target_version: 0.5.0
depends_on: []
unblocks: []
@@ -14,7 +14,10 @@ source_feedback: 사용자 도그푸딩 2026-05-06 — Claude Code 가 kebab CLI
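# p9-fb-38: Expose and document score semantics
-> ⏳ **Backlog only, not implemented.** This spec is a skeleton from dogfooding feedback. Before implementation starts, a design phase via [superpowers:brainstorming](../../docs/superpowers/) is required. Score field naming, the scope of the wire schema change, and the per-channel score exposure policy are to be settled after the brainstorm.
+> ✅ **Implemented.** This spec is frozen as of implementation time.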
+>
+> - Design: [`docs/superpowers/specs/2026-05-10-p9-fb-38-score-semantics-design.md`](../../docs/superpowers/specs/2026-05-10-p9-fb-38-score-semantics-design.md)
+> - Plan: [`docs/superpowers/plans/2026-05-10-p9-fb-38-score-semantics.md`](../../docs/superpowers/plans/2026-05-10-p9-fb-38-score-semantics.md)
## Symptoms / motivation
--
2.49.1