fix(rag): fb-41 PR-7 multi-hop pre-decompose score-gate (S7 hallucination regression pin) #174

altair823 · 2026-05-25T12:02:47Z

altair823 commented

2026-05-25 12:02:47 +00:00

Summary

Fixes a safety regression found while dogfooding fb-41 multi-hop RAG ahead of the v0.18.0 cut. It closes the path where kebab ask --multi-hop bypassed the score gate, so that a query outside the KB received a hallucinated answer.

Detailed dogfooding results: the 2026-05-25 fb-41 pre-v0.18 entry in tasks/HOTFIXES.md, plus /build/cache/dogfood-v018/results/SUMMARY.md (the dogfooding report).

Design: docs/superpowers/specs/2026-05-25-p9-fb-41-multi-hop-rag-design.md
Plan: docs/superpowers/plans/2026-05-25-p9-fb-41-multi-hop-rag.md (post-cut hotfix)

Problem (dogfooding S7)

Query: What is the chemical formula of caffeine? (a fact that is not in the KB).

- Single-pass kebab ask: retrieve top score < rag.score_gate (0.30) → refuse_score_gate → safe.
- Multi-hop kebab ask --multi-hop: grounded = true, with the body "카페인의 화학식은 C₉H₁₅N₃O 입니다 [#6]" ("The chemical formula of caffeine is C₉H₁₅N₃O [#6]"; a hallucination, the actual formula is C₈H₁₀N₄O₂), and [#6] wrongly cites the g_t = ∂L/∂θ_i body of an Adam optimizer chunk.

Cause: the score-gate check in ask_multi_hop looked only at the pool's top_score. The multi-hop pool is the union of the results of 5 sub-queries, so if a single sub-query's top score is above the gate, the pool passes even when the other chunks are unrelated to the original query → synth → hallucination.
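The bypass described above can be reproduced in a few lines. This is a minimal sketch, not the code in kebab-rag: the `Hit` type, `top_score`, and `pool_gate_passes` are hypothetical names, and the scores 0.12 and 0.41 are made up for illustration. Only the 0.30 gate comes from the PR.

```rust
/// Hypothetical retrieval hit; only the score matters for the gate.
#[derive(Debug, Clone)]
pub struct Hit {
    pub chunk_id: u32,
    pub score: f32,
}

/// Highest score in a set of hits, or `None` when the set is empty.
pub fn top_score(hits: &[Hit]) -> Option<f32> {
    hits.iter().map(|h| h.score).fold(None, |acc, s| match acc {
        Some(a) if a >= s => Some(a),
        _ => Some(s),
    })
}

/// Assumed pre-fix behaviour: the gate is checked against the union
/// of all sub-query results, not against the original query's hits.
pub fn pool_gate_passes(sub_query_hits: &[Vec<Hit>], gate: f32) -> bool {
    let pool: Vec<Hit> = sub_query_hits.iter().flatten().cloned().collect();
    top_score(&pool).map_or(false, |s| s >= gate)
}
```

With an original-query hit at 0.12 and one rephrased sub-query that scores the same chunk at 0.41, `pool_gate_passes` returns true for a 0.30 gate even though single-pass would have refused.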

Fix

Add a pre-decompose probe at the ask_multi_hop entry:

1. Retrieve once with the original query (0 LLM calls, ~ms).
2. probe empty → refuse_no_chunks(None) (no decompose).
3. probe top_score < gate → refuse_score_gate(None) (no decompose).
4. probe pass → the existing decompose / decide / synthesize flow, unchanged.

The multi-hop safety floor now matches single-pass exactly. Multi-hop adds cross-doc reasoning only when the original query is already within KB scope.
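The four steps above reduce to a three-way decision on the probe's scores. The sketch below assumes a hypothetical `ProbeOutcome` enum and `probe_gate` function; the real code returns through refuse_no_chunks / refuse_score_gate, whose signatures are not shown in this PR.

```rust
/// Hypothetical outcome of the pre-decompose probe.
#[derive(Debug, PartialEq)]
pub enum ProbeOutcome {
    /// Probe returned nothing: refuse without decomposing.
    RefuseNoChunks,
    /// Probe top score is below the gate: refuse without decomposing.
    RefuseScoreGate,
    /// Probe passed: continue with decompose / decide / synthesize.
    Proceed,
}

/// Applies the same floor as single-pass: gate on the scores retrieved
/// for the original query only, before any LLM call is made.
pub fn probe_gate(probe_scores: &[f32], gate: f32) -> ProbeOutcome {
    if probe_scores.is_empty() {
        return ProbeOutcome::RefuseNoChunks;
    }
    let top = probe_scores
        .iter()
        .copied()
        .fold(f32::NEG_INFINITY, f32::max);
    if top < gate {
        ProbeOutcome::RefuseScoreGate
    } else {
        ProbeOutcome::Proceed
    }
}
```

A top score equal to the gate proceeds, because the refusal condition in step 3 is strictly `top_score < gate`.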

Tests

3 new regression pins (crates/kebab-rag/tests/multi_hop.rs):

- multi_hop_below_probe_gate_refuses_before_any_llm_call: the direct S7 regression pin. Low-score chunk + empty LM script → panics immediately if the fix is reverted (ScriptedLm exhaustion).
- multi_hop_empty_probe_pool_refuses_before_any_llm_call: empty retrieve → NoChunks, 0 LM calls.
- multi_hop_above_probe_gate_proceeds_to_decompose: the full flow works when the probe passes.

The ScriptedRetriever in each of the 7 existing multi-hop tests gets a probe entry prepended, and each retriever_handle.calls() expectation goes up by 1. The two refuse-trace tests (*_preserves_hops_trace) are narrowed in meaning: they now verify only decompose-driven refusals, while probe-driven refusals are pinned by the new tests.
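The "empty LM script" pin works because a scripted test double panics when asked for a reply it was never given. The sketch below shows that pattern under stated assumptions: `ScriptedLmSketch` and `ask_multi_hop_sketch` are hypothetical stand-ins, not the ScriptedLm or ask_multi_hop in the crate, and the refusal reasons are plain strings for brevity.

```rust
use std::cell::{Cell, RefCell};
use std::collections::VecDeque;

/// Hypothetical scripted LM: replays canned replies and counts calls.
pub struct ScriptedLmSketch {
    replies: RefCell<VecDeque<String>>,
    calls: Cell<usize>,
}

impl ScriptedLmSketch {
    pub fn new(replies: &[&str]) -> Self {
        Self {
            replies: RefCell::new(replies.iter().map(|r| r.to_string()).collect()),
            calls: Cell::new(0),
        }
    }

    /// Panics when the script is exhausted, which is what turns an
    /// unexpected LLM call into an immediate test failure.
    pub fn complete(&self, _prompt: &str) -> String {
        self.calls.set(self.calls.get() + 1);
        self.replies
            .borrow_mut()
            .pop_front()
            .expect("scripted LM exhausted: unexpected LLM call")
    }

    pub fn calls(&self) -> usize {
        self.calls.get()
    }
}

/// Simplified flow: probe gate first, then two LLM calls.
pub fn ask_multi_hop_sketch(
    lm: &ScriptedLmSketch,
    probe_top: Option<f32>,
    gate: f32,
) -> Result<String, &'static str> {
    match probe_top {
        None => return Err("no_chunks"),
        Some(s) if s < gate => return Err("score_gate"),
        Some(_) => {}
    }
    let _plan = lm.complete("decompose");
    Ok(lm.complete("synthesize"))
}
```

If the probe check were removed, calling `ask_multi_hop_sketch` with an empty script would reach `complete` and panic, mirroring how the real pin is described as failing when the fix is reverted.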

Unchanged

- Wire schema (Answer.hops / refusal_reason enum): unchanged.
- Other dogfooding findings (synthesize citation consistency / latency / binary path): deferred to v0.18.1 or a separate PR (noted in the "other dogfooding findings" section of HOTFIXES).

Verification

- cargo test -p kebab-rag -j 1: 10 multi-hop (7 updated + 3 new) + 19 pipeline + 31 unit + 3 prompt_template + 3 streaming, all passing. No regressions.
- cargo clippy -p kebab-rag --all-targets -j 1 -- -D warnings: clean.

Test Plan

- [x] Direct S7 regression pin (low-score chunk + empty LM → score_gate refusal, 0 LM calls)
- [x] empty probe → NoChunks refusal, 0 LM calls
- [x] Full multi-hop flow works when the probe passes
- [x] 7 existing multi-hop tests updated (probe entry + calls expectation)
- [x] Meaning of the refuse-trace tests made explicit (decompose-driven only)
- [x] Dated HOTFIXES entry (dogfooding finding + fix + other findings)
- [x] clippy clean

Next: v0.18.0 cut

After PR-7 merges:

1. Workspace Cargo.toml version 0.17.2 → 0.18.0 (minor bump: surface expansion + new prompt_template_version = "rag-multi-hop-v1" + safety fix).
2. HANDOFF.md: one-line summary + an fb-41 entry in the post-merge decisions section.
3. INDEX.md: fb-41 status open → completed.
4. frozen design §3.8 multi-hop sub-section (optional; the release notes can cover it).
5. gitea-release v0.18.0 --auto-notes.

Assisted-by: Claude Code


altair823 added 1 commit 2026-05-25 12:02:49 +00:00

fix(rag): fb-41 PR-7: multi-hop pre-decompose score-gate (S7 hallucination regression pin) da25ce330b

Fixes a **safety regression** found while dogfooding fb-41 multi-hop RAG
ahead of the v0.18 cut. For detailed dogfooding results, see the
2026-05-25 fb-41 pre-v0.18 entry in `tasks/HOTFIXES.md` and
`/build/cache/dogfood-v018/results/SUMMARY.md`.

## Problem (S7)

Query: `What is the chemical formula of caffeine?` (a fact not in the KB).

- Single-pass `kebab ask`: the retrieve top score is below the default
  `rag.score_gate = 0.30` → `refuse_score_gate` → safe refusal.
- Multi-hop `kebab ask --multi-hop`: **`grounded = true`**, with the body
  `"카페인의 화학식은 C₉H₁₅N₃O 입니다 [#6]"` ("The chemical formula of
  caffeine is C₉H₁₅N₃O [#6]"; a hallucination, the actual formula is
  C₈H₁₀N₄O₂), and `[#6]` cites the `g_t = ∂L/∂θ_i` body of an Adam
  optimizer chunk (a match triggered by visually similar short
  structured tokens).

Cause: the score-gate check in `ask_multi_hop` looked only at the *pool's
top_score*. The multi-hop pool is the union of 5 sub-queries' results, so
if one sub-query's top score is above the gate, the pool passes the gate
even when the other chunks are unrelated to the original query, and synth
then lets the LLM hallucinate.

## Fix

Add a **pre-decompose probe** at the `ask_multi_hop` entry:

1. Retrieve once with the *original query* (0 LLM calls, ~ms).
2. probe empty → `refuse_no_chunks(None)` (no decompose, hops=None).
3. probe top_score < gate → `refuse_score_gate(None)` (no decompose).
4. probe pass → the existing decompose / decide / synthesize flow, unchanged.

The multi-hop safety floor now matches single-pass exactly: multi-hop
adds cross-doc reasoning only when *the original query is already within
KB scope*.

Cost: one retrieve (a few ms), no LLM call. Negligible next to
multi-hop's LLM-dominated latency.

## Tests

3 new regression pins (`crates/kebab-rag/tests/multi_hop.rs`):

- `multi_hop_below_probe_gate_refuses_before_any_llm_call`: the **direct
  S7 regression pin**. Low-score chunk + empty LM script → score_gate
  refusal, 0 LM calls, hops=None. Panics immediately if the fix is
  reverted.
- `multi_hop_empty_probe_pool_refuses_before_any_llm_call`: NoChunks
  refusal on an empty retrieve, 0 LM calls.
- `multi_hop_above_probe_gate_proceeds_to_decompose`: the full multi-hop
  flow works when the probe passes (decompose + decide + synth).

The `ScriptedRetriever` in each of the 7 existing multi-hop tests gets a
*probe-pass entry* prepended, and the `retriever_handle.calls()`
expectation goes up by 1. Tests that already had two entries, such as
test 2 / test 4, also get the prepend (3 entries).

The meaning of `multi_hop_refuse_no_chunks_preserves_hops_trace` /
`multi_hop_refuse_score_gate_preserves_hops_trace` is narrowed: they now
verify only *decompose-driven* refusals (after the probe passes, a
sub-query retrieve is empty or below the gate). A *probe-driven* refusal
has hops=None (no decompose), and the new tests pin that path.
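The split between the two refusal kinds can be pictured as follows. This is a sketch under assumptions: `RefusalSketch`, `HopSketch`, `RefusedSketch`, and `refusal_for` are hypothetical names, and the real `Answer.hops` shape is not reproduced here. The only property taken from the text is that probe-driven refusals carry hops=None while decompose-driven refusals keep the trace.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum RefusalSketch {
    NoChunks,
    ScoreGate,
}

/// Hypothetical per-sub-query trace entry.
#[derive(Debug, Clone, PartialEq)]
pub struct HopSketch {
    pub sub_query: String,
    pub top_score: Option<f32>,
}

#[derive(Debug, PartialEq)]
pub struct RefusedSketch {
    pub reason: RefusalSketch,
    pub hops: Option<Vec<HopSketch>>,
}

/// Returns `None` when the flow should continue to synthesis.
pub fn refusal_for(
    probe_top: Option<f32>,
    sub_queries: &[(&str, Option<f32>)],
    gate: f32,
) -> Option<RefusedSketch> {
    // Probe-driven: decompose never ran, so there is no trace.
    match probe_top {
        None => return Some(RefusedSketch { reason: RefusalSketch::NoChunks, hops: None }),
        Some(s) if s < gate => {
            return Some(RefusedSketch { reason: RefusalSketch::ScoreGate, hops: None })
        }
        Some(_) => {}
    }
    // Decompose-driven: record every sub-query before deciding.
    let hops: Vec<HopSketch> = sub_queries
        .iter()
        .map(|(q, top)| HopSketch { sub_query: q.to_string(), top_score: *top })
        .collect();
    let pool_top = hops
        .iter()
        .filter_map(|h| h.top_score)
        .fold(None, |acc: Option<f32>, s| Some(acc.map_or(s, |a| a.max(s))));
    let reason = match pool_top {
        None => RefusalSketch::NoChunks,
        Some(s) if s < gate => RefusalSketch::ScoreGate,
        Some(_) => return None,
    };
    Some(RefusedSketch { reason, hops: Some(hops) })
}
```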

## Verification

- `cargo test -p kebab-rag -j 1`: 10 multi-hop (7 updated + 3 new) + 19
  pipeline + 31 unit + 3 prompt_template + 3 streaming, all passing. No
  regressions.
- `cargo clippy -p kebab-rag --all-targets -j 1 -- -D warnings`: clean.
- Single-crate serial build (16 GB RAM constraint).

## Unchanged

- Wire schema: `Answer.hops` shape unchanged, `refusal_reason` enum unchanged.
- Other dogfooding findings (synthesize citation consistency, latency,
  binary path confusion) are left to v0.18.1 or a separate PR. They are
  listed in the "other dogfooding findings" section of HOTFIXES.

## Next

After PR-7 merges:
1. Workspace `Cargo.toml` version 0.17.2 → 0.18.0 (minor bump).
2. Update HANDOFF.md / INDEX.md + frozen design §3.8 multi-hop sub-section.
3. `gitea-release v0.18.0 --auto-notes`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

claude-reviewer-01 requested changes 2026-05-25 12:04:28 +00:00

Dismissed

claude-reviewer-01 left a comment

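Round 1: one code doc nit (informational, not blocking).

The fix scope of PR-7 itself (pre-decompose probe + gate, hops=None on refusal, 3 new + 7 updated tests, dated HOTFIXES entry) is very clean. The direct regression pin for the dogfood S7 hallucination is strong: reverting the fix panics immediately through ScriptedLm exhaustion. The narrowed meaning of tests 6/7 is also made clear in the docs ("probe-driven refusals have hops=None; only decompose-driven refusals preserve hops"). The HOTFIXES entry covers the dogfood results as well as the other P1 items.

The only nit: it would be cleaner to tell the code reader explicitly that probe_hits is intentionally thrown away after the gate check (it is not used as the pool's initial value). I suggest adding this to the inline comment.

Design and plan alignment, clippy, coverage of the new tests, and the updates to the existing tests all check out. Since this is a v0.18.0 cut blocker fix, it is worth merging quickly once the round 2 doc change lands.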


crates/kebab-rag/src/pipeline.rs

						
        @@ -622,0 +635,4 @@
        // above the gate.
        //
        // Fix: probe the original query exactly the way single-pass
        // `ask` would, before any decompose / decide LLM call. If

claude-reviewer-01 commented

2026-05-25 12:04:28 +00:00

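doc nit (actionable): the probe's retrieve result (probe_hits) is only checked against the gate and then thrown away; it is not used as the pool's initial value. This is an intentional simplification (it avoids regression risk), but a code reader may wonder: "why is the same query retrieved twice? (once for the probe, and once for decompose's first sub-query, which is usually the original query)".

Suggested doc addition: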

// Note: probe_hits are inspected for the gate decision only;
// the decompose-driven pool below builds from scratch. We could
// re-use probe_hits as the pool's initial seed (saving one
// retrieve when the decompose's first sub-query == original
// query), but that would change `context_chunks_added` semantics
// on the first HopRecord (currently "chunks from decompose-driven
// retrieve" — would become "probe + decompose"). Kept dropped
// for invariant clarity; revisit if the retrieve cost becomes
// the multi-hop bottleneck.

Or a shorter one-liner: // probe_hits are inspected for the gate decision only; the decompose-driven pool below builds from scratch, even if the first sub-query == original query.
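To make the trade-off in the suggested comment concrete, here is a small sketch of how seeding the pool with probe_hits could shift the first hop's count. It assumes, purely for illustration, that `context_chunks_added` counts chunks that are new to the pool; the real field may be computed differently. `add_to_pool` and `first_hop_added` are hypothetical helpers, and chunk ids are plain integers.

```rust
use std::collections::BTreeSet;

/// Adds hits to the pool and returns how many were not already present.
pub fn add_to_pool(pool: &mut BTreeSet<u32>, hits: &[u32]) -> usize {
    hits.iter().filter(|id| pool.insert(**id)).count()
}

/// Count reported for the first hop, with or without seeding the pool
/// from the probe's hits.
pub fn first_hop_added(
    probe_hits: &[u32],
    first_sub_query_hits: &[u32],
    seed_with_probe: bool,
) -> usize {
    let mut pool = BTreeSet::new();
    if seed_with_probe {
        add_to_pool(&mut pool, probe_hits);
    }
    add_to_pool(&mut pool, first_sub_query_hits)
}
```

When the first sub-query equals the original query and both retrieve chunks 1, 2, 3, the unseeded pool reports 3 chunks added on the first hop, while the seeded pool reports 0 under this counting assumption. That is one way the first HopRecord's meaning would drift if probe_hits were reused.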


altair823 referenced this issue from a commit

2026-05-25 12:04:57 +00:00

chore(rag): address PR #174 round 1 review

altair823 added 1 commit 2026-05-25 12:04:58 +00:00

chore(rag): address PR #174 round 1 review 85855ef596

Make explicit that `ask_multi_hop` throws probe_hits away after the gate
check: document in the code the *invariant clarity* rationale for not
reusing it as the pool's initial value, along with a hint to revisit
this if retrieve cost ever becomes the multi-hop bottleneck.

Verification
- `cargo test -p kebab-rag -j 1 --test multi_hop`: all 10 pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

claude-reviewer-01 approved these changes 2026-05-25 12:05:12 +00:00

claude-reviewer-01 left a comment

Round 2: the invariant clarity doc for probe_hits is fully addressed. It even includes the hint to revisit if retrieve cost becomes the multi-hop bottleneck, so a reader can clearly weigh the cost / benefit of a future optimization decision.

No further actionable items; I agree to merge.

Summary:

- The fb-41 multi-hop score-gate bypass (dogfooding S7: hallucinated chemistry formula + misattributed citation) is blocked by a pre-decompose probe based on the original query. The multi-hop safety floor matches single-pass exactly.
- 3 new regression pins (multi_hop_below_probe_gate_refuses_before_any_llm_call pins S7 directly) + the probe entry cascade through the 7 existing multi-hop tests + the narrowed meaning of the two refuse-trace tests (decompose-driven only).
- Dated HOTFIXES entry covering the dogfood results, the fix, and the other P1 findings (synthesize citation consistency / latency / binary path).
- cargo test -p kebab-rag -j 1: all tests pass, and clippy is clean. No wire schema change.

The v0.18.0 cut blocker is resolved; the cut steps can proceed after merge.


altair823 merged commit 71fb2cbcb3 into main

2026-05-25 12:05:25 +00:00

altair823 deleted branch feat/fb-41-pr-7-multi-hop-score-gate-fix

2026-05-25 12:05:26 +00:00

altair823 referenced this issue from a commit

2026-05-25 12:05:27 +00:00

Merge pull request 'fix(rag): fb-41 PR-7 multi-hop pre-decompose score-gate (S7 hallucination regression pin)' (#174) from feat/fb-41-pr-7-multi-hop-score-gate-fix into main


Reference: altair823-org/kebab#174