PR-2 of fb-41 multi-hop RAG. Decompose + retrieve + synthesize 3-stage
pipeline가 `opts.multi_hop=true` 일 때 dispatch. Dynamic decide loop
는 PR-3.
- `AskOpts.multi_hop: bool` 필드 추가 + `impl Default for AskOpts`
도입 (HOTFIXES 2026-05-07 의 known limitation 해소). 9 explicit
init site 모두 `multi_hop: false` 추가 — Default 도입으로 향후
`..Default::default()` 점진 migrate 가능.
- `RagPipeline::ask` 의 entry 에 dispatcher 한 줄
(`if opts.multi_hop { return self.ask_multi_hop(...) }`).
- `RagPipeline::ask_multi_hop` 신규 method. 1) decompose LLM call
→ JSON array of strings parse, 2) 각 sub-query 로 retrieve +
chunk_id dedup pool, 3) score gate / no-chunks 가드, 4)
pack_context (single-pass 와 helper 공유), 5) synthesize LLM
call w/ MULTI_HOP_SYNTHESIZE_SYSTEM_PROMPT, 6) citation extract
+ Answer build. `prompt_template_version` = "rag-multi-hop-v1"
로 stamp — eval `compare` 가 single-pass vs multi-hop 분리.
- Prompt const 신규: MULTI_HOP_DECOMPOSE_SYSTEM_PROMPT +
MULTI_HOP_DECOMPOSE_USER_TEMPLATE + MULTI_HOP_SYNTHESIZE_SYSTEM_PROMPT
+ PROMPT_TEMPLATE_VERSION_MULTI_HOP + MULTI_HOP_MAX_SUB_QUERIES_DEFAULT.
- `kebab_core::RefusalReason::MultiHopDecomposeFailed` variant 신규.
Cascade: kebab-store-sqlite `refusal_reason_label` + kebab-tui `ask
refusal render` exhaustive match 갱신.
- `parse_decompose_response` + `strip_markdown_json_fence` helper —
markdown code fence (```json / ```) strip + JSON array of strings
parse + trim + drop empty + cap at MULTI_HOP_MAX_SUB_QUERIES_DEFAULT.
None 반환 시 caller 가 `MultiHopDecomposeFailed` refusal.
Tests (55 passing total, 8 신규):
- 6 unit (parse_decompose_response 의 bare array / fence variants /
garbage / cap / trim 회귀 핀).
- 2 integration: `ask_multi_hop_dispatches_and_decompose_garbage_refuses`
(decompose garbage → MultiHopDecomposeFailed + 정확히 1 LLM call) +
`ask_with_multi_hop_false_keeps_single_pass_path` (회귀 핀, 기존
caller 자동 backwards-compat).
Happy-path multi-hop (decompose 성공 → synthesize) 의 integration
test 는 ScriptedLm helper 가 PR-3 의 decide loop 와 함께 도입될
때 같이 추가. 현 `MockLanguageModel` 는 canned single response 라
2-LLM-call sequence 핀 불가.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
PR #159 worker-1 review 의 LOW 가독성 nit 반영 — CLI stderr [hint]
line + --json hint shape 통합 test 가 없었음.
- search_plain_emits_short_query_hint_to_stderr — 빈 KB + 2자 query
→ stderr 가 "[hint]" + "3자 이상" 포함 확인.
- search_json_emits_hint_field_for_short_query — 동일 입력 --json
→ search_response.v1.hint 필드 set + 표준 advisory 문자열 정합.
- search_json_omits_hint_field_when_query_is_long_enough — 3자
query → hint 필드 absent (additive serializer 의 None 제외 동작).
wire_search_response 5 → 8 PASS.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
trigram tokenizer 가 snippet 단위 + 단어 경계 + BM25 raw score 분포를
모두 바꿔서 unicode61 assumption 기반의 3 test 가 regression.
- wire_search_response::search_json_truncates_with_max_tokens +
search_plain_emits_truncated_hint_to_stderr: 단일 doc + 작은
max_tokens 로는 snippet 이 짧아서 budget loop 가 trip 안 함.
다중 doc fixture (5 doc) + budget 30 token 으로 hit-pop 경로
통해 truncated=true 보장.
- fetch_integration::fetch_chunk_with_context_returns_neighbors:
fixture body 의 2-char tokens (A1/A3 등) 가 trigram 비호환으로
0-hit. apples/banana/cherry/durian/elder 5-char unique words
로 갱신, query 도 cherry 로 deterministic pin.
- eval/runner::runner_per_query_snapshot_matches_fixture: trigram
token stream 으로 BM25 raw score 변동. UPDATE_SNAPSHOTS=1 로
regenerate.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
PR #148 auto-purges only filesystem-missing files (conservative — leaves
on-disk-but-out-of-scope docs alone for data safety). This is the explicit
complement: when the user has narrowed include / widened exclude / removed
a sub-directory from the workspace and WANTS the stored docs reconciled,
they invoke 'kebab reset --orphans-only'.
Confirm prompt with orphan count + sample paths; --yes required in
non-TTY. SQLite purge via existing purge_deleted_workspace_path (PR #148)
+ vector store delete_by_chunk_ids when configured. No fs existence
check — orphans-only is the explicit 'I know what I'm doing' variant.
dogfood follow-up to PR #148 (file deletion auto-purge).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Files deleted from disk (rm a.md) were leaving stale documents + chunks +
embeddings in the store, surfacing as ghost citations in search/ask.
Existing purge_orphan_at_workspace_path only handled content-changed
stale (WHERE workspace_path=? AND asset_id != ?) — file deletion has no
new asset_id.
Fix: post-walker-scan sweep. Compute (stored_paths - scanned_paths),
for each candidate check filesystem existence — only purge when the
file is TRULY missing. Scope-narrowing case (file on disk but outside
include glob) is explicitly NOT purged to protect users from accidental
data loss via config edits.
Adds:
- DocumentStore::all_workspace_paths trait method + SqliteStore impl
- purge_deleted_workspace_path in store-sqlite (returns chunk_ids for
vector delete; deletes doc CASCADE + asset row + copied storage file)
- sweep_deleted_files in kebab-app::ingest path; called once per ingest
before the per-asset loop
- IngestReport.purged_deleted_files counter (additive, serde default)
- CLI ingest summary mentions purge count when > 0
- 2 integration tests: file_deletion_auto_purge + include_scope_narrowing_does_NOT_purge
dogfood discovery (PR #142 1B + multi-root: kebab-docs + httpx + zod
+ lodash). Per user decision: only filesystem deletion auto-purges;
scope narrowing requires explicit kebab reset.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Task 16's new code_lang_breakdown / repo_breakdown fields broke the existing schema_wrapper_tags_schema_version test in wire.rs which constructs Stats { ... } literally. Use ..Default::default() since Stats now derives Default.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Task 13: add wire regression tests proving markdown SearchHit omits
repo/code_lang when None, and all 5 original Citation variants serialize
byte-identically without spurious Code-variant keys.
Task 15: add --repo (repeatable) and --code-lang (repeatable,
comma-separated) flags to `kebab search`; propagate both into
SearchFilters instead of the previous vec![] stub. Add
#[allow(clippy::large_enum_variant)] — Cmd is short-lived, boxing buys
nothing.
Task 16: add code_lang_breakdown and repo_breakdown BTreeMap fields to
Stats (schema.v1); derive Default on Stats; populate both as empty in
collect_stats (1A-2 fills them when code chunks land). Add unit test
asserting both keys are present in the serialized object.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds five new u32 counters (skipped_gitignore, skipped_kebabignore,
skipped_builtin_blacklist, skipped_generated, skipped_size_exceeded)
and a SkipExamples struct (≤5 sample paths per category) to
IngestReport. All new fields are #[serde(default)] for backward-compat
deserialization. Downstream literal construction sites patched with
zeros/empty; snapshot re-baked.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Wire two new optional fields onto SearchHit (skip_serializing_if = None)
and two Vec<String> filter fields onto SearchFilters (serde default).
Add RetrievalDetail::Default impl (manual, uses SearchMode::Hybrid as
sentinel). Patch all downstream SearchHit / SearchFilters literal
constructors with repo: None / code_lang: None / vec![] as appropriate.
Also covers Citation::Code arm in kebab-eval metrics match.
Also fixes App::search_with_opts trace branch to use NoopRetriever
for SearchMode::Lexical, removing the embeddings requirement when
the user only wants lexical-mode trace.
Adds the `trace: Option<SearchTrace>` field to `SearchResponse` and
threads `SearchOpts.trace` through `App::search_with_opts`. When the
caller sets `opts.trace = true` the path bypasses the LRU search cache
and runs through `HybridRetriever::search_with_trace`, which dispatches
all 3 SearchModes internally; this means `--trace` requires embeddings
(same constraint as `--mode hybrid`). The non-trace path keeps its
exact prior behavior with `trace: None` stamped on the response.
Picked up Task 1 / Task 3 follow-ups in the same commit so the
workspace compiles: SearchOpts struct-literals in kebab-cli/main.rs +
kebab-mcp/tools/search.rs default the new `trace` field to false, and
the schema-wrapper test in kebab-cli/wire.rs fills the new
media_breakdown / lang_breakdown / index_bytes / stale_doc_count fields
on Stats with `Default::default()`.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- ingested_after: convert OffsetDateTime to UTC before formatting
so non-Z offsets compare correctly against UTC TEXT storage
(lexical.rs + filters.rs)
- README: --tag is repeatable-only, not csv (only --media is csv)
- test(cli): add multi-value --tag OR-within IN-list coverage
- test(store): add UTC-offset regression test for ingested_after
- mcp: use ERROR_V1_ID const instead of hardcoded "error.v1"
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
7 new flags: --tag (repeatable), --lang, --path-glob,
--trust-min (value_enum), --media (csv with `md` alias),
--ingested-after (RFC3339; config_invalid on parse fail),
--doc-id. Dispatch translates clap values into SearchFilters
and propagates structured errors through the existing
StructuredError wrapper from fb-34.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
cli_mcp_initialize_then_tools_list asserts the exact tools[]
count returned by tools/list. fb-35 added kebab__fetch as the
7th tool — bump the assertion accordingly.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
JSON output is fetch_result.v1; plain output is human-friendly
labeled sections (chunk: before / target / after; doc/span: full
text + stderr truncated hint).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- error_wire: StructuredError wrapper preserves ErrorV1 through
anyhow → classify pipeline. Adds downcast short-circuit so
cursor::decode's typed code = "stale_cursor" reaches the wire
instead of being string-formatted to code = "generic".
- app: search_with_opts now wraps cursor::decode error in
StructuredError instead of anyhow! string format.
- test: error_wire pins both negative (bare anyhow → not
stale_cursor) AND positive (StructuredError → stale_cursor)
invariants. CLI integration test runs end-to-end and asserts
error.v1.code on stderr.
- app: next_cursor only emitted on full-page (k-pop) path; drop
speculative emit on snippet-only truncation that would point at
a different page than the agent expected.
- cursor: differentiate malformed-base64 / malformed-payload /
revision-mismatch error messages; all keep code = stale_cursor.
- test: cursor_rejected fixture uses .expect() to fail loud on
cursor non-emission instead of silent skip.
- test: max_tokens=0 → 1-hit floor + truncated=true.
- docs: SKILL.md + schema description distinguish snippet-shrink
(widen) vs k-pop (paginate) truncated cases. HOTFIXES notes
--no-cache semantic shift (cached path + clear vs uncached path).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
JSON output wrapped in search_response.v1 (breaking — agent must
adapt). Plain output unchanged + [truncated; use --cursor X]
stderr hint when budget tripped.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- pipeline: refresh module docstring step 5 to reflect new cancel
semantics (RetrievalDone/Token/Final + LlmStreamAborted)
- wire schema: spell out refusal-path behavior in answer_event.v1
description (only retrieval_done emitted; no final)
- test: factual comment on relax_score_gate-using test corrected
- test: new Ollama-gated stream_score_gate_refusal_emits_only_retrieval_done
- test: new ask_emits_no_final_when_cancelled_mid_stream pinning
the no-Final invariant on cancel
- pipeline: large_enum_variant comment broadened to acknowledge
RetrievalDone.hits as the dominant per-emit cost
- HOTFIXES: log AskOpts.stream_sink internal API break per spec
contract policy
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three Ollama-gated integration tests covering:
- stderr lines parse as answer_event.v1 (retrieval_done first,
final last, all carry RFC3339 ts).
- stdout final line is answer.v1 (backwards compat).
- non-stream path (--json without --stream) unchanged.
- BrokenPipe stderr → child terminates cleanly via cancel
propagation through pipeline SendError.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Background-thread driver runs ask_with_config; main thread
drains the receiver, serializes each StreamEvent to ndjson on
stderr. BrokenPipe → drop receiver → pipeline SendError →
cancel + LlmStreamAborted refusal. Final stdout line is the
existing answer.v1 (ingest_progress.v1 backwards-compat
pattern).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Mirror of Task 9's search-output rendering: yellow [stale] on TTY,
plain text otherwise. JSON path inherits via serde on AnswerCitation.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Yellow when TTY, plain when not. JSON path inherits via serde
on the domain type; no CLI-side wire change needed there.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
set_draw_target switching broke cursor positioning: each hidden→stderr
restore caused indicatif to draw a fresh line instead of overwriting.
Root fix: call only set_position() in TTY AssetStarted (one draw per
file). Filename visible in non-TTY plain-line output.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
set_position() and set_message() each call update_and_draw()
independently, producing two scrollback lines per file in TTY mode.
Suppress the draw target before the two updates, restore to stderr,
then call tick() to emit exactly one frame.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
AssetStarted now advances position (idx-1) and sets message together.
AssetFinished no longer updates the bar — Completed handles final
cleanup via finish_and_clear. Result: one bar frame per file instead
of two, eliminating the scrollback duplicate-line artifact.
Add readonly/quiet fields to Cli, parse_bool_env for 1/true/yes/on support,
is_mutating guard that short-circuits with error.v1 on write-path commands,
and wire KEBAB_PROGRESS=plain through from_flags in the Ingest arm.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add `quiet: bool` to `Human` variant and expand `from_flags` to three
args (`json`, `quiet`, `plain_env`). Update `handle`/`handle_human`
accordingly; add four targeted unit tests (TDD).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
fb-31 added ingest_file + ingest_stdin MCP tools (Task 9) but the
spawn-based smoke test in cli_mcp_smoke.rs still asserted the fb-30
count of 4. Bump to 6 to match the live tools/list response.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>