Commit Graph

65 Commits

Author SHA1 Message Date
th-kim0823
126559ce7a fix(fb-40): update test fixtures for rag-v2 default 2026-05-10 19:15:15 +09:00
th-kim0823
67aee9f480 test(cli): integration tests for score_kind on lexical mode (fb-38) 2026-05-10 18:12:14 +09:00
th-kim0823
a40593590b docs(fb-37): wire schema + README + SMOKE + INDEX + SKILL 2026-05-10 14:13:47 +09:00
th-kim0823
f7e2072d66 test(cli): integration tests for --trace + schema breakdowns (fb-37)
Also fixes App::search_with_opts trace branch to use NoopRetriever
for SearchMode::Lexical, removing the embeddings requirement when
the user only wants lexical-mode trace.
2026-05-10 13:21:33 +09:00
th-kim0823
72c227af23 feat(cli): kebab search --trace flag + wire trace + pretty print (fb-37)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-10 13:08:48 +09:00
th-kim0823
69037c313a feat(app): SearchResponse.trace + opts.trace threading (fb-37)
Adds the `trace: Option<SearchTrace>` field to `SearchResponse` and
threads `SearchOpts.trace` through `App::search_with_opts`. When the
caller sets `opts.trace = true` the path bypasses the LRU search cache
and runs through `HybridRetriever::search_with_trace`, which dispatches
all 3 SearchModes internally; this means `--trace` requires embeddings
(same constraint as `--mode hybrid`). The non-trace path keeps its
exact prior behavior with `trace: None` stamped on the response.

Picked up Task 1 / Task 3 follow-ups in the same commit so the
workspace compiles: SearchOpts struct-literals in kebab-cli/main.rs +
kebab-mcp/tools/search.rs default the new `trace` field to false, and
the schema-wrapper test in kebab-cli/wire.rs fills the new
media_breakdown / lang_breakdown / index_bytes / stale_doc_count fields
on Stats with `Default::default()`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 13:01:18 +09:00
th-kim0823
84287d0ef6 fix(fb-36): address PR #127 round 1 review
- ingested_after: convert OffsetDateTime to UTC before formatting
  so non-Z offsets compare correctly against UTC TEXT storage
  (lexical.rs + filters.rs)
- README: --tag is repeatable-only, not csv (only --media is csv)
- test(cli): add multi-value --tag OR-within IN-list coverage
- test(store): add UTC-offset regression test for ingested_after
- mcp: use ERROR_V1_ID const instead of hardcoded "error.v1"

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 04:47:55 +09:00
th-kim0823
4e0379c04f test(cli): wire_search_filters — lexical-only integration tests (fb-36)
Cover: --doc-id scoping, --ingested-after validation error,
--media md alias, --tag repeatable + frontmatter parsing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 04:06:21 +09:00
th-kim0823
6a18847892 feat(cli): kebab search filter flags (fb-36)
7 new flags: --tag (repeatable), --lang, --path-glob,
--trust-min (value_enum), --media (csv with `md` alias),
--ingested-after (RFC3339; config_invalid on parse fail),
--doc-id. Dispatch translates clap values into SearchFilters
and propagates structured errors through the existing
StructuredError wrapper from fb-34.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 03:57:55 +09:00
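Note on 6a18847892: the `--media` flag is described as csv with an `md` alias. A minimal sketch of that parsing follows. The helper name `parse_media_csv` and the canonical name `markdown` are assumptions for illustration, not taken from the kebab source.

```rust
/// Hypothetical helper: split a comma-separated `--media` value into
/// normalized media names, expanding the `md` alias.
/// The canonical name `markdown` is an assumption of this sketch.
pub fn parse_media_csv(raw: &str) -> Vec<String> {
    raw.split(',')
        .map(str::trim)
        // Skip empty segments such as the one produced by a trailing comma.
        .filter(|s| !s.is_empty())
        .map(|s| match s.to_ascii_lowercase().as_str() {
            "md" => "markdown".to_string(),
            other => other.to_string(),
        })
        .collect()
}
```

Under these assumptions, `parse_media_csv("md, pdf")` yields `["markdown", "pdf"]`, and an empty string yields an empty list.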
th-kim0823
8d8f1c0294 test(cli): bump expected MCP tool count 6 → 7 for fb-35 fetch
cli_mcp_initialize_then_tools_list asserts the exact tools[]
count returned by tools/list. fb-35 added kebab__fetch as the
7th tool — bump the assertion accordingly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 00:20:59 +09:00
th-kim0823
beb40249a3 test(cli): wire_fetch — chunk/doc + chunk_not_found integration (fb-35)
3 lexical-only integration tests: chunk JSON shape, doc truncated
with --max-tokens, unknown chunk_id returns error.v1 with
code = chunk_not_found.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 00:06:14 +09:00
th-kim0823
0fffd69071 feat(cli): kebab fetch chunk / doc / span (fb-35)
JSON output is fetch_result.v1; plain output is human-friendly
labeled sections (chunk: before / target / after; doc/span: full
text + stderr truncated hint).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 00:01:56 +09:00
th-kim0823
f485608108 fix(fb-34): address PR #125 round 1 review
- error_wire: StructuredError wrapper preserves ErrorV1 through
  anyhow → classify pipeline. Adds downcast short-circuit so
  cursor::decode's typed code = "stale_cursor" reaches the wire
  instead of being string-formatted to code = "generic".
- app: search_with_opts now wraps cursor::decode error in
  StructuredError instead of anyhow! string format.
- test: error_wire pins both negative (bare anyhow → not
  stale_cursor) AND positive (StructuredError → stale_cursor)
  invariants. CLI integration test runs end-to-end and asserts
  error.v1.code on stderr.
- app: next_cursor only emitted on full-page (k-pop) path; drop
  speculative emit on snippet-only truncation that would point at
  a different page than the agent expected.
- cursor: differentiate malformed-base64 / malformed-payload /
  revision-mismatch error messages; all keep code = stale_cursor.
- test: cursor_rejected fixture uses .expect() to fail loud on
  cursor non-emission instead of silent skip.
- test: max_tokens=0 → 1-hit floor + truncated=true.
- docs: SKILL.md + schema description distinguish snippet-shrink
  (widen) vs k-pop (paginate) truncated cases. HOTFIXES notes
  --no-cache semantic shift (cached path + clear vs uncached path).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 20:49:27 +09:00
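Note on f485608108: the downcast short-circuit can be sketched with the standard library alone. The real code goes through anyhow; here `Box<dyn Error>` stands in for it, and the `ErrorV1` fields are reduced to `code` and `message` as an assumption.

```rust
use std::error::Error;
use std::fmt;

/// Simplified stand-in for the error.v1 wire record.
#[derive(Debug, Clone, PartialEq)]
pub struct ErrorV1 {
    pub code: String,
    pub message: String,
}

/// Wrapper that carries a fully formed ErrorV1 through a type-erased error.
#[derive(Debug)]
pub struct StructuredError(pub ErrorV1);

impl fmt::Display for StructuredError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}: {}", self.0.code, self.0.message)
    }
}

impl Error for StructuredError {}

/// Classify a type-erased error. The downcast runs first, so a typed code
/// is returned unchanged; everything else falls back to "generic".
pub fn classify(err: &(dyn Error + 'static)) -> ErrorV1 {
    if let Some(structured) = err.downcast_ref::<StructuredError>() {
        return structured.0.clone();
    }
    ErrorV1 {
        code: "generic".to_string(),
        message: err.to_string(),
    }
}
```

A bare string error that merely mentions `stale_cursor` still classifies as `generic`, which mirrors the negative invariant the commit says the tests pin.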
th-kim0823
603061fb86 test(cli): wire_search_response + budget integration (fb-34)
4 lexical-only tests covering search_response.v1 wrapper shape,
--max-tokens truncation, --cursor pagination, plain stderr hint.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 20:09:01 +09:00
th-kim0823
21220f6d39 feat(cli): kebab search --max-tokens / --snippet-chars / --cursor (fb-34)
JSON output wrapped in search_response.v1 (breaking — agent must
adapt). Plain output unchanged + [truncated; use --cursor X]
stderr hint when budget tripped.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 20:02:50 +09:00
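Note on 21220f6d39: a rough sketch of a `--max-tokens` budget that pops hits from the tail and keeps at least one hit, as the follow-up review describes (max_tokens=0 → 1-hit floor + truncated=true). The `Hit` type, the function names, and the 4-characters-per-token estimate are assumptions; the snippet-shrink step is left out.

```rust
/// Simplified search hit; only the snippet matters for this sketch.
#[derive(Debug, Clone, PartialEq)]
pub struct Hit {
    pub snippet: String,
}

/// Rough token estimate (assumption: about 4 characters per token).
fn estimate_tokens(hit: &Hit) -> usize {
    (hit.snippet.chars().count() + 3) / 4
}

/// Pop hits from the tail until the total fits `max_tokens`, but never
/// drop below one hit. Returns the kept hits and a `truncated` flag.
pub fn apply_budget(mut hits: Vec<Hit>, max_tokens: usize) -> (Vec<Hit>, bool) {
    let mut truncated = false;
    let mut total: usize = hits.iter().map(estimate_tokens).sum();
    while total > max_tokens && hits.len() > 1 {
        if let Some(dropped) = hits.pop() {
            total -= estimate_tokens(&dropped);
            truncated = true;
        }
    }
    // A single oversized hit is kept (the 1-hit floor) but still reported
    // as truncated so the caller knows the budget was exceeded.
    if total > max_tokens {
        truncated = true;
    }
    (hits, truncated)
}
```

Keeping one hit even at a zero budget means a caller always gets something to act on, and the `truncated` flag tells it to paginate or widen.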
th-kim0823
a082b78f8e fix(fb-33): address PR #124 round 1 review
- pipeline: refresh module docstring step 5 to reflect new cancel
  semantics (RetrievalDone/Token/Final + LlmStreamAborted)
- wire schema: spell out refusal-path behavior in answer_event.v1
  description (only retrieval_done emitted; no final)
- test: factual comment on relax_score_gate-using test corrected
- test: new Ollama-gated stream_score_gate_refusal_emits_only_retrieval_done
- test: new ask_emits_no_final_when_cancelled_mid_stream pinning
  the no-Final invariant on cancel
- pipeline: large_enum_variant comment broadened to acknowledge
  RetrievalDone.hits as the dominant per-emit cost
- HOTFIXES: log AskOpts.stream_sink internal API break per spec
  contract policy

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 15:46:04 +09:00
th-kim0823
39bf0de949 test(cli): wire_ask_stream — stderr ndjson + stdout final + BrokenPipe cancel (fb-33)
Three Ollama-gated integration tests covering:
- stderr lines parse as answer_event.v1 (retrieval_done first,
  final last, all carry RFC3339 ts).
- stdout final line is answer.v1 (backwards compat).
- non-stream path (--json without --stream) unchanged.
- BrokenPipe stderr → child terminates cleanly via cancel
  propagation through pipeline SendError.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 15:14:00 +09:00
th-kim0823
29629e6786 feat(cli): kebab ask --stream emits ndjson on stderr (fb-33)
Background-thread driver runs ask_with_config; main thread
drains the receiver, serializes each StreamEvent to ndjson on
stderr. BrokenPipe → drop receiver → pipeline SendError →
cancel + LlmStreamAborted refusal. Final stdout line is the
existing answer.v1 (ingest_progress.v1 backwards-compat
pattern).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 15:03:41 +09:00
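Note on 29629e6786: the cancel path (drop receiver → SendError → stop producing) can be illustrated with a std channel. This is a sketch under assumptions: the function name is hypothetical, strings stand in for StreamEvent, and a rendezvous channel is used only to make the example deterministic.

```rust
use std::sync::mpsc::sync_channel;
use std::thread;

/// Run a producer on a background thread and consume only `take` events
/// before dropping the receiver. Returns how many sends succeeded.
pub fn run_until_receiver_dropped(total: usize, take: usize) -> usize {
    // Rendezvous channel: each send completes only when the event is received.
    let (tx, rx) = sync_channel::<String>(0);

    let producer = thread::spawn(move || {
        let mut sent = 0;
        for i in 0..total {
            // A send error means the receiver is gone: treat it as a cancel.
            if tx.send(format!("token {i}")).is_err() {
                break;
            }
            sent += 1;
        }
        sent
    });

    // The consumer stands in for the main thread writing ndjson to stderr.
    for _ in 0..take {
        if rx.recv().is_err() {
            break;
        }
    }
    drop(rx); // simulates giving up after a BrokenPipe on stderr

    producer.join().expect("producer thread panicked")
}
```

The producer needs no separate cancel flag in this sketch; the failed send is the signal.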
th-kim0823
efc6b7ebb0 fix(fb-32): address PR #122 round 1 review
- config: rename env-silent-ignore test + add file-load negative test
  asserting ConfigInvalid for negative TOML stale_threshold_days
- rag: add 5 boundary unit tests pinning compute_stale mirror equivalence
- search: rewrite "Task 6" plan refs in lexical/vector to point at
  actual function names (mark_stale_in_place / RagPipeline::ask)
- cli: dedupe write_config / ingest / backdate_updated_at helpers
  from wire_search_stale + wire_ask_stale into tests/common/mod.rs
- tui: clarify inspect.rs uses same source-of-truth as SearchHit
- rag: PackedCitation.stale invariant doc comment
- HOTFIXES: log conscious decision on wire-schema required-field
  expansion (strict-validator concern)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 12:04:28 +09:00
th-kim0823
aeee7ed771 feat(cli): [stale] tag on plain ask citations (fb-32)
Mirror of Task 9's search-output rendering: yellow [stale] on TTY,
plain text otherwise. JSON path inherits via serde on AnswerCitation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 02:34:58 +09:00
th-kim0823
15cdc97cae feat(cli): [stale] tag on plain search output (fb-32)
Yellow when TTY, plain when not. JSON path inherits via serde
on the domain type; no CLI-side wire change needed there.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 02:24:54 +09:00
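Note on 15cdc97cae: a small sketch of the "yellow when TTY, plain when not" rule. The function names and the line layout (title followed by the tag) are hypothetical; only the tag text and the TTY distinction come from the commit message.

```rust
use std::io::IsTerminal;

/// Render the stale marker: ANSI yellow on a terminal, plain text otherwise.
pub fn stale_tag(is_tty: bool) -> &'static str {
    if is_tty {
        "\x1b[33m[stale]\x1b[0m"
    } else {
        "[stale]"
    }
}

/// Format one hit line, appending the marker only for stale hits.
pub fn render_hit_line(title: &str, stale: bool, is_tty: bool) -> String {
    if stale {
        format!("{title} {}", stale_tag(is_tty))
    } else {
        title.to_string()
    }
}

/// Convenience wrapper that detects the terminal from stdout.
pub fn render_hit_line_auto(title: &str, stale: bool) -> String {
    render_hit_line(title, stale, std::io::stdout().is_terminal())
}
```

Passing `is_tty` explicitly keeps the formatting testable without a real terminal.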
th-kim0823
3328760dca fix(progress): one draw per file — drop set_message in TTY AssetStarted
set_draw_target switching broke cursor positioning: each hidden→stderr
restore caused indicatif to draw a fresh line instead of overwriting.
Root fix: call only set_position() in TTY AssetStarted (one draw per
file). Filename visible in non-TTY plain-line output.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-07 22:28:37 +09:00
th-kim0823
5be90cffec fix(progress): eliminate duplicate TTY frame per asset
set_position() and set_message() each call update_and_draw()
independently, producing two scrollback lines per file in TTY mode.
Suppress the draw target before the two updates, restore to stderr,
then call tick() to emit exactly one frame.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-07 22:15:01 +09:00
th-kim0823
cb266e0071 fix(progress): eliminate duplicate bar frame per asset in TTY mode
AssetStarted now advances position (idx-1) and sets message together.
AssetFinished no longer updates the bar — Completed handles final
cleanup via finish_and_clear. Result: one bar frame per file instead
of two, eliminating the scrollback duplicate-line artifact.
2026-05-07 21:49:47 +09:00
th-kim0823
0e762e6374 fix: rename leftover kbkebab in main.rs comments 2026-05-07 20:52:34 +09:00
th-kim0823
b230fbb495 fix: apply review nits — kb→kebab comment, quiet reset guard, ingest-stdin readonly test, README+SMOKE docs
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-07 19:58:56 +09:00
th-kim0823
6bedba4a7f test(fb-26,fb-28): integration tests for readonly/quiet flags and KEBAB_PROGRESS=plain
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-07 19:43:04 +09:00
th-kim0823
fd4125c0a0 feat(fb-28): --readonly/--quiet global flags + KEBAB_READONLY env + is_mutating guard
Add readonly/quiet fields to Cli, parse_bool_env for 1/true/yes/on support,
is_mutating guard that short-circuits with error.v1 on write-path commands,
and wire KEBAB_PROGRESS=plain through from_flags in the Ingest arm.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-07 19:38:30 +09:00
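Note on fd4125c0a0: a minimal sketch of the two pieces named in the message, `parse_bool_env` and the `is_mutating` guard. The signatures, the `Command` subset, the case-insensitive matching, and the `"readonly"` code string are assumptions for illustration.

```rust
/// Interpret an environment-style value as a boolean.
/// Accepts 1/true/yes/on; case-insensitivity and whitespace trimming
/// are assumptions of this sketch.
pub fn parse_bool_env(value: Option<&str>) -> bool {
    match value {
        Some(v) => matches!(
            v.trim().to_ascii_lowercase().as_str(),
            "1" | "true" | "yes" | "on"
        ),
        None => false,
    }
}

/// Illustrative subset of subcommands (not the full CLI).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Command {
    Search,
    Ask,
    Ingest,
    IngestFile,
    IngestStdin,
}

/// Write-path commands are the ones a readonly run must refuse.
pub fn is_mutating(cmd: Command) -> bool {
    matches!(
        cmd,
        Command::Ingest | Command::IngestFile | Command::IngestStdin
    )
}

/// Returns an error code when readonly mode blocks the command.
/// The code string "readonly" is hypothetical.
pub fn guard(cmd: Command, readonly: bool) -> Result<(), &'static str> {
    if readonly && is_mutating(cmd) {
        Err("readonly")
    } else {
        Ok(())
    }
}
```

A caller could feed it the environment with `parse_bool_env(std::env::var("KEBAB_READONLY").ok().as_deref())`.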
th-kim0823
4191347491 fix(fb-26): Completed TTY missing summary + Aborted unconditional writeln + quiet suppression in handle_human
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-07 19:33:57 +09:00
th-kim0823
dd33902f5a feat(fb-26): extend ProgressMode with quiet field, update from_flags signature
Add `quiet: bool` to `Human` variant and expand `from_flags` to three
args (`json`, `quiet`, `plain_env`). Update `handle`/`handle_human`
accordingly; add four targeted unit tests (TDD).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-07 19:31:01 +09:00
th-kim0823
ccee30037d 🧪 test(kebab-cli): update cli_mcp_smoke tools/list assertion 4 → 6 (fb-31)
fb-31 added ingest_file + ingest_stdin MCP tools (Task 9) but the
spawn-based smoke test in cli_mcp_smoke.rs still asserted the fb-30
count of 4. Bump to 6 to match the live tools/list response.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 18:26:51 +09:00
th-kim0823
fbc01eda50 🧪 test(kebab-cli): cli_ingest_file + cli_ingest_stdin integration (fb-31)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 18:10:09 +09:00
th-kim0823
0386adcb5e feat(kebab-cli): kebab ingest-stdin subcommand (fb-31)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 18:07:35 +09:00
th-kim0823
9cc7deca11 feat(kebab-cli): kebab ingest-file subcommand (fb-31)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 18:06:25 +09:00
th-kim0823
366b647a1a feat(kebab-app): capability flag mcp_server: false → true (fb-30)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 16:12:23 +09:00
th-kim0823
4a30959fdd feat(kebab-cli): kebab mcp subcommand (fb-30)
Wires kebab_mcp::serve_stdio into kebab-cli. `--config <path>` honored
via the established Config::load pattern.

Updated serve_stdio signature to (Config, Option<PathBuf>) so the doctor
tool's path-aware behavior works correctly via KebabAppState.

Smoke test spawns the binary + sends initialize + initialized +
tools/list over stdin, asserts 4 tools returned. Confirms the MCP
server boots end-to-end via the real binary (rmcp 1.6 has no
in-memory test transport, so this is the only end-to-end assertion).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 16:10:17 +09:00
th-kim0823
bc16dbf12a 🚑 fix(kebab-cli): add schema_version field to wire.rs ErrorV1 test literal
Task 8 commit f9a1548 added `schema_version: String` as required field on
ErrorV1 (so kebab-mcp's direct serialize-then-emit path produces correct
error.v1 wire). The wire.rs ErrorV1 literal in the
error_wrapper_tags_schema_version_and_emits_code test was missed —
breaks kebab-cli build. Add the field to the test fixture.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 16:02:09 +09:00
th-kim0823
1f53930234 🏗️ refactor(kebab-app): promote error_classify → kebab-app::error_wire (fb-30 prep)
The new `kebab-mcp` crate in fb-30 uses the same classify module. Importing
between UI crates violates the facade rule, so the module is promoted to
kebab-app. The code from fb-27 commit c91228e is moved as-is (struct +
classify + classify_llm + 7 unit tests). The reqwest dev-dep moves with it.

kebab-cli changes one import-path line to `kebab_app::ErrorV1` /
`kebab_app::classify` and replaces one `&crate::error_classify::ErrorV1`
line in wire.rs. No behavior change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 15:13:28 +09:00
th-kim0823
1bcca7f9ca 🏗️ refactor(fb-27): apply round 1 review nits
- schema.rs: extract `SCHEMA_V1_ID` const + re-export via kebab-app::lib.rs.
  The 2 literals in wire.rs::wire_schema now import it too, for a single source of truth.
- schema.rs::collect_models: add a comment stating that parser_version
  surfaces markdown only (the PDF/image extractors' own versions will surface
  once SchemaV1.models evolves into a multi-medium map).
- main.rs::print_schema_text: remove the trailing `\n` from the header line + add `println!()`,
  consistent with the pattern used by the other sections.
- error_classify.rs::llm_unreachable_classifies: timeout 50ms → 500ms (10x
  headroom) + add comments on the approach and its limitations.
- HOTFIXES: document the gap between open_existing's RW flag and its
  comment-only enforcement under Known-limitation.

Round 1 review summary: #104 (comment)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 13:17:50 +09:00
th-kim0823
3725986af7 🧪 test(kebab-cli): integration coverage for kebab schema + error.v1 (fb-27)
cli_schema: exercises `kebab schema` (text + --json) on a fresh-but-init'd
KB. Pins schema_version, kebab_version non-empty, capabilities.json_mode
true, capabilities.mcp_server false (future placeholder).

cli_error_wire: spawns `kebab --json --config <malformed.toml> ingest`
and verifies stderr emits a single error.v1 ndjson line with
code == "config_invalid". Non-JSON mode regression-pinned to keep the
legacy `error:` prefix. Note: --config /nonexistent silently falls back
to defaults (by design); a file that exists but fails TOML parsing is
the reliable trigger for config_invalid.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 12:33:46 +09:00
th-kim0823
912c7aa07d feat(kebab-cli): emit error.v1 ndjson on stderr in --json mode (fb-27)
Wraps the existing `Err(e)` arm with a `cli.json` branch:
- `--json`: stderr ndjson `error.v1` via wire_error_v1
- non-`--json`: legacy `error: <msg>` text path (unchanged)

exit_code() unchanged — RefusalSignal/NoHitSignal/DoctorUnhealthy
still drive 1/1/3.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 12:28:41 +09:00
th-kim0823
4eb13c63ae feat(kebab-cli): kebab schema subcommand (fb-27)
Text mode: doctor-style key/value layout. JSON mode: schema.v1 wire
record. Honors `--config <path>` via the established
`kebab_app::schema_with_config(&cfg)` facade pattern (per the P3-5 /
P4-3 regression conventions).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 12:24:06 +09:00
th-kim0823
c91228e7d5 feat(kebab-cli): error_classify dispatcher + wire helpers (fb-27)
`error_classify::classify` maps anyhow::Error → ErrorV1 wire record by
downcasting to known typed errors (LlmError + ConfigInvalid + NotIndexed
re-exported from kebab_app::error_signal, plus std::io::Error chain).
Generic fallback emits `code: "generic"` with the chain in `details` when
verbose.

wire.rs adds wire_schema (idempotent re-tag, mirrors wire_doctor pattern
since SchemaV1 carries its own schema_version field) and wire_error_v1
(simple tag_object). Tests pin both wrappers + 7 classify code paths.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 12:11:54 +09:00
e4432a2388 review(p9-fb-25): apply round 1 nits: render_skipped_breakdown single source + NO_EXT_SENTINEL + count + deprecation wording
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 12:35:10 +00:00
44dee2c30f feat(kebab-cli, kebab-tui): p9-fb-25 task 6 — render skipped-by-extension breakdown
Append ": A docx, B txt, ..." after the N skipped count in both the
CLI ingest summary and TUI status_line terminal events (completed +
aborted). Breakdown is desc-sorted by count, ties broken by key
alphabetic; empty map produces no extra text.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 12:16:43 +00:00
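Note on 44dee2c30f: the ordering rule (count descending, ties by key alphabetically, empty map produces nothing) can be sketched as below. The name `render_skipped_breakdown` appears in a later commit; its signature and the map type used here are assumptions.

```rust
use std::collections::BTreeMap;

/// Render ": A docx, B txt" for the skipped-by-extension map.
/// Sorted by count descending, ties broken alphabetically by key;
/// an empty map yields an empty string.
pub fn render_skipped_breakdown(skipped: &BTreeMap<String, u64>) -> String {
    if skipped.is_empty() {
        return String::new();
    }
    let mut entries: Vec<(&String, &u64)> = skipped.iter().collect();
    entries.sort_by(|a, b| b.1.cmp(a.1).then_with(|| a.0.cmp(b.0)));
    let parts: Vec<String> = entries
        .iter()
        .map(|(ext, count)| format!("{count} {ext}"))
        .collect();
    format!(": {}", parts.join(", "))
}
```

The result is meant to be appended directly after the "N skipped" count, so the empty case adds no trailing punctuation.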
693f5582f0 feat(kebab-core, kebab-app): p9-fb-25 task 4 — IngestReport.skipped_by_extension + wire schema additive
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 12:06:34 +00:00
7f31721a47 refactor(kebab-cli, kebab-tui): p9-fb-25 task 2 — SourceScope via ..Default::default()
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 11:47:39 +00:00
b22c8cfd45 feat(kebab-config): p9-fb-25 task 1 — drop WorkspaceCfg.include + deprecation probe
- Remove `pub include: Vec<String>` from `WorkspaceCfg` struct (denylist-only model).
- Drop `include: vec!["**/*.md"]` from `Config::defaults()`.
- Add `from_file` deprecation probe: raw `toml::Value` scan fires a
  one-shot `tracing::warn!` (via `OnceLock`) when an old config still
  carries `workspace.include = [...]`. serde ignores the unknown field
  cleanly (no `deny_unknown_fields`).
- Compile-fix `kebab-cli` (main.rs:329) and `kebab-tui`
  (ingest_progress.rs:39): replace `cfg.workspace.include.clone()` with
  `Vec::new()` (Task 2 will switch to `..Default::default()`).
- Two new tests: `legacy_include_field_is_ignored_silently` (backward
  compat round-trip) + `workspace_cfg_has_only_root_and_exclude_fields`
  (exhaustive destructure — compile-time guard against re-introduction).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 11:44:35 +00:00
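Note on b22c8cfd45: the one-shot warning via `OnceLock` can be sketched as follows. The TOML scan is reduced to a boolean argument, `eprintln!` stands in for `tracing::warn!`, and the function name and message text are hypothetical.

```rust
use std::sync::OnceLock;

/// Process-wide latch so the deprecation notice is emitted at most once.
static LEGACY_INCLUDE_WARNED: OnceLock<()> = OnceLock::new();

/// Call after scanning the raw config. Returns true only on the call
/// that actually emitted the warning.
pub fn warn_legacy_include_once(has_legacy_include: bool) -> bool {
    if !has_legacy_include {
        return false;
    }
    // `set` succeeds for exactly one caller, even across threads.
    if LEGACY_INCLUDE_WARNED.set(()).is_ok() {
        eprintln!("warning: `workspace.include` is deprecated and ignored");
        true
    } else {
        false
    }
}
```

Because the latch is a static, repeated config loads in the same process stay quiet after the first notice.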
06aaae4eb8 feat(kebab-cli): p9-fb-23 task 8 — --force-reingest flag
Adds `--force-reingest` to the `ingest` subcommand and wires it
through `IngestOpts` into `ingest_with_config_opts`, bypassing the
per-asset early-skip path when set.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 18:15:35 +00:00
0684b3ad66 review(p9-fb-23-task1): fix missed IngestReport construction sites + snapshot
reviewer-flagged: aa2a6ea claimed build clean but missed:
- crates/kebab-store-sqlite/tests/ingest_report_snapshot.rs (test fixture)
- crates/kebab-cli/src/wire.rs (test fixture)
- crates/kebab-store-sqlite/snapshots/ingest_report.snapshot.json (snapshot)

All three add `unchanged: 0` (or `\"unchanged\": 0`) to match the new
IngestReport.unchanged field. cargo clippy --workspace --all-targets
-- -D warnings now clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 17:47:13 +00:00