kebab

Author	SHA1	Message	Date
altair823	f9714aa5cb	docs(rename): kb → kebab: README, tasks/, docs/, design doc, report. Final commit. Bulk update of every `kb` occurrence in all .md files. - 19 crate names (`kb-core`, `kb-app`, …) → `kebab-` (including the Rust module path form `kb_` → `kebab_`). - Future components (`kb-tui`, `kb-desktop`, `kb-asr-whisper`, `kb-ocr`, `kb-mcp`, `kb-vlm`, `kb-rerank`, `kb-vision-ocr`, `kb-index`, `kb-smoke`, `kb-architecture`) → `kebab-` (so they use the same prefix when P6+ starts). - CLI command examples: `kb ingest` / `kb search` / `kb ask` / `kb init` / `kb doctor` / `kb inspect` / `kb list` / `kb eval` → `kebab <verb>`, in both fenced code blocks and inline backticks. - XDG paths, env vars, and the binary path (`target/release/kb` → `target/release/kebab`) synchronized. - All references unified across the design doc, initial report, SMOKE, HOTFIXES, phase epics, and task specs. - `git -c user.name=kb` in task-decomposition.md is kept as-is because it records author information from past git history (the author in the actual git history cannot be changed). - Also flips `status: planned` → `completed` in `tasks/phase-5-evaluation.md` (not yet updated after the P5-1 + P5-2 PRs merged). ## Verification - `grep -rEn "\bkb-[a-z]\|\bkb_[a-z]\|\.config/kb\b\|kb\.sqlite\|\bKB_[A-Z]" --include="*.md"` returns 0 hits (excluding the git author in task-decomposition.md). - All file path references resolve (every renamed file updated to its new path). 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-02 04:01:55 +00:00
altair823	ee1f2339dd	fix(p5-2): apply push-time review items: citation/refusal correctness + nits. Applies 4 should-fix items and 5 nits from two reviewers before push. ## should-fix - `citation_coverage`: stops an empty citations[] from leaking 1.0 through `Iterator::all` vacuous truth; changed to `!is_empty() && all(non-empty path)`. Also removes the dead `_store: &SqliteStore` argument from the signature (call sites + test helper cleaned up). - `refusal_correctness`: in a lexical-only run, entries with `answer == None` no longer increment the denominator (output is NaN/null); treating them as automatic failures distorted the metric's meaning. Adds new unit test `refusal_correctness_nan_for_non_rag_run`. - `groundedness`: goldens with `must_contain.is_empty() && forbidden.is_empty()` are excluded from the denominator, so unconfigured entries do not get a free 1.0. Adds new unit test `groundedness_skips_unconfigured_goldens`. - `kb-cli/Cargo.toml`: corrects a factual error in the rationale comment; kb-eval depends on kb-app, not the reverse. ## nits - Duplicate `KB_EVAL_GOLDEN` / `DEFAULT_GOLDEN_PATH`: consolidated as `pub(crate)` in `metrics::`, imported by `runner`. - `render_report_md`: `{:?}` on `ComparisonKind` → an explicit lowercase mapping function (`win`/`loss`/`draw`/`regression`), matching the JSON serialization convention. - `extract_chunker_version`: defensive comment about the risk of a silent `None == None` match. - `delta_null_when_either_nan` test: `let mut` suppression hack → cleaned up with struct update syntax. - Removes the `empty_store` test helper and the per-call `mem::forget(tmp)` dead code. ## Spec doc additions Adds 4 items to the deviations section of `tasks/p5/p5-2-metrics-compare.md`: - `kb-eval` crate-level `kb-app` dep: inherited from P5-1; the new module surface does not import it. - `citation_coverage` weakened resolver: waiting on `document_exists_by_path`. - `refusal_correctness`: NaN for non-RAG runs. - `groundedness`: skips goldens with no checks. ## Verification - `cargo test -p kb-eval` 35/35 (18 unit + 2 loader + 8 integration + 7 runner; 3 new unit tests). - `cargo clippy --workspace --all-targets -- -D warnings` clean. - `compare_report_snapshot_matches_fixture` passes unchanged; the new behavior does not affect the snapshot input (lexical-only, no must_contain, no should-refuse). 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-02 03:17:32 +00:00
altair823	d9a5b88d27	feat(p5-2): kb-eval metrics + compare: AggregateMetrics, CompareReport, kb eval CLI. Implements P5-2. On top of the stored eval_runs / eval_query_results: - `kb_eval::metrics`: computes hit@k / MRR / recall@k_doc / citation_coverage / groundedness / empty_result_rate / refusal_correctness. NaN metrics (zero denominator) serialize as JSON null. 4-decimal rounding plus an added Deserialize impl let aggregate_json round-trip. - `kb_eval::compare`: compares two runs → CompareReport (per-metric Δ + per-query Win/Loss/Draw/Regression). On chunker_version drift, falls back gracefully to doc-id matching (chunker_version_match: "fallback_doc"); refuses when the `strict` option is set. - `render_report_md`: human-readable Markdown (aggregates + Wins/Losses/Regressions tables). - Adds `SqliteStore::{load_eval_run, load_eval_query_results, update_eval_run_aggregate}` plus owned `EvalRunRecord` / `EvalQueryResultRecord`; the write-side borrow shape is unchanged. - `kb eval` CLI: `run` (delegates to P5-1), `aggregate <id>`, `compare <a> <b> [--strict-chunker-version] [--write-report]`. `--json` emits the raw CompareReport; the default output is Markdown. ## Spec deviations (intentional, documented) - Graceful matching is doc-id-only (chunker_version_match: "fallback_doc"); 50% span overlap is deferred to P6+ because keeping both sets of chunks after a chunker re-index is not practical. - Adds `*_with_config` helpers so integration tests can drive them with a TempDir Config. The no-arg forms delegate to Config::load(None). - The CLI wires kb-cli → kb-eval directly (avoiding a kb-app cycle). The DoD's "via kb-app" was meant to keep a single facade, but it would create a cycle. - Adds `AggregateMetrics: Deserialize` for the aggregate_json round-trip. ## Verification - `cargo test -p kb-eval` 30/30 (15 unit + 2 loader + 8 metrics+compare integration + 7 runner). 1 of the 8 integration tests is a snapshot (`compare-1.json`). - `cargo test -p kb-store-sqlite` 33/33. - `cargo clippy --workspace --all-targets -- -D warnings` clean. - No forbidden imports (`kb-source-fs\|kb-parse\|kb-normalize\|kb-chunk\| kb-store-vector\|kb-embed\|kb-search\|kb-llm\|kb-rag\|kb-tui\|kb-desktop\| kb-app`; kb-app is absent from the metrics/compare modules and only runner uses it). 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-02 03:05:13 +00:00
altair823	58a11cc2b8	feat(p5-1): kb-eval crate — golden-fixture runner + eval persistence - new kb-eval crate: load_golden_set (YAML) + run_eval (per-query search/ask + persistence) - new kb-store-sqlite::eval module: record_eval_run_with_results (transactional), document_exists / chunk_exists probes - fixtures/golden_queries.yaml: 5-entry KO+EN template - tests: 13 pass (loader: parse, dup-id, missing chunk_id; runner: elapsed, snapshot, error capture, JSONL, determinism, persistence, config_snapshot) - per_query.jsonl mirror written to runs_dir/<run_id>/ - temperature=0 + fixed seed → byte-identical per_query.jsonl (lexical) deviations from spec (documented in code): - run_id uses uuid::Uuid::now_v7().simple() (timestamp-ordered hex) instead of ULID — uuid already in workspace deps - load_golden_set_validated kept #[cfg(test)] pub(crate) — production inlines validate_against_db - snapshot fixture uses normalized projection (id/query/mode/first_hit) — full byte-determinism covered by separate test - index_version in config_snapshot left null (composed per call by kb-app, not config-level) deferred to follow-up: - App reuse across queries (currently rebuilds App per query) - expand_path hoist to kb-config (3 crate clones now) - --max-queries flag (deferred to P5-2 per updated spec) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-01 18:01:09 +00:00
kb	b999a12ab5	tasks: address PR #1 review - p3-3: SQLite-first/Lance-second + status marker (V003__embedding_status); drop "best-effort 2PC" misnomer - p4-3: replace print_stream FnMut closure with mpsc::Sender<String> (RagPipeline stays Send+Sync) - p4-3: tighten citation regex to strict [#<n>] only — reject [n]/prose/code-block false positives - p5-2: compare_runs across chunker_version is graceful (doc + span overlap fallback) with chunker_version_match audit field; --strict-chunker-version restores refusal - p7-1: per-page text via lopdf (pdf-extract has no per-page Rust API); use char count for spans - p8-1: explicit rubato (FftFixedIn) for 16 kHz mono resample; symphonia decode only - p9-5: drop cmd_read_pdf_page + pdfium native dep; cmd_read_file_bytes + frontend pdfjs; add traversal tests	2026-04-27 13:10:31 +00:00
kb	597a848af9	tasks: add P5 component specs (runner, metrics)	2026-04-27 12:04:06 +00:00
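
The `citation_coverage` fix in commit ee1f2339dd turns on a standard-library detail: `Iterator::all` returns `true` for an empty iterator, so a query with no citations would count as fully covered unless emptiness is checked first. The following is a minimal sketch of that guard, not the project's actual code; the `Citation` struct, the `is_covered` helper, and the shape of `citation_coverage` are hypothetical stand-ins, since the log does not show the real types.

```rust
// Hypothetical stand-in for a citation record; the real types are not shown in the log.
struct Citation {
    path: String,
}

/// True only when there is at least one citation and every citation has a non-empty path.
/// Without the `is_empty` guard, `all` on an empty slice would be vacuously true.
fn is_covered(citations: &[Citation]) -> bool {
    !citations.is_empty() && citations.iter().all(|c| !c.path.is_empty())
}

/// Fraction of queries whose citations are covered; `None` when there are no queries.
fn citation_coverage(per_query: &[Vec<Citation>]) -> Option<f64> {
    if per_query.is_empty() {
        return None;
    }
    let covered = per_query.iter().filter(|c| is_covered(c)).count();
    Some(covered as f64 / per_query.len() as f64)
}

fn main() {
    let per_query = vec![
        vec![Citation { path: "notes/a.md".to_string() }], // covered
        Vec::new(),                                        // no citations: not covered
        vec![Citation { path: String::new() }],            // empty path: not covered
    ];
    println!("{:?}", citation_coverage(&per_query));
}
```

Returning `Option<f64>` for the no-queries case is a design choice of this sketch; it mirrors the log's description of zero-denominator metrics being reported as null rather than as a number.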

6 Commits
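
Commit d9a5b88d27 describes metrics with a zero denominator being emitted as JSON null and values being rounded to 4 decimals. One way to model that without a NaN sentinel is an `Option<f64>`, sketched below under stated assumptions: the function names `ratio` and `to_json` are hypothetical, and the project itself presumably serializes through its own serde types rather than hand-formatting strings.

```rust
/// Computes numerator / denominator rounded to 4 decimals.
/// Returns `None` for a zero denominator (hypothetical helper, not the project's API).
fn ratio(numerator: u32, denominator: u32) -> Option<f64> {
    if denominator == 0 {
        return None;
    }
    let raw = f64::from(numerator) / f64::from(denominator);
    Some((raw * 10_000.0).round() / 10_000.0)
}

/// Renders a metric as a JSON scalar: a number, or `null` when undefined.
fn to_json(metric: Option<f64>) -> String {
    match metric {
        Some(value) => format!("{value}"),
        None => "null".to_string(),
    }
}

fn main() {
    // A run with no applicable queries yields an undefined metric.
    println!("refusal_correctness: {}", to_json(ratio(0, 0)));
    // A run with 2 hits out of 3 queries yields a rounded value.
    println!("hit_at_k: {}", to_json(ratio(2, 3)));
}
```

Keeping "undefined" distinct from 0.0 matters for the comparison step: a delta between two runs is only meaningful when both sides are defined.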