Commit Graph

405 Commits

Author SHA1 Message Date
893287a5a3 fix(config + tilde): LLM default → gemma4:e4b + consistent workspace.root ~ expansion
User decision during dogfooding (2026-05-02): standardize the default text
LLM on the gemma4 family. The OCR/caption adapters (P6-2/P6-3) already use
gemma4:e4b, so pulling a single family makes both ingest and ask work.

~ expansion inconsistencies found along the way:
- kebab-source-fs::connector uses expand_tilde (walk works correctly)
- kebab-app::ingest_one_image_asset / ingest_one_pdf_asset call
  PathBuf::from directly → ~ is not expanded → ~/KnowledgeBase is passed
  to ExtractContext as-is
- the editor jump in kebab-tui::search::handle_key_search has the same
  problem → spawns the editor on a meaningless path

Fix:
- Config::defaults().models.llm.model = "gemma4:e4b". Added a comment on
  unifying with the OCR/caption family.
- Both the image and pdf branches in kebab-app now call expand_tilde.
- The kebab-tui::search jump uses kebab_config::expand_path(.., "")
  (expand_path is the canonical helper that handles ~ / ${XDG_DATA_HOME} /
  {data_dir}).

Caveat: kebab-app::expand_tilde and kebab-config::expand_path are defined
separately. Unifying them is a P+ task.
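
As a rough illustration of the gap, the sketch below contrasts a plain
`PathBuf::from` with a tilde-aware helper. The signature is hypothetical
(the real `expand_tilde` is not shown in this log), and the home directory
is passed in explicitly so the example stays self-contained.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: expands a leading `~` or `~/` against `home`.
/// Other forms (for example `~user/...`) are returned unchanged.
fn expand_tilde(raw: &str, home: &Path) -> PathBuf {
    if raw == "~" {
        return home.to_path_buf();
    }
    match raw.strip_prefix("~/") {
        Some(rest) => home.join(rest),
        // No tilde prefix: same result as a bare PathBuf::from.
        None => PathBuf::from(raw),
    }
}
```

Calling `PathBuf::from("~/KnowledgeBase")` directly keeps the literal `~`
component, which is the behavior the fix above removes.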

Docs (sync rule):
- README prerequisites section: gemma4:e4b as the default + a note on
  overriding with a larger variant.
- docs/ARCHITECTURE key-decisions table: LLM default qwen2.5:7b-instruct →
  gemma4:e4b.
- docs/SMOKE: ollama pull example + KEBAB_MODELS_LLM_MODEL env example
  qwen2.5:32b → gemma4:26b.
- HOTFIXES: new entry ("Config defaults: LLM = gemma4:e4b + workspace.root
  tilde expansion").
- Memory: added project_llm_default.md and an index entry in MEMORY.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 16:34:24 +00:00
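8c6d29cc2d review(p9-4): address round-1 review feedback
The count line of the blocks / embeddings sections was pushed *outside* the
collapse check, so in the collapsed state only part of the section
disappeared. Fix: render the count inline in the section header
(`▾ blocks (N)`, `▾ embeddings (N)`) and keep only the body inside the
collapse check. A new helper, `push_section_header_with_count`, is shared
by both sections.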
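
A minimal sketch of the idea, assuming lines are plain strings rather than
ratatui `Line`s. Only the helper name comes from this commit; its signature
and `render_section` are hypothetical.

```rust
/// Hypothetical stand-in for `push_section_header_with_count`: the count is
/// part of the header, so it stays visible whether or not the body is shown.
fn push_section_header_with_count(
    lines: &mut Vec<String>,
    title: &str,
    count: usize,
    collapsed: bool,
) {
    let marker = if collapsed { '▸' } else { '▾' };
    lines.push(format!("{} {} ({})", marker, title, count));
}

/// Hypothetical caller: only the body sits inside the collapse check.
fn render_section(title: &str, body: &[&str], collapsed: bool) -> Vec<String> {
    let mut lines = Vec::new();
    push_section_header_with_count(&mut lines, title, body.len(), collapsed);
    if !collapsed {
        lines.extend(body.iter().map(|row| format!("  {}", row)));
    }
    lines
}
```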

Regression tests strengthened:
- doc_view_collapse_hides_section_body: verifies that the collapsed state
  shows the inline "blocks (2)" count and hides the "Heading L1" body.
- chunk_view_renders_text_and_block_ids: verifies the inline
  "embeddings (2)" count.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 15:44:14 +00:00
b6e0ab352f feat(kebab-tui): P9-4 Inspect pane — doc/chunk detail with collapsible sections
Enter in Library / 'i' in Search opens Inspect. A single Doc or Chunk view
shows six collapsible sections: metadata / provenance / blocks (doc) or
spans / text / embeddings (chunk). Esc/q returns to the originating pane.

Key points:
- InspectTarget enum (`Doc(DocumentId) | Chunk(ChunkId)`).
- InspectState (`app.rs`): target / doc / chunk / collapsed
  HashSet / scroll / return_to / needs_fetch / loading.
- `src/inspect.rs`:
  - `render_inspect`: dispatches to render_doc / render_chunk by target
    kind; section headers show a collapse marker (▾/▸). metadata.user JSON
    is pretty-printed.
  - `handle_key_inspect`: j/k / Down/Up scroll. PageDown/PageUp move 10 rows.
    c = toggle all sections (v1 simplification). Esc/q = SwitchPane(return_to).
  - `enter_inspect(state, target, return_to)` helper: the shared entry
    point for Library and Search.
  - run-loop hook `refresh_inspect`: when needs_fetch is set, lazily calls
    inspect_doc_with_config / inspect_chunk_with_config.
- run.rs: the Pane::Inspect arm calls handle_key_inspect + render_inspect.
  refresh_inspect runs on every idle tick. SwitchPane(Inspect) does lazy init.
- Library: Enter calls enter_inspect(Doc(selected)), then SwitchPane.
- Search: 'i' (no modifier) calls enter_inspect(Chunk(selected_hit)),
  then SwitchPane. Guarded against collisions with typing 'i' ("instance").

12 tests (`tests/inspect.rs`, TestBackend): Esc uses return_to / q also
works / j/k scroll bounds / PgUp PgDn ±10 / c toggles all / no-target
hint / loading / doc view header+metadata+provenance+blocks /
collapse hides body / chunk view text+block_ids / no slot →
SwitchPane(Library) / enter_inspect helper sets fields.

Spec deviation (HOTFIXES `2026-05-02 P9-4`):
- Removed the `render_inspect<B: Backend>` generic (same as P9-1/2/3).
- Added the Search `i` key (not in the P9-2 spec; added retroactively in P9-4).
- `c` collapses everything at once; the spec's "focus-based selective
  collapse" is P+.

Docs (sync rule):
- README: TUI row "4 panels" + Quick start comment.
- HANDOFF: one-line summary + Phase status (P9 3/5 → 4/5) + one deviation line.
- HOTFIXES: P9-4 entry.
- tasks/p9/p9-4 status: completed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 15:41:11 +00:00
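ad7bd7d309 review(p9-3): address round-1 review feedback
After Esc, asking again could leave a detached prior worker and a new
worker in flight at the same time. Two requests hit the Ollama endpoint
concurrently → doubled response time + confused streams. Added an
`s.thread.is_some()` check on entry to spawn_ask_worker: if the previous
worker is still alive, Enter is ignored. The input bar's busy text now
distinguishes three states (streaming / awaiting prior / idle), so the
user can see immediately why Enter has no effect.

Added regression test `enter_with_detached_prior_thread_is_blocked`: it
hand-installs a never-ending dummy thread, verifies that Enter is a no-op,
and on exit leaks the thread explicitly via take() (the OS reaps it when
the test process exits).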

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 15:27:39 +00:00
f08fefec1d feat(kebab-tui): P9-3 Ask pane — streaming answer + citation panel + explain toggle
Activates the ? key from the P9-1 Library. Fills the App.ask slot
(parallel-safety contract unchanged). A worker thread calls
kebab-app::ask_with_config and sends tokens over an mpsc channel through
AskOpts.stream_sink; the main (TUI) thread drains the channel on every
render frame and accumulates the string, so the answer area updates token
by token.

Key points:
- AskState (`app.rs`): input / explain / streaming / partial /
  answer / thread JoinHandle / rx Receiver / scroll / last_error.
- `src/ask.rs`:
  - `render_ask`: input bar / answer area (▍ cursor while streaming) /
    bottom split (status: grounded/model/prompt/k/refusal · citations
    or explain panel).
  - `handle_key_ask`: typing → input. Enter → spawn_ask_worker (input
    non-empty + not streaming). e (when input is empty) → toggle explain.
    j/k (when input is empty) → scroll. Esc → SwitchPane(Library) +
    clears streaming/rx/thread (best-effort cancel).
  - `spawn_ask_worker`: mpsc::channel + thread::spawn(|| ask_with_config).
  - run-loop hooks: `drain_stream` (try_iter → partial), `poll_worker`
    (handle.is_finished → take + join → fills answer or raises ErrorOverlay).
- run.rs: the Pane::Ask arm calls handle_key_ask + render_ask. On every idle
  tick, drain_stream + poll_worker run. Lazy init on SwitchPane(Ask).
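
The worker/drain/poll pattern described above can be sketched with the
standard library alone. `AskSketch` and its fields are hypothetical
stand-ins for AskState, and the worker just replays a canned token list
instead of calling ask_with_config.

```rust
use std::sync::mpsc::{self, Receiver};
use std::thread::{self, JoinHandle};

/// Hypothetical, trimmed-down stand-in for the Ask pane state.
struct AskSketch {
    partial: String,
    answer: Option<String>,
    rx: Option<Receiver<String>>,
    thread: Option<JoinHandle<String>>,
}

impl AskSketch {
    fn new() -> Self {
        AskSketch { partial: String::new(), answer: None, rx: None, thread: None }
    }

    /// Returns false (and does nothing) while a previous worker is installed.
    fn spawn_worker(&mut self, tokens: Vec<String>) -> bool {
        if self.thread.is_some() {
            return false;
        }
        let (tx, rx) = mpsc::channel();
        self.partial.clear();
        self.answer = None;
        self.rx = Some(rx);
        self.thread = Some(thread::spawn(move || {
            let mut full = String::new();
            for tok in tokens {
                full.push_str(&tok);
                // The receiver may already be gone; ignore send errors.
                let _ = tx.send(tok);
            }
            full
        }));
        true
    }

    /// Non-blocking: append whatever tokens have arrived so far.
    fn drain_stream(&mut self) {
        if let Some(rx) = &self.rx {
            for tok in rx.try_iter() {
                self.partial.push_str(&tok);
            }
        }
    }

    /// Non-blocking: once the worker has finished, join it and keep its result.
    fn poll_worker(&mut self) {
        let finished = self.thread.as_ref().map_or(false, |h| h.is_finished());
        if finished {
            let handle = self.thread.take().expect("checked above");
            self.drain_stream();
            self.rx = None;
            self.answer = handle.join().ok();
        }
    }
}
```

A run loop would call `drain_stream` and then `poll_worker` on each idle
tick. Neither call blocks, so rendering stays responsive while tokens arrive.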

13 tests (`tests/ask.rs`): Esc/typing/backspace/e toggle (input
empty)/e typed (input nonempty)/Enter empty/Enter while streaming
no-op/render pre-submission hint/streaming partial+cursor/grounded
answer + citation [1]/refusal score_gate panel does not panic/explain panel
title flip/no slot.

Spec deviation (HOTFIXES `2026-05-02 P9-3`):
- Removed the `render_ask<B: Backend>` generic; the ratatui 0.28 Frame is
  backend-agnostic (same as P9-1/P9-2).
- e/j/k act as command keys only while the input is empty; with text in
  the input they are typed. This is a variant of vim's "command vs insert".
  The spec-literal plain "e=toggle" would break typing words such as
  "explain" / "javascript".

Docs (sync rule):
- README: TUI row "Library + Search + Ask panels" + Quick start comment.
- HANDOFF: one-line summary + Phase status (P9 2/5 → 3/5) + one deviation line.
- HOTFIXES: P9-3 entry.
- tasks/p9/p9-3 status: completed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 15:24:26 +00:00
0732b3ffbe review(p9-2): address round-1 review feedback
1. **Citation::Page branch fix**: `args.push(format!("# page {page}"))` was
   interpreted by vim/code/cursor as a "second file", causing unintended
   behavior (split / new buffer). Removed the final push; only the path is
   opened, plus one `tracing::debug!` line. Jumping to a PDF page is the
   responsibility of the user's PDF reader; a `KEBAB_EDITOR_JUMP_FORMAT`
   env hook is a P+ enhancement.
2. **Block the SHIFT modifier for j/k/g**: because `is_typing_mod` treated
   SHIFT as typing, J/K/G were absorbed as selection keys, which broke
   uppercase queries such as "JSON" / "PostgreSQL" / "Go". Arrow keys
   (Down/Up) still work regardless of modifiers; letter keys (j/k/g) only
   with `KeyModifiers::NONE`. Added two regression tests for SHIFT-J /
   SHIFT-G.
3. **Removed the unused `_width` argument of `format_hit_lines`**: rely on
   ratatui's automatic truncation (Korean column alignment in Library is a
   separate path).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 14:42:49 +00:00
0490b6a126 feat(kebab-tui): P9-2 Search pane — input + dense hits + preview + editor jump
Activates the / key from Library. The App.search slot is filled lazily (when the run loop receives SwitchPane(Search));
after a 200 ms debounce kebab-app::search is called, and the chunk of the selected hit is shown in the preview pane.
The g key opens the citation location in $EDITOR (vim/nvim/code/cursor auto-detected).

Key points:
- SearchState (fills the forward decl in `app.rs`): input / mode / hits /
  selected_hit / input_dirty_at / last_query / searching / preview.
- `src/search.rs` (new):
  - `render_search(f, area, state)`: 3-pane layout (input bar / result list / preview).
    Each hit uses the §1.5 dense 4-line format (rank.score URI / heading / snippet).
  - `handle_key_search`: typing → input + dirty mark. Tab → cycle mode. Enter →
    immediate refresh. j/k → move selection + invalidate preview. g → editor jump
    (RAII raw-mode suspend). Esc → back to Library.
  - `build_jump_command(citation, editor_env, workspace_root)` branches
    automatically between vim-style `+<line> path` / VS Code
    `code -g path:line` / cursor `cursor -g`. Locked in by unit tests.
  - `jump_to_citation` suspends/restores raw mode + AltScreen via RAII
    (panic-safe).
  - 4 run-loop hook functions: `debounce_due` / `fire_search` /
    `refresh_preview` (private to crate).
- run.rs:
  - The Pane::Search arm dispatches to `handle_key_search` + `render_search`.
  - On SwitchPane(Search), lazy init via `app.search = Some(SearchState::default())`.
  - On every idle tick: debounce_due → fire_search, preview None → refresh_preview.
- 13 tests (`tests/search.rs`): Esc/typing/backspace/Tab cycle/Enter
  refresh/j-k movement/jump cmd vim+code+args/render w/hits/empty render/no slot.
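
To make the editor branching concrete, here is a rough sketch under
simplified assumptions: the citation is reduced to a workspace-relative
path plus a line number, and the function returns a program name with its
argument list. The signature is hypothetical and differs from the real
`build_jump_command(citation, editor_env, workspace_root)`.

```rust
use std::path::Path;

/// Hypothetical sketch: (program, args) for opening `rel_path` at `line`.
fn build_jump_command(
    editor: &str,
    workspace_root: &Path,
    rel_path: &str,
    line: u32,
) -> (String, Vec<String>) {
    // Editors need an absolute path; citations are workspace-relative.
    let abs = workspace_root.join(rel_path).display().to_string();
    // Decide by the executable's file name so "/usr/bin/code" also matches.
    let name = Path::new(editor)
        .file_name()
        .and_then(|n| n.to_str())
        .unwrap_or(editor);
    let args = match name {
        // VS Code style: a single `path:line` argument after `-g`.
        "code" | "cursor" => vec!["-g".to_string(), format!("{}:{}", abs, line)],
        // vim-style editors take `+<line>` before the path.
        _ => vec![format!("+{}", line), abs],
    };
    (editor.to_string(), args)
}
```

Returning the arguments as a list (rather than one shell string) keeps
paths with spaces intact when they are handed to `std::process::Command`.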

Spec deviation (HOTFIXES `2026-05-02 P9-2`):
- Removed the `render_search<B: Backend>` generic (same reason as P9-1: the
  ratatui 0.28 Frame is backend-agnostic).
- `jump_to_citation` gained a `workspace_root: &Path` argument. Citation.path
  is workspace-relative, so invoking the editor needs an absolute path. The
  spec-literal signature is unimplementable.

Docs (sync rule):
- README: TUI row "Library + Search panels, ask/inspect in progress" + updated
  the `kebab tui` comment in Quick start.
- HANDOFF: one-line summary + Phase status (P9 1/5 → 2/5) + one deviation line added.
- HOTFIXES: added P9-2 entry.
- tasks/p9/p9-2 status: completed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 14:38:17 +00:00
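63c6d007ae review(p9-1): address round-1 review feedback
- Before p9-2/3/4 are merged, moving focus to Search/Ask/Inspect with the
  / ? Enter keys changed only the header: the body stayed Library and the
  key mapping was still Library's, which misled the user. The footer hint
  now switches to "Search pane not yet implemented
  (lands with p9-2) — q to return". A new stub handler,
  `handle_key_unimplemented_pane`, accepts only q / Esc to return to
  Library; all other keys are no-ops (the previous implementation delegated
  to handle_key_library, which mutated the state of a pane other than the
  focused one; that is now blocked).
- `{title:<title_w$}` in `format_doc_row` is std::fmt's named-argument
  width specifier; added a one-line comment with a doc link so that future
  readers are not confused by the pattern.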

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 13:31:19 +00:00
43ff4048e8 feat(kebab-tui): P9-1 Ratatui shell + Library pane
The new crate `kebab-tui` imports only `kebab-app`, following the §8 facade rule.
The shell, built on Ratatui 0.28 + crossterm 0.28, provides:

- `App` struct: config + focus + library + 3 Option sub-state slots
  (search/ask/inspect; under the parallel-safety contract, p9-2/3/4 fill
  them from their own modules). Nothing outside p9-1 touches the App definition.
- `Pane` enum (Library/Search/Ask/Inspect/Jobs).
- `KeyOutcome` (Continue/Quit/SwitchPane/Refresh).
- `LibraryState` + its inner: docs / list_state / filter / filter_edit /
  needs_refresh / loading / pending_g.
- `render_library` (Frame, area, &App): heading/body, toggleable filter
  overlay; Korean/wide-char widths are computed with unicode-width.
- `handle_key_library`: j/k/Down/Up move, gg/G jump to the ends, f opens the
  filter overlay, /=>Search ?=>Ask Enter=>Inspect, q/Esc quit. While the
  error overlay is shown, any key dismisses it.
- Filter overlay: two fields, tags_any (CSV) + lang; Tab cycles, Enter
  apply→Refresh, Esc cancel.
- `ErrorOverlay`: captures the anyhow chain and renders a popup (Clear + red border).
- Terminal lifecycle: `TuiTerminal` enters raw mode + alt screen;
  Drop restores on exit (including panic) so the user's shell is not left broken.
- No async: facade calls are synchronous on the main thread. v1 accepts brief hangs.

CLI: added the `kebab tui` subcommand; it takes --config, then App::new + run.

10 tests (`tests/library.rs`, using TestBackend):
- empty library / 3-doc render / q,Esc quit / / switches to Search / ? switches to Ask
- Enter on an empty list does nothing / Enter switches to Inspect / j movement (3-step clamp) /
  f filter overlay → input → Enter Refresh.

Test seam: `App::populate_library_for_testing` (#[doc(hidden)]) bypasses
the `pub(crate)` inner. The spec's parallel-safety contract is kept as-is.

Spec deviation (HOTFIXES `2026-05-02 P9-1`):
- Removed the `<B: Backend>` generic from `render_library`; the ratatui 0.28
  Frame is backend-agnostic.
- Added `populate_library_for_testing` (test seam, not an official API).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 13:26:24 +00:00
645bcdf9c0 review(vector-orphan): address round-1 review feedback
Made the hex invariant on the SQL IN(...) input of `delete_by_chunk_ids`
explicit with `debug_assert!`. `id_for_chunk` always emits hex, but
`ChunkId(pub String)` can be hand-constructed, so this blocks a future
contributor from passing in a tainted string. It is caught immediately as a
panic in dev / test builds (release proceeds with the SQL as before; the
production path enforces hex, so there are no false positives).
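
A minimal sketch of such a guard, assuming chunk ids are plain `String`s
and the predicate is built by string concatenation. `in_predicates` and the
`batch` parameter are hypothetical names for illustration, not the crate's
API.

```rust
/// Hypothetical sketch: builds one `chunk_id IN (...)` predicate per batch.
fn in_predicates(chunk_ids: &[String], batch: usize) -> Vec<String> {
    assert!(batch > 0, "batch size must be non-zero");
    chunk_ids
        .chunks(batch)
        .map(|group| {
            let quoted: Vec<String> = group
                .iter()
                .map(|id| {
                    // Dev/test builds panic on anything that is not plain hex;
                    // release builds compile this check out.
                    debug_assert!(
                        !id.is_empty() && id.bytes().all(|b| b.is_ascii_hexdigit()),
                        "chunk id must be hex: {:?}",
                        id
                    );
                    format!("'{}'", id)
                })
                .collect();
            format!("chunk_id IN ({})", quoted.join(", "))
        })
        .collect()
}
```

Because hex digits cannot contain a quote character, the invariant is what
makes the unescaped interpolation safe in this sketch.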

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 12:36:07 +00:00
0c8821f857 fix(kebab-store-vector): close P7-3 vector orphan caveat — delete_by_chunk_ids
The P7-3 storage UNIQUE bug fix swept only the SQLite side (documents →
blocks / chunks / embedding_records). LanceDB vectors live in a separate
store, so rows with old chunk_ids remained on disk. Search is unaffected,
but disk usage grows without bound. This closes the "P+ task" promise from
the HOTFIXES `2026-05-02 P7-3` caveat within the same follow-up PR.

Changes:
- Added the `VectorStore::delete_by_chunk_ids(&[ChunkId])` trait method (with
  a default no-op, so test fakes / existing impls compile unchanged).
- `LanceVectorStore::delete_by_chunk_ids` iterates over every
  `chunk_embeddings_*` table of the connection and runs
  `Table::delete("chunk_id IN (...)")` in batches of 200. Safe even in
  multi-model workspaces (for example mid-migration).
- `SqliteStore::stale_chunk_ids_at(workspace_path, new_asset_id)` returns the
  old chunk_ids via a read-only SELECT. The caller invokes it *before* the
  CASCADE runs.
- `kebab-app::purge_vector_orphans_for_workspace_path` orchestrates the two
  steps above. It is called in one line right before the
  `put_asset_with_bytes` call in all three ingest paths (markdown / image /
  pdf).

Smoke verification (release binary, fastembed enabled):
- First ingest of whitepaper.pdf → chunk_ids = {f616…, 4e0f…}; rows for both
  IDs exist in the vector store.
- Re-ingest after changing bytes → new doc_id (3741…) + new chunk_ids
  (ed0c…, e13c…). Vector search "REWRITTEN chapter two" → hits only the new
  chunk_ids. Even with the old query "Edited page two body", the old
  chunk_ids are no longer in the vector store (the semantically closest new
  chunks hit instead).

The "vector store cleanup" item in HOTFIXES `2026-05-02 P7-3` is updated
from "deferred" → "closed by follow-up PR". The known behavior in SMOKE.md
("old vectors remain") is likewise updated to "the two stores are consistent".

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 12:32:29 +00:00
3a57cab1eb fix(kebab-store-sqlite): purge stale assets row on workspace_path orphan + smoke
Fixes a storage-layer bug exposed by the P7-3 integration tests.
There was a gap between the UNIQUE constraint on `assets.workspace_path` and
`upsert_asset_row`, which handled only `ON CONFLICT(asset_id)`: when
re-ingesting an asset whose bytes changed, the new asset_id hit a secondary
UNIQUE conflict on the same workspace_path. Affects md / image / pdf alike.

Fix:
- The new helper `purge_orphan_at_workspace_path` sweeps documents → assets,
  in that order, when it finds a *different* `asset_id` at the same
  `workspace_path`. This avoids the ON DELETE RESTRICT on documents and
  cleans up blocks / chunks / embedding_records via CASCADE. In copied mode,
  the byte file at storage_path is also deleted on a best-effort basis.
- Called from both branches of `put_asset_with_bytes` (copy / reference) and
  from `DocumentStore::put_asset`.
- Regression test `put_asset_with_bytes_sweeps_workspace_path_orphan`
  (replaces the earlier "clean up orphan when UPSERT fails" test, which is
  no longer doable).
- Un-`#[ignore]`d the `re_ingest_edited_pdf_produces_new_doc_id` integration
  test → all 9 integration tests pass by default.

The vector store orphan is a separate P+ task: LanceDB operates
independently of the SQLite cascade, so vectors with stale chunk_ids remain
on disk. Search is unaffected (search surfaces results through a SQLite join).

Smoke verification (release binary, markdown 2 + image 1 + PDF 2):
- doctor pass
- first ingest: 5 new
- list docs: 5 docs all media types
- search lexical "pdf-page-v1 chunker" → whitepaper.pdf hit
- search hybrid → cross-media results
- inspect doc PDF: parser_version=pdf-text-v1, blocks are SourceSpan::Page
- re-ingest with identical bytes: 5 updated, 0 errors (P1 idempotency)
- re-ingest after modifying bytes: 1 new (that PDF) + 4 updated, 0 errors (storage fix)
- add a corrupt PDF: errors+=1 + accurate IngestItem.error message, no effect on other assets
- ingest again after cleanup: errors=0
- RAG ask: PDF citation + `citations[].citation` correctly exposes `kind: "page"` + `page: <N>` +
  `path: <pdf_path>`

Operational fixture helpers:
- `crates/kebab-parse-pdf/examples/gen_smoke_pdf.rs`: `cargo run --release
  --example gen_smoke_pdf -p kebab-parse-pdf -- <out.pdf> <text-pages>`
  generates a PDF in-tree without reportlab/qpdf.
- `crates/kebab-parse-image/examples/gen_smoke_png.rs`: generates a PNG
  fixture the same way.
- SMOKE.md reflects the usage of both examples + the updated HOTFIXES
  behavior (on byte modification, errors+=1 → new+=1).

The HOTFIXES `2026-05-02 P7-3` entry is updated from "deferred" → "fixed in
same PR"; only the vector store orphan caveat remains.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 11:41:23 +00:00
4ad4ef271e review(p7-3): address round-1 review feedback
- Closed the gap where `IngestItem.warnings` was an empty vec on the PDF
  path. The P7-1 Provenance Warning notes (scanned candidate / absorbed
  extract panic) are now surfaced in `IngestItem.warnings`, parallel to the
  `fm_warns + blk_warns` pattern on the md path. Users can see right in the
  ingest summary that "page 2 of this PDF is a scan and not searchable".
- Added an `IngestItem.warnings` assertion to
  `mixed_page_pdf_stores_asset_with_scanned_candidate_warning` (exactly 1
  entry + note content verified).
- `encrypted_pdf` / `corrupt_pdf` tests: `errors >= 1` → strict
  `errors == 1`. If another source ever adds errors, the test turns red
  immediately.
- Added a `chunk_count` equality assertion to `re_ingest_identical_pdf`.
  Verifies the chunk-level axis of the P1 idempotency contract (the full
  chunk_id set comparison is already locked by pdf-page-v1's
  `deterministic_chunk_ids_1000`, so chunk_count suffices as a lightweight
  proxy).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 09:31:55 +00:00
5f3a37cafa feat(kebab-app): P7-3 PDF ingest wiring: kebab ingest now handles PDF assets too
Wires the P7-1 (`PdfTextExtractor`) + P7-2 (`PdfPageV1Chunker`) libraries
into `kebab-app::ingest_with_config`. Assets that `kebab-source-fs` already
classified from `*.pdf` as `MediaType::Pdf` are now indexed as searchable
docs. Parallels the P6-4 image wiring pattern: a `MediaType::Pdf` arm is
added to `ingest_one_asset`, branching to a new private fn
`ingest_one_pdf_asset`.

Key behavior:
- Per-medium chunker selection: PDF assets hard-code `PdfPageV1Chunker`
  (based on a compile-time match). `config.chunking.chunker_version`
  represents markdown only; PDF is always `pdf-page-v1`. The deviation is
  recorded in HOTFIXES entry `2026-05-02 P7-3`.
- encrypted PDF / corrupt PDF → `errors+=1` + the P7-1 `qpdf --decrypt` hint
  is preserved verbatim in `IngestItem.error`.
- Empty/scanned-candidate pages → 0 chunks; the P7-1 `Provenance::Warning`
  passes through unchanged. Not searchable in v1, pending the P+ scanned-PDF
  OCR fallback.
- Determinism stress: no additional `now()` call between extract → chunk
  (inherits the P6-4 invariant). Both the PDF doc_id and chunk_id are
  deterministic.

Integration tests (`tests/pdf_pipeline.rs`, 8 passed + 1 ignored):
- 3-page text PDF → 1 doc + 3 chunks + Page span verified
- identical re-ingest → Updated, same doc_id
- encrypted PDF → Error + `qpdf` hint preserved
- corrupt header PDF → Error + not stored
- mixed page (page 2 empty) → 2 chunks + 1 Warning
- IngestReport arithmetic invariant
- 50-page long PDF → ≥50 chunks
- inspect doc → SourceSpan::Page round-trip
- (ignored) edited bytes re-ingest → exposes the storage UNIQUE bug, awaiting a P+ fix

Additional finding (HOTFIXES `2026-05-02 P7-3`): there is a gap between the
UNIQUE constraint on `assets.workspace_path` and `upsert_asset_row` handling
only `ON CONFLICT(asset_id)`. When bytes change, a new asset_id → conflict
on the same workspace_path. Affects md / image / pdf alike. First exposed by
the P7-3 integration tests. This PR does not fix it; it is a P+ storage task.

Added a PDF section + verification checklist + 4 known behaviors to
`docs/SMOKE.md`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 09:28:06 +00:00
8181fd91e7 review(p7-2): address round-1 review feedback
- Added an overlap clamp at chunk entry (upper bound `target_bytes / 2`).
  Under a pathological policy (`overlap_tokens >= target_tokens`) a chunk
  could completely re-emit the text of the previous chunk; the clamp blocks
  that. Matches the `seed_budget = overlap_tokens.min(target/2)` guard
  pattern in md-heading-v1. Added regression test
  `overlap_clamped_when_overlap_exceeds_target`, which verifies that
  `actual_start` is strictly increasing between adjacent chunks.
- `char_start as u32` / `char_end as u32` silent truncation →
  `try_from::expect`, an explicit panic on corrupted input.
- The module doc's `## Splitting policy` now spells out two items:
  abbreviation cases (`Mr.` / `i.e.` etc.) + the overlap clamp.
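
The effect of the clamp can be sketched with fixed-size byte windows. This
is a simplification: the real chunker splits on paragraph and sentence
segments, and `window_starts` is a hypothetical helper. `BYTES_PER_TOKEN = 3`
follows the value stated in the P7-2 feature commit.

```rust
const BYTES_PER_TOKEN: usize = 3;

/// Overlap in bytes, capped at half of the target size.
fn clamped_overlap_bytes(target_tokens: usize, overlap_tokens: usize) -> usize {
    let target_bytes = target_tokens * BYTES_PER_TOKEN;
    let overlap_bytes = overlap_tokens * BYTES_PER_TOKEN;
    overlap_bytes.min(target_bytes / 2)
}

/// Hypothetical helper: start offsets of successive windows over `len` bytes.
/// With the clamp, each window advances by at least half a target.
fn window_starts(len: usize, target_tokens: usize, overlap_tokens: usize) -> Vec<usize> {
    let target_bytes = (target_tokens * BYTES_PER_TOKEN).max(1);
    let step = target_bytes - clamped_overlap_bytes(target_tokens, overlap_tokens);
    let mut starts = Vec::new();
    let mut start = 0;
    while start < len {
        starts.push(start);
        start += step;
    }
    starts
}
```

Without the `min`, an overlap equal to or larger than the target would make
the step zero, which is the full re-emission case the commit guards against.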

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 08:55:07 +00:00
7ee0ac9894 feat(kebab-chunk): P7-2 pdf-page-v1 chunker — page-aware splitting
`PdfPageV1Chunker` takes the `CanonicalDocument` emitted by
`kebab-parse-pdf` (one page per block, all `SourceSpan::Page`) and produces
`Chunk`s that never cross a page boundary. `chunker_version =
"pdf-page-v1"`.

Key behavior:
- If the page text fits within `target_tokens × BYTES_PER_TOKEN` (= 3), it
  becomes a single chunk. Otherwise, `\n\n` (paragraph) or sentence-ending
  punctuation + whitespace boundaries are treated as segments and
  accumulated greedily, with at least one segment per chunk by default.
- The next chunk's prefix is prepended with
  `overlap_tokens × BYTES_PER_TOKEN` worth of the previous tail (char-based,
  never backtracking past the start of the previous chunk).
- Empty/whitespace-only pages yield 0 chunks (already flagged at the
  `kebab-parse-pdf` stage via the page's `Provenance::Warning`).
- Non-PDF doc (not Block::Paragraph, or SourceSpan is not Page) →
  explicit error.

Spec deviation (HOTFIXES 2026-05-02 P7-2):
- `chunk_id` collision guard: when multiple chunks come from the same page,
  their `block_ids` are all identical, so the §4.2 recipe collides. Avoided
  by varying the `policy_hash` input of `id_for_chunk` per chunk as
  `format!("{base}#c{char_start}")`. The recipe itself is unchanged. The
  `Chunk.policy_hash` field keeps the base.
- `BYTES_PER_TOKEN = 3` (matches the actual md-heading-v1 code). The spec
  text says "/ 4", but that itself diverges from the real md-heading-v1
  code, so consistency was chosen. Locked in by a cross-chunker
  `policy_hash` equality unit test.

Tests (10 new):
- chunker_version label, 3-page small, 1-page huge + overlap + chunk_id
  uniqueness, empty page skip, whitespace-only skip, non-PDF error,
  cross-page boundaries never produced, determinism (1000 runs), snapshot
  shape stable, same policy_hash as md-heading-v1.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 08:51:44 +00:00
8de08cf38c review(p7-1): address round-1 review feedback
- Cargo.toml: removed unused deps (`kebab-config`, `thiserror`,
  `pdf-extract`, dev `tempfile` / `serde_json` / `serde`). In particular,
  the ~150 transitive crates pulled in by `pdf-extract` (pom, postscript,
  type1-encoding-parser, adobe-cmap-parser, euclid, chrono, md5,
  linked-hash-map …) are all gone. Only lopdf remains.
- info.rs: fixed a decoding bug for BOM-less PDFDocEncoded Titles.
  `from_utf8_lossy` replaces lone 0x80–0xFF bytes with U+FFFD, mangling
  legacy titles like "Café". Replaced with a direct byte → `char` cast
  (Latin-1 decoder). Added regression test
  `info_dict_title_pdfdocencoding_latin1_high_bytes_decoded`.
- info.rs: corrected the inaccurate "Latin-1 superset" wording in the module
  doc; PDFDocEncoding differs from Latin-1 in the 0x18–0x1F / 0x80–0x9F
  ranges.
- lib.rs: added a `debug_assert!` where `saturating_sub(1)` silently
  absorbed the page=0 case. Release keeps the saturating fallback (a garbled
  order is preferable to a panic in production).
- tests: filled the UTF-16 surrogate pair coverage gap; a title containing
  🥙 (U+1F959) verifies the pair-combining path of
  `String::from_utf16_lossy`.
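
The two decoding paths can be sketched as follows. `decode_pdf_text` is a
hypothetical name, and the Latin-1 branch is the same approximation the
commit describes: it does not remap the 0x18–0x1F / 0x80–0x9F ranges where
PDFDocEncoding differs.

```rust
/// Hypothetical sketch of decoding a PDF text string from the /Info dict.
fn decode_pdf_text(bytes: &[u8]) -> String {
    if bytes.starts_with(&[0xFE, 0xFF]) {
        // UTF-16BE with BOM: pair up bytes big-endian, then decode.
        // A trailing odd byte, if any, is ignored by chunks_exact.
        let units: Vec<u16> = bytes[2..]
            .chunks_exact(2)
            .map(|pair| u16::from_be_bytes([pair[0], pair[1]]))
            .collect();
        String::from_utf16_lossy(&units)
    } else {
        // Latin-1 approximation: each byte maps to the code point of the
        // same value, so 0xE9 becomes U+00E9 instead of U+FFFD.
        bytes.iter().map(|&b| b as char).collect()
    }
}
```

For comparison, `String::from_utf8_lossy(b"Caf\xE9")` yields "Caf" followed
by U+FFFD, because a lone 0xE9 is not valid UTF-8.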

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 08:40:40 +00:00
5a158d7343 feat(kebab-parse-pdf): P7-1 text PDF extractor — per-page CanonicalDocument
`PdfTextExtractor` (MediaType::Pdf): lopdf-based per-page text extraction.
Emits one `Block::Paragraph` + `SourceSpan::Page { page, char_start, char_end }`
per page. Pages whose body is empty or whose extraction panics are marked as
an empty paragraph + `Provenance::Warning` ("scanned candidate"), the input
for a later OCR fallback (separate task).

Key behavior:
- `lopdf::Document::load_mem` + `is_encrypted()` → encrypted PDFs are an
  explicit error (pointing to `qpdf --decrypt`).
- Per-page `extract_text(&[page])` is wrapped in `catch_unwind`, converting
  malformed-page panics into recoverable warnings.
- Best-effort extraction of Title/Producer/Creator from the `/Info` dict.
  UTF-16BE BOM-prefixed strings are decoded too (non-ASCII Titles such as
  Korean are handled correctly).
- 9 integration tests: 3-page emit, scanned-mixed warning, encrypted refuse,
  corrupt header error, page_count metadata, UTF-16BE Title, filename
  fallback, determinism, snapshot.

`parser_version = "pdf-text-v1"`. Allowed deps: `lopdf 0.32` + `pdf-extract 0.7`
(as in the original spec). Multilingual OCR fallback for body text is a §9.2
follow-up task (out of scope).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 08:34:55 +00:00
6e4884aff8 fix(kebab-app): IngestReport.errors double-count regression — increment only in match item.kind { Error => ... } arm
Found during manual smoke verification (12 PNG + a corrupt PNG).
`IngestReport.errors` was incremented twice per failing asset, breaking the
`scanned = new + updated + skipped + errors` invariant:

- 1 `garbage.png` (non-image bytes with only a .png extension) + 3 normal
  assets → expected `scanned=4 errors=1`, actual `scanned=4 errors=2`.
- Cause: one increment in `match item { Err(e) => { error_count += 1;
  IngestItem {...} } }`, then another in the `match item.kind { Error =>
  { error_count += 1 } }` arm immediately after.
- A pre-existing defect that did not surface before the P6-4 merge because
  `ingest_one_asset` almost never returns Err on the markdown path. It was
  first exposed when image dispatch let garbage bytes flow through as Err.

Fix: removed `error_count.saturating_add(1)` from the `Err(e)` branch. The
single increment point is the `match item.kind { Error => ... }` arm. The
intent is spelled out in a comment.
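
A compact sketch of the single-increment-point rule, using hypothetical
`ItemKind` / `Report` types in place of the real IngestItem / IngestReport:

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum ItemKind {
    New,
    Updated,
    Skipped,
    Error,
}

#[derive(Debug, Default, PartialEq)]
struct Report {
    scanned: u32,
    new: u32,
    updated: u32,
    skipped: u32,
    errors: u32,
}

/// Hypothetical tally loop: every counter is bumped in exactly one place.
fn tally(results: &[Result<ItemKind, String>]) -> Report {
    let mut report = Report::default();
    for result in results {
        report.scanned += 1;
        // An Err is only converted into an Error item here; it is NOT
        // counted yet, which is what prevents the double increment.
        let kind = match result {
            Ok(kind) => *kind,
            Err(_) => ItemKind::Error,
        };
        // Single increment point for every counter, including errors.
        match kind {
            ItemKind::New => report.new += 1,
            ItemKind::Updated => report.updated += 1,
            ItemKind::Skipped => report.skipped += 1,
            ItemKind::Error => report.errors += 1,
        }
    }
    report
}
```

With one item counted per iteration, `scanned == new + updated + skipped +
errors` holds by construction.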

Added regression test (`tests/image_pipeline.rs`):
- `garbage_png_increments_errors_counter_exactly_once`: verifies an
  increment of exactly 1 + the `scanned == new + updated + skipped +
  errors` invariant.

Verification, release binary + real Ollama (192.168.0.47 / gemma4:e4b):

```
$ kebab --json ingest
scanned=4 new=3 updated=0 skipped=0 errors=1
  error    garbage.png       (extract Err — unrecognised format)
  new      intro.md
  new      normal.png        (OCR success)
  new      truncated.png     (OcrFailed warning, asset still indexed)
```

cargo test --workspace --no-fail-fast -j 1: all pass.
cargo clippy --workspace --all-targets -- -D warnings: pass.
cargo test -p kebab-app --test image_pipeline: 6 pass (5 existing + 1 regression).
2026-05-02 08:13:41 +00:00
469a1a34ec review(p6-4): address round-1 review feedback
- src/lib.rs:
  • Fixed the doc-comment of `ingest_one_asset` being merged with the new
    `ImagePipeline` struct (rustdoc attributed both docs to the struct):
    swapped the positions of the two doc-comments + separated them with a
    blank line.
  • Made the silent-skip branch of
    `if let Some(Block::ImageRef(...)) = blocks.first_mut()` explicit as the
    `other` arm of a `match`. If the P6-1 contract ever breaks, it becomes
    visible immediately via `tracing::warn!` + a Provenance Warning + an
    "ImageDispatchAnomaly" note in `IngestItem.warnings`. This gives a lead
    for operational debugging.
  • Extracted ~25 lines of boilerplate from the OCR-failure and
    caption-failure branches into the `record_image_analysis_failure`
    helper; the two calls shrink to one line each, and future
    ProvenanceEvent field changes are made in one place.
  • Analysis-stage Warning events share a single
    `OffsetDateTime::now_utc()` captured at fn entry, restoring the promise
    in the spec's Risks/notes: "Determinism stress: must not introduce a
    second `now()` call between extract and apply_ocr/caption".
  • Aligned warning labels with the markdown path's `WarningKind`
    convention (`{kind}: {note}`): `"ocr_failed: ..."` → `"OcrFailed: ..."`,
    `"caption_failed: ..."` → `"CaptionFailed: ..."`. Resolves the
    inconsistency of the same wire field (`IngestItem.warnings`) carrying
    two different formats.
- tests/image_pipeline.rs:
  • Updated the regression test's "ocr_failed" assertion to "OcrFailed".

cargo test -p kebab-app -p kebab-chunk: all pass.
cargo clippy --workspace --all-targets -- -D warnings: pass.
2026-05-02 07:42:44 +00:00
ca0567c72b feat(kebab-app): P6-4 image ingest wiring: kebab ingest now handles PNG/JPEG assets too
Completes the unfinished stretch where the P6-1/P6-2/P6-3 libraries
(`ImageExtractor`, `OllamaVisionOcr`, `apply_caption`) were not visible from
the CLI. `kebab ingest` now indexes image assets end-to-end in addition to
markdown, and `kebab search` / `kebab ask` match and cite images by OCR
text + caption.

## kebab-app

- Added `kebab-parse-image` to `[dependencies]`.
- On entry to `ingest_with_config`, `OllamaVisionOcr` / `OllamaLanguageModel`
  are built **once per ingest session** according to the
  `image.ocr.enabled` / `image.caption.enabled` flags and shared as trait
  objects across the asset loop. Thanks to the internal Arc of
  reqwest::blocking::Client, allocation cost is independent of the number
  of assets.
- The two adapters + ImageExtractor are bundled into an `ImagePipeline`
  struct to keep the parameter list of `ingest_one_asset` from exploding
  (addresses clippy::too_many_arguments).
- Replaced the markdown-only guard in `ingest_one_asset` with
  `match media_type`: Markdown takes the existing path, Image(_) branches to
  the new `ingest_one_image_asset`, and PDF/Audio/Other are skipped as
  before.
- New `ingest_one_image_asset`:
  - read bytes → `ImageExtractor::extract` (on failure the caller does errors+=1)
  - `apply_ocr` (Lenient: on failure, a ProvenanceKind::Warning event +
    "ocr_failed: ..." in `IngestItem.warnings`; `block.ocr` stays None)
  - `apply_caption` (same Lenient policy)
  - calls the existing `MdHeadingV1Chunker`; the chunker already emits
    `Block::ImageRef` as a single chunk
  - existing persist + embed sequence unchanged (byte-identical to markdown)
- `lang_hint_from_doc`: maps `Lang("und")` or an empty string to None (done
  up front on the caller side so that the image-pipeline adapters'
  build_prompt does not have to silently drop "und").
## kebab-chunk

- Replaced the `Block::ImageRef` branch of `render_block_text` with the
  P6-4 (β) plain concat policy: `[alt, ocr.joined, caption.text]` joined
  with `\n\n`, dropping empty parts. If alt is empty, falls back to the
  basename of `src` (defensive guard for the P6-1 contract).
- New unit test `image_ref_p6_4_plain_concat_drops_empty_parts`: verifies
  all five cases, alt-only / alt+ocr / alt+caption / alt+ocr+caption /
  empty alt → src fallback.
- Existing `image_ref_emits_own_chunk_zero_tokens` passes as-is; the
  chunker's per-block dispatch is unchanged, only text rendering is updated.
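
The plain concat policy is small enough to sketch directly. The function
name and flat `&str` parameters are hypothetical simplifications of the
real `Block::ImageRef` fields.

```rust
use std::path::Path;

/// Hypothetical sketch of the plain concat policy for an image reference.
fn render_image_ref_text(
    alt: &str,
    src: &str,
    ocr: Option<&str>,
    caption: Option<&str>,
) -> String {
    // Fall back to the basename of `src` when alt is empty.
    let fallback = Path::new(src)
        .file_name()
        .and_then(|n| n.to_str())
        .unwrap_or(src);
    let label = if alt.trim().is_empty() { fallback } else { alt };
    // Keep the fixed order alt, ocr, caption; drop missing or empty parts.
    let parts: Vec<&str> = [Some(label), ocr, caption]
        .iter()
        .copied()
        .flatten()
        .filter(|part| !part.trim().is_empty())
        .collect();
    parts.join("\n\n")
}
```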

## Integration tests (kebab-app/tests/image_pipeline.rs)

Ollama is stubbed with wiremock. 5 cases:

1. OCR-only happy path: 1 PNG + ocr.enabled → 1 doc + 1 chunk emitted;
   `block.ocr.joined` is the mock's "Hello World 2026".
2. OCR + caption both enabled: both fields are filled and the chunk text
   contains all three parts, alt + ocr + caption.
3. Lenient failure check: on OCR 503 the asset is still indexed (kind=New),
   `errors=0`, ProvenanceKind::Warning attributed to "kb-app", and an
   "ocr_failed:" note in `IngestItem.warnings`.
4. Both disabled: even with
   `image.ocr.enabled=false && image.caption.enabled=false` the asset is
   indexed as 1 chunk (chunk text=filename), with EXIF + dimensions filled
   as usual.
5. Determinism (re-ingest): ingesting the same PNG twice yields `Updated` +
   the same `doc_id` the second time.

## SMOKE.md

Added the `kebab search --mode lexical "Hello World"` step to the command
sequence. Added `[image.ocr]` / `[image.caption]` config section examples +
an ingest time estimate (~5-10 s per asset). The guidance "books go through
the P7 PDF line" is now stated in both the verification checklist and
"known behaviors".

## Real Ollama integration verification

Based on 192.168.0.47 + gemma4:e4b:

```
$ kebab --config /tmp/kebab-smoke/config.toml ingest
scanned 2  new 2  updated 0  skipped 0  errors 0  (18395 ms)

$ kebab inspect doc <image_doc_id>
parser_version: image-meta-v1
blocks: [{
  alt: "hello.png",
  ocr: "Hello World 2026",
  caption: "The image displays the text \"Hello World 2026\" in a large, black, sans-serif font."
}]

$ kebab --json ask "Hello World 텍스트가 어디에 있나?" --mode hybrid
grounded: true
citations: [{marker: "[1]", doc_path: "hello.png"}]
```

## Verification

- `cargo test --workspace --no-fail-fast -j 1`: all pass
- `cargo clippy --workspace --all-targets -- -D warnings`: pass
- `cargo test -p kebab-chunk image_ref`: 2 pass (P1-5 regression + new P6-4
  unit)
- `cargo test -p kebab-app --test image_pipeline`: 5 pass

## Dependency boundaries

- `kebab-app` adds `kebab-parse-image`, as in the spec's Allowed deps.
- No new forbidden-dependency violations (`kebab-tui` / `kebab-desktop` /
  `kebab-eval` remain unreferenced).
- This task adds 0 lines of new image-specific business logic; everything
  is delegated to `kebab-parse-image`.

`tasks/p6/p6-4-image-ingest-wiring.md` status: planned → completed.

contract: docs/superpowers/specs/2026-04-27-kebab-final-form-design.md
sections: §3.4 ImageRefBlock, §6.1 ingest pipeline, §7.2
Extractor/Chunker traits, §9.1 image extraction policy.
2026-05-02 07:37:56 +00:00
b40b0b3992 review(p6-3): round 2, image_prep regression tests + doc generalization
- src/image_prep.rs:
  • 6 new unit tests: PNG passthrough (zero-decode + byte identity),
    JPEG → PNG re-encoding, 1px trailing clamp (max=1601 / long=4001
    irrational scale), aspect ratio (4:3 preserved, within 2%), corrupt PNG
    Err, unrecognized bytes Err.
  • Generalized the module doc-comment's "send to vision models" wording to
    "image-to-LM pipeline / channel", so that future callers such as PDF /
    video keyframe can understand the intended use from the doc alone.

cargo test -p kebab-parse-image: 48 pass + 2 ignored
  (19 unit (+6 image_prep) + 12 P6-1 + 8 P6-2 + 9 P6-3).
cargo clippy -p kebab-parse-image --all-targets -- -D warnings: pass.
2026-05-02 06:14:27 +00:00
9c644245fb review(p6-3): address round-1 review feedback
- New module `crates/kebab-parse-image/src/image_prep.rs`: extracted a
  single downscale helper (`downscale_to_png`) to be shared by OCR +
  caption + future PDF/video. Merges the two nearly identical copies of the
  algorithm in ocr.rs / caption.rs into one place. The 1px trailing clamp /
  PNG passthrough hot path / error message pattern are now managed in one
  place.
- src/ocr.rs: removed `downscale_to_long_edge` → calls
  `image_prep::downscale_to_png`. Also cleaned up the
  `image::ImageReader / ImageFormat / Cursor` imports.
- src/caption.rs:
  • Resolved the asymmetric disabled handling between `caption_image` /
    `apply_caption`. `caption_image` is the raw operation (no gate); only
    `apply_caption` checks the `cfg.image.caption.enabled` gate. Callers
    get the same meaning from the same function.
  • `String::clone` of caption.model / model_version in `apply_caption`:
    2 → 0. ProvenanceEvent.note is built before the caption is moved.
  • Delegated the downscale logic entirely to image_prep.
  • Exposed `MIN_CAPTION_LONG_EDGE` / `MAX_CAPTION_LONG_EDGE` as `pub const`
    (consistent with the visibility convention of P6-2's `MAX_DECODE_DIM`).
- tests/caption.rs:
  • Replaced `caption_image_errors_when_feature_disabled` with
    `caption_image_runs_regardless_of_enabled_flag`, which verifies the new
    separation of responsibilities.
  • `caption_image_clamps_oversized_max_pixels` references the
    `kebab_parse_image::caption::MAX_CAPTION_LONG_EDGE` constant instead of
    the literal 1536.
- tasks/HOTFIXES.md: added a paragraph on the `model_version` format
  deviation (spec literal `provider` → extended to
  `<provider>/<prompt_template_version>` + rationale).

cargo test -p kebab-parse-image: 42 pass + 2 ignored
  (13 unit + 12 P6-1 + 8 P6-2 + 9 P6-3).
cargo clippy --workspace --all-targets -- -D warnings: pass.
2026-05-02 06:11:56 +00:00
cd2213e48d feat(kebab-parse-image): P6-3 caption adapter — vision LM via trait
- Added new module `crates/kebab-parse-image/src/caption.rs`:
  • `caption_image(llm, bytes, lang_hint, cfg)`: operates on top of
    `&dyn LanguageModel`. A vision LM (e.g. gemma4:e4b) outputs a
    one-sentence objective description. temperature=0 / seed=0 for
    determinism.
  • `apply_caption(llm, bytes, block, lang_hint, cfg, events)`: fills
    `block.caption = Some(...)` and appends one
    ProvenanceKind::CaptionApplied event. A clean no-op (Ok(())) when
    `image.caption.enabled = false`. On LM failure, block.caption stays
    None and no event is recorded.
  • Downscale long edge clamped to `[128, 1536]`. The PNG passthrough
    hot path is preserved; everything else takes a single decode + PNG
    re-encode.
  • Korean / English prompt branching (lang_hint="ko"/"kor" → Korean).
  • `ModelCaption.model_version = "<provider>/<prompt_template_version>"`
    (e.g. "ollama/caption-v1"), so prompt or model regressions can be
    audited.

## kebab-core / kebab-llm-local changes

- Added an `images: Vec<String>` field to `kebab_core::GenerateRequest`.
  `#[serde(default)]` keeps existing wire payloads / snapshots compatible.
- `kebab-llm-local::OllamaLanguageModel` routes req.images to the Ollama
  `images: [base64, ...]` wire field.
  With `#[serde(skip_serializing_if = is_empty)]`, the wire shape is
  byte-identical to pre-P6-3 when the list is empty.

## kebab-config

- New `ImageCfg.caption: CaptionCfg`:
  - `enabled: bool` (default false)
  - `max_pixels: u32` (default 768, clamped to [128, 1536])
  - `prompt_template_version: String` (default "caption-v1")
- Added 3 environment variables:
  `KEBAB_IMAGE_CAPTION_{ENABLED,MAX_PIXELS,PROMPT_TEMPLATE_VERSION}`.

## Spec deviations

Added a 2026-05-02 entry to `tasks/HOTFIXES.md`:
- Symptom 1: the p6-3 spec signature takes `&dyn LanguageModel`, but the
  frozen trait + GenerateRequest did not support vision. → Extended the
  trait.
- Symptom 2: the spec's cargo feature `caption` (default OFF at compile
  time) was folded into a single runtime gate. There are no extra deps
  beyond base64/image/kebab-llm, so a cargo feature would save almost
  nothing in binary size.

The amendments to the p4-1 / p4-2 / p6-3 specs are stated explicitly.

## Tests

`cargo test -p kebab-parse-image --test caption`: 9 tests + 1 ignored:
- feature gate (disabled → no-op / Err on direct call)
- happy path (block.caption Some + Provenance CaptionApplied)
- empty token stream → empty text + caption.is_some()
- req.images routing verified with CapturingMock (one base64 entry,
  decodable)
- Korean / English prompt branching (system prompt captured by
  CapturingMock)
- LM Err → block.caption stays None + no events recorded
- determinism (same mock input → same caption)
- max_pixels clamp (99999 → 1536, verified by downscaling a 4000×3000 PNG)
- opt-in integration (verified against a real Ollama at 192.168.0.47
  with gemma4:e4b → "The image is a solid red color.", 4.3 s)

`cargo test --workspace --no-fail-fast -j 1`: all pass.
`cargo clippy --workspace --all-targets -- -D warnings`: pass.

## Dependency boundaries

- Added deps: `kebab-llm` (trait only), `base64` (already added in P6-2).
- dev-deps: `kebab-llm/mock` for `MockLanguageModel`, and
  `kebab-llm-local` (integration tests only; not in runtime deps).
- No forbidden-dependency violations: `kebab-source-fs / parse-md /
  normalize / chunk / store-* / embed* / search / rag / UI` remain
  unreferenced.

contract: docs/superpowers/specs/2026-04-27-kebab-final-form-design.md
sections: §3.4 ImageRefBlock.caption, §3.7a ModelCaption, §9.1
caption (model-generated, low trust).
2026-05-02 06:05:39 +00:00
1539367692 review(p6-2): round 3 cosmetic, build() regression tests + lib doc trust note
- src/ocr.rs:
  • Added `#[derive(Debug)]` to `OllamaVisionOcr` (to satisfy the bound
    required by expect_err in tests; reqwest::blocking::Client also
    implements Debug).
  • 3 new unit tests (`build_rejects_empty_endpoint`,
    `build_rejects_empty_model_after_trim`,
    `build_clamps_max_pixels_outside_legal_range`) as a regression
    signal for the `fn build` guards added in round 2.
- src/lib.rs:
  • Added one line on the OCR trust policy to the module-level
    doc-comment ("LLM-driven default can hallucinate; OcrText.engine
    carries source identity"), so lib users can catch the intent
    without reading the ocr module docs.

cargo test -p kebab-parse-image: 31 pass + 1 ignored
  (11 unit + 12 P6-1 integration + 8 P6-2 integration).
cargo clippy -p kebab-parse-image --all-targets -- -D warnings: pass.
2026-05-02 05:51:00 +00:00
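2bede0030f review(p6-2): address round 2 review feedback
- src/ocr.rs:
  • Unified the input validation of `OllamaVisionOcr::new` and
    `from_parts` into a shared `fn build`. Both constructors now share
    the same invariants (empty endpoint / empty model / `max_pixels`
    clamp), closing off the "tests pass but production panics"
    divergence.
  • When the `max_pixels` clamp actually fires, the reason is logged via
    `tracing::warn!` (so users can debug "why is it always 4096?").
  • The corner case in `downscale_to_long_edge` where the long axis
    overshoots by 1px due to `f32` rounding (e.g. max=1601, long=4001)
    is now strictly bounded by a trailing clamp. The doc-comment's
    "long edge is at most max_long_edge" now matches the actual
    behavior exactly.
- tests/ocr.rs:
  • Removed the double gate (`#[ignore]` + `KEBAB_OCR_INTEGRATION=1`)
    on the integration test. `--ignored` alone is now the single signal
    of intent to run, consistent with the `kebab-llm-local` integration
    test convention. The env overrides for endpoint / model are kept.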
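
The trailing clamp described for `downscale_to_long_edge` can be sketched as pure dimension arithmetic. This is a minimal illustration, not the crate's code: `scaled_dims` is a hypothetical name, and the rounding strategy (`round`) is an assumption.

```rust
/// Hypothetical helper (not from the crate): compute target dimensions so
/// that the long edge is at most `max_long_edge`, preserving aspect ratio.
pub fn scaled_dims(width: u32, height: u32, max_long_edge: u32) -> (u32, u32) {
    // Guard against a zero limit so the clamp below has a valid range.
    let cap = max_long_edge.max(1);
    let long = width.max(height);
    if long <= cap {
        return (width, height); // already within the limit: no resize
    }
    let scale = cap as f32 / long as f32;
    // f32 rounding can, for some inputs, land one pixel above the limit.
    let w = (width as f32 * scale).round() as u32;
    let h = (height as f32 * scale).round() as u32;
    // Trailing clamp: enforce the documented upper bound unconditionally,
    // and never emit a zero-sized edge.
    (w.clamp(1, cap), h.clamp(1, cap))
}
```

Clamping after the floating-point step is what makes the "long edge is at most max_long_edge" guarantee hold by construction rather than depending on how rounding behaves for a particular scale factor.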

cargo test -p kebab-parse-image — 28 pass + 1 ignored.
cargo test -p kebab-config — 21 pass.
cargo clippy --workspace --all-targets -- -D warnings — pass.
2026-05-02 05:48:23 +00:00
e869710d82 review(p6-2): address round 1 review feedback
- crates/kebab-config/src/lib.rs:
  • Replaced `OcrCfg.endpoint: String` ("" sentinel) with
    `Option<String>`. Applied `#[serde(default)]`. Added a branch that
    also maps `KEBAB_IMAGE_OCR_ENDPOINT=""` (empty value) to None.
  • New regression test `image_ocr_endpoint_empty_env_value_is_none`.
- crates/kebab-parse-image/src/ocr.rs:
  • Adapted the endpoint fallback logic in `OllamaVisionOcr::new` to the
    new `Option<String>` schema (`as_deref` + match).
  • Removed the dead `_other: HashMap<String, Value>` field from
    `OllamaGenerateResponse`, along with the `serde_json::Value` import.
  • `OllamaGenerateRequest.images: Vec<&'a str>` → `[&'a str; 1]`
    (removes the per-call vec! allocation; multi-image is out of scope
    because the OcrEngine trait takes a single image).
  • Refactored `downscale_to_long_edge` to a single decode. The PNG
    passthrough hot path is preserved (branching on a header sniff
    only); every other path is unified to 1 decode + (if needed) resize
    + 1 PNG re-encode.
  • Added a `pub fn max_pixels(&self) -> u32` accessor for verifying the
    clamp result (a simple inspector).
- crates/kebab-parse-image/tests/ocr.rs:
  • `cfg_for_endpoint` / the integration test updated to the
    `Some(endpoint)` form.
  • Strengthened `from_parts_clamps_max_pixels_into_legal_range` to
    verify the actual clamp results (256 / 4096 / 1024) via the new
    accessor.
  • The integration test now skips instead of panicking when the font
    is missing.
- crates/kebab-parse-image/tests/common/mod.rs:
  • `hello_world_png` now returns `anyhow::Result<Vec<u8>>`. Clarified
    intent by changing the expect("DejaVu Sans Bold required") message
    to "only the opt-in OCR integration fixture needs this font".
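
The `Option<String>` endpoint handling above can be sketched as two small functions. Both names are hypothetical, and trimming whitespace before the emptiness check is an assumption; the commit only states that an empty env value maps to None and that the fallback uses `as_deref` + match.

```rust
/// Hypothetical: turn a raw env value into an optional endpoint.
/// An unset variable and an empty value both become `None`.
pub fn endpoint_from_env(raw: Option<&str>) -> Option<String> {
    match raw.map(str::trim) {
        None | Some("") => None, // assumption: whitespace-only counts as empty
        Some(v) => Some(v.to_string()),
    }
}

/// Hypothetical: pick the OCR endpoint, falling back to the LLM endpoint.
pub fn resolve_endpoint<'a>(ocr: &'a Option<String>, llm_endpoint: &'a str) -> &'a str {
    match ocr.as_deref() {
        Some(endpoint) => endpoint,
        None => llm_endpoint,
    }
}
```

Representing "not configured" as `None` instead of a `""` sentinel means the fallback is decided in one place, by the type, rather than by every caller remembering to test for an empty string.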

cargo test -p kebab-parse-image: 28 pass + 1 ignored.
cargo test -p kebab-config: 21 pass (+1 regression).
cargo clippy --workspace --all-targets -- -D warnings: pass.

The reviewer-suggested workspace.dependencies consolidation (reqwest /
base64) is left as a follow-up to be handled together with P6-3 and is
excluded from this PR's scope (stated in the round 1 review body).
2026-05-02 05:45:25 +00:00
4ed5536c92 feat(kebab-parse-image): P6-2 OCR adapter — Ollama-vision default
- Added new module `crates/kebab-parse-image/src/ocr.rs`: the spec's
  `OcrEngine` trait as-is + the default `OllamaVisionOcr`
  implementation + the `apply_ocr` helper.
- `OllamaVisionOcr`: non-streaming call to `<endpoint>/api/generate`,
  with the image passed via the `images: [base64]` field and a prompt
  that includes the language hint + the whitelisted language list. The
  response prose becomes `OcrText.joined`, wrapped in a single region
  (confidence 1.0) covering the whole prepared image.
  Default model `gemma4:e4b`. Falls back to `models.llm.endpoint` when
  the endpoint is empty.
- Image preprocessing: when the long edge exceeds
  `config.image.ocr.max_pixels` (default 1600, clamped to 256~4096), the
  image is re-encoded as PNG (image::imageops::resize, Triangle filter).
  PNG input within the max is a zero-copy passthrough.
- `apply_ocr` fills block.ocr with Some on OCR success and appends a
  ProvenanceKind::OcrApplied event. On failure, block.ocr stays None and
  no provenance is recorded (no partial-state leakage).
- `kebab-config`: new `ImageCfg.ocr: OcrCfg` block (enabled/engine/model
  /endpoint/languages/max_pixels). `#[serde(default)]` keeps pre-P6 TOML
  compatible. Added 5 `KEBAB_IMAGE_OCR_*` environment variables.
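
The "no partial-state leakage" rule for `apply_ocr` can be illustrated with a reduced sketch. The types here (`ImageBlock`, `Event`) and the closure-based engine are stand-ins invented for the example; the real function works with the `OcrEngine` trait and provenance events.

```rust
/// Stand-in for the image block; only the field relevant here.
#[derive(Debug, Default)]
pub struct ImageBlock {
    pub ocr: Option<String>,
}

/// Stand-in for a provenance event.
#[derive(Debug, PartialEq)]
pub enum Event {
    OcrApplied,
}

/// Hypothetical reduced `apply_ocr`: run the engine first, mutate only
/// after it has succeeded.
pub fn apply_ocr<F>(
    recognize: F,
    bytes: &[u8],
    block: &mut ImageBlock,
    events: &mut Vec<Event>,
) -> Result<(), String>
where
    F: Fn(&[u8]) -> Result<String, String>,
{
    // Early return on failure: neither `block` nor `events` was touched yet.
    let text = recognize(bytes)?;
    block.ocr = Some(text);
    events.push(Event::OcrApplied);
    Ok(())
}
```

Ordering the fallible call before any mutation is what keeps the block and the event log in agreement: either both reflect the OCR result or neither does.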

## Spec deviation

The original P6-2 spec designated Tesseract as the default OCR engine,
but the default was switched to Ollama-vision to avoid installing the
`libtesseract-dev` system package on dev / CI hosts. The `OcrEngine`
trait abstraction is preserved as specified; Tesseract / Apple Vision /
PaddleOCR adapters can later be added behind feature gates on the same
trait. See the 2026-05-02 entry in `tasks/HOTFIXES.md` for details.

On trust: a vision LM can hallucinate. The
`OcrText.engine = "ollama-vision"` field lets consumers branch on
per-engine trust.

## Tests

- New (`tests/ocr.rs`, 8 + 1 ignored):
  - 200 happy → OcrText decoding (joined / engine / engine_version /
    region count / bbox / confidence)
  - empty response → empty regions
  - 5xx → Err including status + body
  - 200 error envelope → Err
  - apply_ocr → block.ocr Some + one Provenance OcrApplied event
  - apply_ocr error → block.ocr stays None + no events recorded
  - 4000×3000 PNG → downscaled to max_pixels=1024, aspect ratio preserved
  - from_parts max_pixels clamp
  - opt-in `KEBAB_OCR_INTEGRATION=1` integration (transcription of
    "Hello World 2026" verified against a real Ollama at 192.168.0.47
    with `gemma4:e4b`)
- New (`src/ocr.rs` unit): truncate, build_prompt language/hint handling
- `kebab-config` tests +3: defaults, env override, pre-P6 TOML
  compatibility

Overall: `cargo test -p kebab-parse-image` 28 pass + 1 ignored,
`cargo test -p kebab-config` 20 pass,
`cargo clippy --workspace --all-targets -- -D warnings` pass.

contract: docs/superpowers/specs/2026-04-27-kebab-final-form-design.md
sections: §3.4 ImageRefBlock.ocr, §3.7a OcrText / OcrRegion, §9.1 OCR
vs caption provenance.
2026-05-02 05:38:24 +00:00
a4f895e8cc review(p6-1): round 3 cosmetic cleanup
- src/dims.rs: replaced the `map_err(...)` on `with_guessed_format()`
  with `.context()?`, matching the call style of round 2's
  `match Some/None` → `.context()?` cleanup.
- src/lib.rs: `(*format).to_string()` → `format.to_string()`. `format`
  is `&&'static str`, so the call auto-derefs without an explicit deref.
- tests/common/mod.rs: narrowed the visibility of
  `ImageFixture::workspace_root` / `config` from `pub` to
  module-private. External callers only use `ctx()` and never read the
  two fields directly.

cargo test -p kebab-parse-image: 16 pass.
cargo clippy -p kebab-parse-image --all-targets -- -D warnings: pass.
2026-05-02 05:19:13 +00:00
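58d56467e5 review(p6-1): address round 2 review feedback, GPS safety + debugging
- src/exif_extract.rs:
  • Added ±90 / ±180 range validation to `gps_decimal`. Abnormal EXIF
    (e.g. latitude 300°) is silently dropped instead of leaking onto
    the wire.
  • Coordinates missing GPSLatitudeRef / GPSLongitudeRef now return None
    instead of being emitted under a positive-sign assumption; an
    ambiguous sign is treated as corrupted metadata rather than passed
    through.
  • On `read_from_container` failure, the reason is logged in one
    `tracing::debug!` line (a clue for telling "no EXIF" from
    "corrupted EXIF" in operation).
- src/dims.rs: compressed `match Some/None` into
  `anyhow::Context::context()?`. One import line added.
- src/lib.rs: `Vec::with_capacity` now uses exactly `2` / `3` depending
  on the dim_warning branch, with a one-line comment on the meaning.
- tests/common/mod.rs: generalized `build_exif_blob_gps` with a
  `GpsFlavor` parameter (`Valid` / `NoRef` / `OutOfRange`). The JPEG
  splice logic is extracted into a `splice_exif_into_jpeg` helper.
- tests/extractor.rs: added 2 regression tests: coordinates with a
  missing `*Ref` are dropped, and out-of-range latitude is dropped
  (while verifying that longitude passes through normally).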
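
The two GPS rules above (no ref means no coordinate, and out-of-range values are dropped) can be sketched as follows. The signature is an assumption made for the example; the real `gps_decimal` reads EXIF fields rather than taking a plain array.

```rust
/// Hypothetical signature: convert a degrees/minutes/seconds triple plus
/// its hemisphere reference into signed decimal degrees.
pub fn gps_decimal(dms: [f64; 3], reference: Option<char>, is_latitude: bool) -> Option<f64> {
    let [deg, min, sec] = dms;
    let magnitude = deg + min / 60.0 + sec / 3600.0;
    // A missing ref is treated as corrupted metadata: no sign is guessed.
    let sign = match (reference?, is_latitude) {
        ('N', true) | ('E', false) => 1.0,
        ('S', true) | ('W', false) => -1.0,
        _ => return None, // ref letter does not belong to this axis
    };
    let limit = if is_latitude { 90.0 } else { 180.0 };
    let value = sign * magnitude;
    // Range check: latitude within ±90, longitude within ±180.
    if value.is_finite() && value.abs() <= limit {
        Some(value)
    } else {
        None
    }
}
```

For example, 37° 30' 0" N yields `Some(37.5)`, while a latitude of 300° or a coordinate without its ref yields `None`.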

cargo test -p kebab-parse-image: 16 tests (4 unit + 12 integration) pass.
cargo clippy -p kebab-parse-image --all-targets -- -D warnings: pass.
2026-05-02 05:16:37 +00:00
194dd34668 review(p6-1): address round 1 review feedback
- Cargo.toml: removed unused deps (`serde`, `thiserror`) + removed the
  duplicate `serde_json` declaration in dev-deps.
- src/lib.rs: renamed variable `decode_warning` → `dim_warning` (more
  accurate, since it also covers the 16k cap overflow branch).
- src/exif_extract.rs: removed the dead-flexibility `In` argument from
  `ascii_field` / `u32_field` (every call passed `In::PRIMARY`).
  Collapsed the two-level `if let` into a Rust 2024 let-chain. Unified
  the EXIF whitelist output keys to snake_case, following the workspace
  wire-schema convention (`Make` → `make`,
  `DateTimeOriginal` → `date_time_original`, etc.).
- tests/common/mod.rs: removed the never-called `fake_path` helper + the
  `Path` import.
- tests/extractor.rs: updated assertions to the snake_case keys.

cargo test -p kebab-parse-image: all 14 pass.
cargo clippy -p kebab-parse-image --all-targets -- -D warnings: pass.
2026-05-02 05:11:40 +00:00
d11a810119 feat(kebab-parse-image): P6-1 image extractor + EXIF whitelist
- Added new crate kebab-parse-image (the 19th in the workspace).
  Implements ImageExtractor, which converts MediaType::Image(_) assets
  into a single-block CanonicalDocument.
- parser_version "image-meta-v1" (§9 versioning).
- The body contains exactly one Block::ImageRef; the OCR / caption
  fields are left as None and are filled in by P6-2 / P6-3.
- EXIF whitelist (§9.1, minimizing the PII surface):
  Make / Model / Software / DateTimeOriginal / Orientation /
  GPSLatitude(+Ref) / GPSLongitude(+Ref). MakerNote / Thumbnail / other
  tags are discarded. DateTime is converted from EXIF
  "YYYY:MM:DD HH:MM:SS" to ISO-8601.
  GPS DMS triple + N/S/E/W ref → signed decimal degrees.
- Dimensions: only the header is read via image::ImageReader to obtain
  (w, h, format). Exceeding the 16k×16k cap or a decode failure →
  metadata.user.dimensions = null + a Provenance Warning event (not an
  Err). Failure to recognize the format itself → anyhow::Error
  (caller skips).
- SourceSpan::Region { 0, 0, w, h } marks the whole image area.
  Determinism: same bytes + same parser_version → same doc_id + block_id
  (uses the §4.2 ID recipe as-is).
- metadata.source_type = Reference, trust_level = Primary, lang = "und".
  title = file name without extension, alt = file name.
- Dependency boundaries (§8): kebab-core only + image 0.25 (default
  features off, png/jpeg/webp/gif/tiff only), kamadak-exif 0.6, anyhow /
  serde / serde_json / time / tracing / thiserror. No references to the
  kebab-source-fs · parse-md · store-* · embed* · llm* · rag · UI crates.
- 14 tests (4 unit + 10 integration):
  • PNG dimension extraction, JPEG EXIF GPS extraction (DMS → decimal
    conversion accurate to 1e-6), PNG without EXIF → empty map,
    corrupted PNG → warning + null dims (no panic), unrecognizable
    bytes → Err, determinism, snapshot, supports() matching, media_type
    mismatch rejection.
  • Fixtures are generated in memory (PNG via the image crate; EXIF JPEG
    by building an EXIF blob with the kamadak Writer and splicing it in
    as APP1 right after SOI), so no binary fixtures are committed.
- HEIC / RAW are out of scope for v1 per the spec (unsupported by the
  image crate; an Apple Vision sidecar is to fill this in later in P+).
- tasks/p6/p6-1-image-extractor-exif.md status: planned → completed.
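
The EXIF date conversion mentioned in the whitelist bullet can be sketched with the standard library alone. The function name is hypothetical, and the exact output shape (no UTC offset, since EXIF DateTimeOriginal carries none) is an assumption; the commit only says the value is converted to ISO-8601.

```rust
/// Hypothetical helper: "YYYY:MM:DD HH:MM:SS" -> "YYYY-MM-DDTHH:MM:SS".
/// Returns None when the input does not have the EXIF layout.
/// Note: this checks the layout only, not calendar validity (e.g. month 13).
pub fn exif_datetime_to_iso8601(raw: &str) -> Option<String> {
    let bytes = raw.as_bytes();
    if bytes.len() != 19 {
        return None;
    }
    for (i, &c) in bytes.iter().enumerate() {
        let ok = match i {
            4 | 7 | 13 | 16 => c == b':', // separators in date and time
            10 => c == b' ',              // gap between date and time
            _ => c.is_ascii_digit(),
        };
        if !ok {
            return None;
        }
    }
    // Every byte is ASCII at this point, so byte-index slicing is safe.
    Some(format!(
        "{}-{}-{}T{}",
        &raw[0..4],
        &raw[5..7],
        &raw[8..10],
        &raw[11..19]
    ))
}
```

Under these assumptions, `"2026:05:02 05:05:47"` becomes `"2026-05-02T05:05:47"`, and any string that deviates from the fixed layout is rejected rather than partially converted.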

contract: docs/superpowers/specs/2026-04-27-kebab-final-form-design.md
sections: §3.4 Block::ImageRef + ImageRefBlock, §3.7a OcrText /
ModelCaption stubs, §9.1 image extraction policy, §9 versioning.
2026-05-02 05:05:47 +00:00
f1a448d6dc refactor(rename): kb → kebab — binary, env vars, XDG paths, file renames
Second commit. Bulk rename of the user-facing surface (CLI binary, env
vars, XDG paths) plus the bare `kb` tokens in code (`KB_`, `kb.sqlite`,
`/kb/`, tracing target). Also 3 file renames:

- Design doc `2026-04-27-kb-final-form-design.md` →
  `2026-04-27-kebab-final-form-design.md`
- Initial report `kb_local_rust_report.md` → `kebab_local_rust_report.md`
- workspace ignore `.kbignore` → `.kebabignore`

## Changes

- `crates/kebab-cli/Cargo.toml`: `[[bin]] name = "kb"` → `"kebab"`.
- `crates/kebab-cli/src/main.rs`: `#[command(name = "kb", …)]` →
  `name = "kebab"`.
- All `KB_*` env vars (code + docs + tests) → `KEBAB_*`, covering the
  apply_env prefix match and all 30+ setting keys.
- XDG paths: `~/.config/kb` / `~/.local/share/kb` / `~/.cache/kb` /
  `~/.local/state/kb` → `~/.config/kebab` etc., in the config defaults,
  the expand_path tests, and the hardcoded values in paths.rs.
- SQLite filename: `kb.sqlite` → `kebab.sqlite` (both the `SQLITE_FILE`
  const and the hardcoded test values).
- tracing target: `target: "kb-*"` → `"kebab-*"` (10+ places).
- snapshot fixture: `.kbignore` → `.kebabignore` (updated
  `fixtures/source-fs/tree-1.snapshot.json`).

## Verification

- `cargo test --workspace -j 1` clean (run serially to avoid linker OOM).
- `cargo clippy --workspace --all-targets -- -D warnings` clean.

The docs sweep follows in the next commit.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 04:01:35 +00:00
911fb49550 refactor(rename): kb crates → kebab — Cargo packages, folders, Rust modules
First step of renaming the project from `kb` to `kebab`.

- workspace `Cargo.toml`: members `crates/kb-*` → `crates/kebab-*`,
  repository URL `altair823/kb` → `altair823/kebab`.
- 18 crate folders renamed via `git mv` (history preserved).
- Each crate's `Cargo.toml`: `name = "kb-*"` → `"kebab-*"`, path deps
  `../kb-*` → `../kebab-*`.
- All `.rs` files: the 18 `kb_<id>` snake-case module paths (`kb_core`,
  `kb_config`, `kb_app`, `kb_cli`, `kb_eval`, `kb_search`, `kb_chunk`,
  `kb_normalize`, `kb_source_fs`, `kb_parse_md`, `kb_parse_types`,
  `kb_store_sqlite`, `kb_store_vector`, `kb_embed`, `kb_embed_local`,
  `kb_llm`, `kb_llm_local`, `kb_rag`) → `kebab_<id>` via bulk sed (using
  the word boundary `\b` so the "kb" abbreviation inside English
  sentences is left untouched).

The CLI binary name (`[[bin]] name = "kb"`), the `KB_*` environment
variables, XDG paths, tracing targets, and the docs sweep follow in the
next commit.

## Verification

- `cargo check --workspace` clean; committed after every crate built
  successfully.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 03:28:08 +00:00
ee1f2339dd fix(p5-2): apply push-time review items — citation/refusal correctness + nits
Applied 4 should-fix items + 5 nits from the two reviewers before
pushing.

## should-fix

- `citation_coverage`: stopped an empty citations[] from leaking 1.0
  through the vacuous truth of `Iterator::all`; changed to
  `!is_empty() && all(non-empty path)`. Also removed the dead
  `_store: &SqliteStore` argument from the signature (call sites + test
  helpers cleaned up).
- `refusal_correctness`: in a lexical-only run, cases with
  `answer == None` no longer increment the denominator (NaN/null
  output); treating them as automatic failures distorted the meaning of
  the metric. Added new unit test
  `refusal_correctness_nan_for_non_rag_run`.
- `groundedness`: goldens with
  `must_contain.is_empty() && forbidden.is_empty()` are excluded from
  the denominator, so unconfigured entries do not get a free 1.0.
  Added new unit test `groundedness_skips_unconfigured_goldens`.
- Corrected a factual error in the `kb-cli/Cargo.toml` rationale
  comment: kb-eval depends on kb-app, not the other way around.
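
The vacuous-truth issue behind the `citation_coverage` fix is easy to reproduce: `Iterator::all` returns true on an empty iterator. A minimal sketch, where `Citation` and `is_covered` are names made up for the example:

```rust
/// Stand-in citation record; only the field the check looks at.
pub struct Citation {
    pub path: String,
}

/// Hypothetical per-answer check: covered only if there is at least one
/// citation and every citation carries a non-empty path.
pub fn is_covered(citations: &[Citation]) -> bool {
    // Without the `!is_empty()` guard, an empty slice would count as
    // covered, because `all` over zero elements is true.
    !citations.is_empty() && citations.iter().all(|c| !c.path.is_empty())
}
```

With the guard in place, an answer that cites nothing scores as uncovered instead of silently contributing a perfect 1.0 to the aggregate.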

## nits

- Duplicate `KB_EVAL_GOLDEN` / `DEFAULT_GOLDEN_PATH`: unified as
  `pub(crate)` in `metrics::`, imported by `runner`.
- `{:?}` formatting of `ComparisonKind` in `render_report_md` → an
  explicit lowercase mapping function
  (`win`/`loss`/`draw`/`regression`), consistent with the JSON
  serialization convention.
- Defensive comment on `extract_chunker_version` about the risk of a
  silent `None == None` match.
- The `let mut` suppress hack in the `delta_null_when_either_nan` test →
  cleaned up with struct update syntax.
- Removed dead code: the `empty_store` test helper + the repeated
  `mem::forget(tmp)` calls.

## Additional spec doc

Added 4 items to the deviations section of
`tasks/p5/p5-2-metrics-compare.md`:

- crate-level `kb-app` dep of `kb-eval`: inherited from P5-1; the new
  module surface does not import it.
- Weakened resolver for `citation_coverage`: waiting on
  `document_exists_by_path`.
- `refusal_correctness` is NaN for non-RAG runs.
- `groundedness` skips goldens without checks.

## Verification

- `cargo test -p kb-eval` 35/35 (18 unit + 2 loader + 8 integration + 7
  runner; 3 new unit tests).
- `cargo clippy --workspace --all-targets -- -D warnings` clean.
- `compare_report_snapshot_matches_fixture` passes unchanged; the new
  behavior does not affect the snapshot input (lexical-only, no
  must_contain, no should-refuse).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 03:17:32 +00:00
d9a5b88d27 feat(p5-2): kb-eval metrics + compare — AggregateMetrics, CompareReport, kb eval CLI
P5-2 implementation. On top of the stored eval_runs /
eval_query_results:

- `kb_eval::metrics`: computes hit@k / MRR / recall@k_doc /
  citation_coverage / groundedness / empty_result_rate /
  refusal_correctness. NaN metrics (zero denominator) serialize as JSON
  null. Values are rounded to 4 decimals, and Deserialize was added so
  aggregate_json round-trips.
- `kb_eval::compare`: compares two runs → CompareReport (per-metric Δ +
  per-query Win/Loss/Draw/Regression). On chunker_version drift it falls
  back gracefully to doc-id matching
  (chunker_version_match: "fallback_doc"); with the `strict` option it
  refuses.
- `render_report_md`: human-readable Markdown (aggregates +
  Wins/Losses/Regressions tables).
- Added `SqliteStore::{load_eval_run, load_eval_query_results,
  update_eval_run_aggregate}` + owned `EvalRunRecord` /
  `EvalQueryResultRecord`; the write-side borrow shape is unchanged.
- `kb eval` CLI: `run` (delegates to P5-1), `aggregate <id>`,
  `compare <a> <b> [--strict-chunker-version] [--write-report]`.
  `--json` emits the raw CompareReport; the default output is Markdown.
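
The NaN-to-null and 4-decimal rounding rules for metrics can be sketched without any JSON library. All three function names are hypothetical; the real code presumably goes through serde, which is not shown here.

```rust
/// Hypothetical: a ratio metric that is NaN when nothing was measurable.
pub fn ratio(numerator: u32, denominator: u32) -> f64 {
    if denominator == 0 {
        f64::NAN
    } else {
        f64::from(numerator) / f64::from(denominator)
    }
}

/// Hypothetical: wire form of a metric. NaN becomes None (JSON null),
/// everything else is rounded to 4 decimals.
pub fn to_wire(value: f64) -> Option<f64> {
    if value.is_nan() {
        None
    } else {
        Some((value * 10_000.0).round() / 10_000.0)
    }
}

/// Hypothetical: render the wire form as a JSON scalar.
pub fn render_json(value: f64) -> String {
    match to_wire(value) {
        None => "null".to_string(),
        Some(v) => format!("{v}"),
    }
}
```

Keeping "not measurable" distinct from 0.0 matters for comparisons between runs: a null metric can be skipped when computing deltas, whereas a fake 0.0 would show up as a regression.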

## Spec deviations (intentional, stated in the doc)

- Graceful matching is doc-id-only
  (chunker_version_match: "fallback_doc"). The 50% span overlap rule is
  deferred to P6+, because keeping both sides' chunks around after a
  chunker re-index is not realistic.
- Added `*_with_config` helpers: integration tests drive them with a
  TempDir Config. The no-arg forms delegate via Config::load(None).
- The CLI wires kb-cli → kb-eval directly (to avoid a kb-app cycle). The
  DoD's "via kb-app" was meant to keep a single facade, but it would
  create a cycle.
- Added `AggregateMetrics: Deserialize` for the aggregate_json
  round-trip.

## Verification

- `cargo test -p kb-eval` 30/30 (15 unit + 2 loader + 8 metrics+compare
  integration + 7 runner). 1 of the 8 integration tests is a snapshot
  (`compare-1.json`).
- `cargo test -p kb-store-sqlite` 33/33.
- `cargo clippy --workspace --all-targets -- -D warnings` clean.
- No forbidden imports (`kb-source-fs|kb-parse|kb-normalize|kb-chunk|
  kb-store-vector|kb-embed|kb-search|kb-llm|kb-rag|kb-tui|kb-desktop|
  kb-app`; kb-app is absent from the metrics/compare modules and is used
  by the runner only).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 03:05:13 +00:00
e6ff9c412c fix(p5-1): apply deferred review items — App reuse + expand_path hoist + nits
- kb-app: promote App to pub, add open_with_config / search / ask methods
  so kb-eval (and future TUI) can amortize embedder + vector store + LLM
  cold-start across many queries on one App instance. Memoization is
  per-instance via OnceLock; *_with_config free functions delegate.
- kb-config: add canonical expand_path helper + 8 unit tests; drop the
  4 duplicate copies in kb-store-sqlite, kb-store-vector, kb-embed-local,
  kb-eval (net: -6 duplicate tests, +8 canonical tests).
- kb-eval: extract elapsed_ms_u32 helper, drop redundant tracing debug
  log (with_context already names path on error), replace dead-port :1
  test with bind-then-release ephemeral port.
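
The per-instance memoization mentioned in the first bullet can be sketched with `std::sync::OnceLock`. This is an illustration only: the `App` below is a stand-in with a string in place of the real embedder, and the `cold_starts` counter exists purely to make the behavior observable.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Stand-in for the real App: one lazily built, expensive component.
pub struct App {
    embedder: OnceLock<String>,
    pub cold_starts: AtomicUsize,
}

impl App {
    pub fn new() -> Self {
        App {
            embedder: OnceLock::new(),
            cold_starts: AtomicUsize::new(0),
        }
    }

    /// Builds the component on first use; later calls reuse it.
    fn embedder(&self) -> &str {
        self.embedder.get_or_init(|| {
            self.cold_starts.fetch_add(1, Ordering::SeqCst);
            "embedder-handle".to_string() // placeholder for a costly load
        })
    }

    /// Hypothetical query method that needs the component.
    pub fn search(&self, query: &str) -> String {
        format!("{}:{}", self.embedder(), query)
    }
}
```

Because the `OnceLock` lives inside the instance, running many queries on one `App` pays the cold-start cost once, while a fresh `App` per query (the pre-fix behavior described in the deferred items) pays it every time.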

Verified: cargo clippy --workspace --all-targets -D warnings clean,
all crate tests green (kb-app 12+3 ignored, kb-eval 11, kb-config 17,
kb-store-sqlite 33, kb-store-vector 7+8 AVX-gated, kb-embed-local 7+7).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 18:55:23 +00:00
58a11cc2b8 feat(p5-1): kb-eval crate — golden-fixture runner + eval persistence
- new kb-eval crate: load_golden_set (YAML) + run_eval (per-query search/ask + persistence)
- new kb-store-sqlite::eval module: record_eval_run_with_results (transactional), document_exists / chunk_exists probes
- fixtures/golden_queries.yaml: 5-entry KO+EN template
- tests: 13 pass (loader: parse, dup-id, missing chunk_id; runner: elapsed, snapshot, error capture, JSONL, determinism, persistence, config_snapshot)
- per_query.jsonl mirror written to runs_dir/<run_id>/
- temperature=0 + fixed seed → byte-identical per_query.jsonl (lexical)

deviations from spec (documented in code):
- run_id uses uuid::Uuid::now_v7().simple() (timestamp-ordered hex) instead of ULID — uuid already in workspace deps
- load_golden_set_validated kept #[cfg(test)] pub(crate) — production inlines validate_against_db
- snapshot fixture uses normalized projection (id/query/mode/first_hit) — full byte-determinism covered by separate test
- index_version in config_snapshot left null (composed per call by kb-app, not config-level)

deferred to follow-up:
- App reuse across queries (currently rebuilds App per query)
- expand_path hoist to kb-config (3 crate clones now)
- --max-queries flag (deferred to P5-2 per updated spec)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 18:01:09 +00:00
9dde01eb9f fix(rag): normalize RRF fusion_score to [0,1] + log post-merge hotfixes
## Bug

config.rag.score_gate default 0.05 was incompatible with hybrid RRF
fusion_score: raw RRF tops out at num_retrievers / (k_rrf + 1) ≈
0.0328 at the default k_rrf=60, so every hybrid `kb ask` tripped
ScoreGate refusal even when the top hit was perfectly aligned across
both retrievers. Symptomatic on the post-P4-3 manual smoke at
/tmp/kb-smoke/ pointed at 192.168.0.47 Ollama:

    $ kb ask "Rust ownership 모델의 핵심 규칙은 뭐야?" --mode hybrid
    근거 부족. KB에 해당 내용 없음.        # top fusion_score = 0.0164

Per-mode score_gate (lexical_score_gate / vector_score_gate /
hybrid_score_gate) was rejected because it forces every consumer
(CLI, eval, TUI) to know which mode picks which threshold. Score
normalization solves it at the source.

## Fix

crates/kb-search/src/hybrid.rs divides every fused score by
2 / (k_rrf + 1), the theoretical RRF maximum with two retrievers
each contributing rank 1. After normalization:

- both retrievers agree on rank 1 → fusion_score = 1.0
- only one retriever finds the chunk → caps near 0.5
- typical mixed ranks → falls between 0 and 0.5

RRF's rank-ordering invariants are preserved (every score divides
by the same positive constant), so sort + tiebreak behaviour is
unchanged. Wire schema label `fusion_score` keeps its slot in
RetrievalDetail; only the magnitude shifts, and only for hybrid
mode (lexical / vector were already in [0, 1]).
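
The normalization described above can be written out as a short function. This is a sketch under stated assumptions: the name `fusion_score`, the 1-based ranks, and the `Option` encoding for "retriever did not return this chunk" are illustrative, not taken from hybrid.rs.

```rust
/// Hypothetical: normalized two-retriever RRF score.
/// Ranks are 1-based; `None` means that retriever did not return the chunk.
pub fn fusion_score(lexical_rank: Option<u32>, vector_rank: Option<u32>, k_rrf: f64) -> f64 {
    // Standard RRF contribution of one retriever: 1 / (k + rank).
    let term = |rank: Option<u32>| rank.map_or(0.0, |r| 1.0 / (k_rrf + f64::from(r)));
    let raw = term(lexical_rank) + term(vector_rank);
    // Theoretical maximum: both retrievers place the chunk at rank 1.
    let max = 2.0 / (k_rrf + 1.0);
    raw / max
}
```

With k_rrf = 60, a chunk ranked first by both retrievers scores 1.0, a chunk ranked first by only one scores 0.5, and ranks 1 and 2 give (1/61 + 1/62) / (2/61) = 123/124 ≈ 0.9919, matching the value in the updated unit test. Since every score is divided by the same positive constant, the ordering is the same as with raw RRF.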

Verification: re-ran the four-scenario smoke at /tmp/kb-smoke/ with
default score_gate = 0.05 — all four (Korean→Korean, English→
English, cross-language Korean↔English, out-of-corpus) succeed
with the expected grounded / refusal classification, top
fusion_score now ≈ 0.5.

## Tests

One unit test (rrf_formula_matches_known_value) updated to expect
the normalized value `(1/61 + 1/62) / (2/61) ≈ 0.9919` instead of
the raw `1/61 + 1/62 ≈ 0.0325`. The integration snapshot fixture
crates/kb-search/tests/fixtures/search/hybrid/run-1.json already
used presence checks (fusion_score_positive: true) rather than
absolute values, so it doesn't need regeneration. Workspace 319
tests pass; clippy clean across both feature configs.

## Docs

This commit also adds tasks/HOTFIXES.md as a dated post-merge log
covering this fix and the two earlier --config-flag regressions
(P3-5 hotfix #20 across ingest/search/list/inspect/doctor; P4-3
follow-up #24 for kb ask). Original task specs in tasks/p<N>/
*.md stay frozen as the historical contract; HOTFIXES.md is the
live source of truth for post-merge deltas. Each affected task
spec gets a "Risks/notes" addendum pointing back to HOTFIXES.md
so a reader landing on the spec sees the active behaviour:

- tasks/INDEX.md gains a "Post-merge 핫픽스" section.
- tasks/phase-3-vector-hybrid.md updates the RRF formula text to
  show the normalized form.
- tasks/p3/p3-4-hybrid-fusion.md "Behavior contract" RRF bullet
  notes the normalization and reason.
- tasks/p3/p3-5-app-wiring.md "Risks/notes" notes the --config
  fix.
- tasks/p4/p4-3-rag-pipeline.md "Risks/notes" notes the kb-ask
  --config fix and the score_gate-RRF incompatibility (closed by
  the normalization in p3-4).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 16:16:01 +00:00
ed8bf87c65 fix(cli): honor --config flag in kb ask (P4-3 follow-up)
The earlier P3-5 hotfix wired --config through ingest / search / list /
inspect / doctor by switching kb-cli to call the *_with_config
companions. P4-3 added the ask body but kb-cli's Cmd::Ask arm still
called bare kb_app::ask(query, opts) — same bug as before, ask
silently fell back to ~/.config/kb/config.toml regardless of what the
user passed.

Caught during the post-P4-3 smoke against /tmp/kb-smoke/ pointed at
192.168.0.47 Ollama with gemma4:26b: the answer's wire JSON reported
`model.id = "qwen2.5:14b-instruct"` (the user's XDG default) instead
of `gemma4:26b` from the explicit --config, plus the score_gate /
data_dir / model fields all reflected XDG defaults. After this fix
the same invocation correctly returns model.id=gemma4:26b,
embedding=multilingual-e5-small (from the smoke config), grounded=true
with `[#2]` citation pointing at rust/ownership.md.

Same minimal pattern as the P3-5 hotfix:
- Build the Config once via Config::load(cli.config.as_deref()).
- Call kb_app::ask_with_config(cfg, query, opts) instead of
  kb_app::ask(query, opts).

Workspace 319 tests pass; cargo clippy --workspace --all-targets --
-D warnings clean.

Smoke verified across four scenarios:
- Korean→Korean-body lookup: grounded with rust/ownership.md citation.
- English→Korean-body cross-language: grounded with arch/rag-
  architecture.md citation.
- Korean→English-body cross-language: grounded with arch/embeddings.md
  citation.
- Out-of-corpus query: LlmSelfJudge refusal with "근거가 부족하다."

Out of scope (filed for follow-up):
- config.rag.score_gate default 0.05 is incompatible with hybrid
  RRF scores. RRF top score is bounded by 2/k_rrf (≈0.033 at k_rrf
  =60), so the spec default trips ScoreGate on every hybrid query.
  Workaround: lower the gate to 0.005 in the user's config; long-
  term fix needs either per-mode gate config or RRF score
  normalization to [0,1]. Tracked separately.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 15:56:57 +00:00
e35b06d0d0 feat(p4-3): kb-rag crate — full RAG pipeline + kb-app::ask wired
P4 terminal task. Implements the user-facing payoff: retrieve →
score gate → pack → render → generate → cite-validate → persist.
After this commit, `kb ask` actually works against an Ollama
backend; the pipeline grounds the answer in retrieved chunks and
refuses cleanly when the gate trips or the model self-judges.

New crate kb-rag:
- pub struct RagPipeline { retriever, llm, docs, config }: all
  Arc<dyn Trait + Send + Sync>, so the pipeline is Send + Sync.
- pub fn ask(query, opts) -> Result<Answer> drives the nine-stage
  flow per spec §1.
- pub struct AskOpts { k, explain, mode, temperature, seed,
  stream_sink: Option<mpsc::Sender<String>> }. k acts as a floor
  over config.search.default_k so a low-k caller can't starve
  retrieval (documented in field doc).

Pipeline stages:
1. Retrieve via the injected dyn Retriever.
2. Score gate: empty hits → NoChunks refusal (no LLM call); top-1 <
   config.rag.score_gate → ScoreGate refusal (no LLM call) with
   top-3 candidates listed in the synthesized answer text.
3. Pack: budget = config.rag.max_context_tokens.saturating_sub
   (prompt overhead). Per-hit `[#n] doc=… heading=… span=…\n<text>`
   with deterministic enumeration. If every hit's chunk is
   unfetchable from the store (deleted between search and pack),
   fall back to NoChunks refusal with a tracing::warn rather than
   feeding an empty [근거] to the LLM.
4. Render rag-v1 prompt with the spec's verbatim Korean system
   string + `[질문]/[근거]` user template.
5. Generate via dyn LanguageModel. Single-thread token loop owns
   the iterator; tokens optionally forward to opts.stream_sink (a
   `mpsc::Sender<String>`). SendError silently dropped — caller
   cancellation never panics the pipeline. After Done the loop
   reads (acc, finish_reason, usage) in lockstep with no race.
   max_completion = llm.context_tokens().saturating_sub
   (used_for_input).max(64) — explicitly NOT capped by
   config.rag.max_context_tokens (that's the packing budget for
   [근거], not the LM completion ceiling).
6. Citation extract via STRICT regex `\[#(\d{1,3})\]` (compiled
   once via OnceLock). Loose forms `[1]`, `[ #1 ]`, `[#foo]`,
   `[#1234]`, `vec![1]` are all rejected to prevent prose
   false-positives.
7. Citation validation covers five cases:
   - unknown marker (e.g. `[#7]` when only 3 packed) →
     LlmSelfJudge refusal.
   - empty answer with hits → LlmSelfJudge.
   - non-empty + no marker + matches `근거 (가|이) 부족` regex →
     LlmSelfJudge (model self-refused with the canonical phrase;
     phrase match logged via tracing::debug for observability).
   - non-empty + no marker + no refusal phrase → LlmSelfJudge
     (silent ungrounded answers are still refusals).
   - non-empty + ≥1 valid marker → grounded = true.
8. Build Answer per kb_core::Answer shape:
   - citations: filter packed list to exactly the markers cited.
     Wire format `marker: Some("[1]")` (square-bracketed bare
     index) per design §2.3, distinct from the prompt-side
     `[#n]` grammar.
   - embedding ModelRef: read from config.models.embedding for
     Vector/Hybrid; None for Lexical. Documented deviation since
     the Retriever trait doesn't expose the embedder. For
     ScoreGate/NoChunks refusals on Vector/Hybrid the embedding
     model is still recorded — the vector retriever WAS consulted
     even when the gate tripped.
   - TraceId minted as `ret_<8-hex>` from blake3(query, top_score,
     model_id, ns).
   - retrieval AnswerRetrievalSummary populated.
   - usage from the final Done chunk; latency_ms wall-clock
     fallback when the LLM reports zero.
   - created_at OffsetDateTime::now_utc().
9. Persist via SqliteStore::put_answer (new inherent method on
   SqliteStore, not on the DocumentStore trait — answers aren't
   documents and adding to kb-core was forbidden). Always inserts,
   refusals included. packed_chunks_json is null unless
   opts.explain == true.
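
Illustration of steps 6 and 7, as a minimal std-only sketch rather
than the shipped code. The pipeline compiles the regex
`\[#(\d{1,3})\]` once via OnceLock; the scanner below is a hand-rolled
stand-in that accepts the same ASCII forms. `Verdict`,
`extract_markers`, and `validate` are hypothetical names, and the
1-based marker numbering (valid range 1..=packed) is an assumption.

```rust
use std::collections::BTreeSet;

/// Hypothetical outcome type: either the set of cited markers or a refusal.
#[derive(Debug, PartialEq)]
enum Verdict {
    Grounded(BTreeSet<u32>),
    LlmSelfJudge,
}

/// Std-only stand-in for `\[#(\d{1,3})\]` (ASCII digits only).
fn extract_markers(answer: &str) -> Vec<u32> {
    let bytes = answer.as_bytes();
    let mut out = Vec::new();
    let mut i = 0;
    while i + 1 < bytes.len() {
        if bytes[i] == b'[' && bytes[i + 1] == b'#' {
            let start = i + 2;
            let mut end = start;
            // Consume at most three digits.
            while end < bytes.len() && end - start < 3 && bytes[end].is_ascii_digit() {
                end += 1;
            }
            // Require at least one digit and an immediate closing bracket.
            if end > start && end < bytes.len() && bytes[end] == b']' {
                out.push(answer[start..end].parse().unwrap());
                i = end + 1;
                continue;
            }
        }
        i += 1;
    }
    out
}

/// All no-marker cases (empty answer, refusal phrase, silent ungrounded
/// answer) and any unknown marker collapse to LlmSelfJudge.
fn validate(answer: &str, packed: u32) -> Verdict {
    let markers: BTreeSet<u32> = extract_markers(answer).into_iter().collect();
    if markers.is_empty() {
        return Verdict::LlmSelfJudge;
    }
    if markers.iter().any(|&m| m == 0 || m > packed) {
        return Verdict::LlmSelfJudge;
    }
    Verdict::Grounded(markers)
}
```

With three packed chunks, `validate("claim [#7]", 3)` is a refusal and
`validate("claim [#2]", 3)` is grounded with marker 2.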

kb-store-sqlite extension:
- pub fn put_answer(&Answer, query, packed_chunks_json) ->
  Result<AnswerId>. Maps all 22 fields of the answers table per
  V001 schema in a single INSERT under a transaction.

kb-app::ask wired:
- bail!("not yet wired (P4-3)") replaced with a real body that
  builds the retriever per opts.mode (Lexical | Vector | Hybrid),
  instantiates OllamaLanguageModel from config, constructs
  RagPipeline, calls pipeline.ask. AskOpts moves to kb-rag and is
  re-exported via `pub use kb_rag::AskOpts` so kb-cli's
  `use kb_app::AskOpts` keeps working.
- kb-app/Cargo.toml gains kb-rag, kb-llm, kb-llm-local. P3-5's
  forbids on these are lifted by P4-3 spec — kb-app is the
  orchestrator and ask requires both the trait crate and the
  Ollama adapter.
- kb-cli/main.rs's AskOpts literal updated with stream_sink: None
  for the CLI path (TUI in P9 will plumb a real sink).

Tests (kb-rag: 18; kb-app: 1 ignored):
- 3 unit in src/pipeline.rs: marker regex strictness (rejects all
  loose forms with byte-equal expectations), Send+Sync compile
  check, embedding_ref_for behavior across modes.
- 15 integration in tests/pipeline.rs covering every spec test row
  + the new "all chunks unfetchable falls back to NoChunks" guard:
  empty-hits, score-gate, grounded happy path, unknown-marker,
  prose-`[1]` rejection, `vec![1]` rejection, refusal-phrase,
  packing-budget overflow, streaming-forwards-to-mpsc, dropped-
  receiver-no-panic, usage-from-final-Done, answers-row-inserted-
  for-each-refusal-kind, determinism temp=0 seed=0, Answer JSON
  shape, unfetchable-chunks-fall-back-to-no-chunks (the new
  M3 test).
- kb-app/tests/ask_smoke.rs: 1 #[ignore]'d real-Ollama smoke that
  drives the wired ask end-to-end against `localhost:11434`.

Workspace: 319 passed / 26 ignored / 0 failed. cargo clippy
--workspace --all-targets -- -D warnings clean.

Allowed deps respected (kb-core, kb-config, kb-search, kb-llm,
kb-store-sqlite, serde, serde_json, regex, time, tracing,
thiserror) plus forced waivers anyhow (Retriever / LanguageModel
trait return types) and blake3 (TraceId minting). Forbidden
(kb-source-fs, kb-parse-md, kb-normalize, kb-chunk, kb-store-
vector direct, kb-embed* direct, kb-llm-local direct, kb-tui,
kb-desktop) all absent from `cargo tree -p kb-rag` — concrete
adapters reach the pipeline only through trait objects.

Out of scope: reranker between retrieve and pack (P+), multi-turn
chat memory (P+), LLM-as-judge eval (P5 uses rule-based
must_contain), --json streaming (buffers per §0 Q5 hybrid).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 15:06:10 +00:00
3e38a9bcb4 feat(p4-2): kb-llm-local crate — Ollama HTTP adapter (reqwest::blocking)
First real LanguageModel implementation. Wraps Ollama's local HTTP
API at POST {endpoint}/api/generate with stream:true, parses the
NDJSON streaming response into TokenChunk events, and maps Ollama
error states to a thiserror-derived LlmError with actionable hints.
Synchronous trait surface; reqwest::blocking handles the HTTP I/O.

Public surface:
- pub struct OllamaLanguageModel
- pub fn new(config: &Config) -> Result<Self> — lazy connect; never
  hits the network. Spec line 96.
- pub enum LlmError { Unreachable, ModelNotPulled, Timeout, Stream,
  Malformed }. Lives in this crate per spec — kb-core / kb-llm stay
  free of error taxonomy.
- impl kb_core::LanguageModel via re-export from kb-llm.

Streaming:
- POST body shape per spec §11.2: model, prompt = system + "\n\n" +
  user, stream: true, options { temperature, seed, num_ctx, stop }.
- OllamaStream owns BufReader<reqwest::blocking::Response>, reads
  NDJSON lines via read_until(b'\n'), parses each as
  {response, done, done_reason?, prompt_eval_count?, eval_count?,
  total_duration?}. Token frame → TokenChunk::Token; done frame →
  TokenChunk::Done { finish_reason, usage }.
- done_reason mapping: "length" → Length, "abort" → Aborted,
  "stop" / missing / unknown → Stop (forward-compat with future
  Ollama tags).
- Missing prompt_eval_count / eval_count default to 0 + tracing::warn
  (do NOT fail). Spec line 135.
- EOF without a done line synthesizes Done { Aborted, zeros } so
  downstream pipelines never deadlock waiting for a terminal frame.
- UTF-8: line-delimited framing means each JSON line is a complete
  UTF-8 sequence — no cross-HTTP-chunk codepoint splits to worry
  about. read_until accumulates whole lines regardless of how the
  underlying reqwest body chunks.
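
Illustration of the done_reason mapping and the EOF guard, as a
std-only sketch. `FinishReason` and `TokenChunk` here are simplified
stand-ins for the kb-core types (the real Done carries a usage
struct), and `with_terminal_done` is a hypothetical helper that works
on an already-parsed frame list instead of a live NDJSON reader.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum FinishReason {
    Stop,
    Length,
    Aborted,
}

#[derive(Debug, Clone, PartialEq)]
enum TokenChunk {
    Token(String),
    Done {
        finish_reason: FinishReason,
        prompt_tokens: u32,
        completion_tokens: u32,
    },
}

/// "length" and "abort" map to their variants; "stop", a missing field,
/// and any unknown tag all fall back to Stop.
fn map_done_reason(reason: Option<&str>) -> FinishReason {
    match reason {
        Some("length") => FinishReason::Length,
        Some("abort") => FinishReason::Aborted,
        _ => FinishReason::Stop,
    }
}

/// Pass frames through up to the first Done. If the input ends without
/// one, append Done { Aborted, zeros } so consumers always see a
/// terminal frame.
fn with_terminal_done(frames: Vec<TokenChunk>) -> Vec<TokenChunk> {
    let mut out = Vec::new();
    for frame in frames {
        let is_done = matches!(frame, TokenChunk::Done { .. });
        out.push(frame);
        if is_done {
            return out;
        }
    }
    out.push(TokenChunk::Done {
        finish_reason: FinishReason::Aborted,
        prompt_tokens: 0,
        completion_tokens: 0,
    });
    out
}
```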

Error mapping (LlmError):
- reqwest::Error::is_connect() → Unreachable { endpoint, source }
  with hint "ensure `ollama serve` is running and reachable at
  <endpoint>".
- reqwest::Error::is_timeout() → Timeout.
- 200 with non-NDJSON first line (e.g., transparent-proxy HTML
  error page) → Stream(truncated body) — distinguished from
  Malformed by the iterator's has_emitted flag.
- 404 with body containing model_id (case-insensitive) OR English
  "model" + "not found" → ModelNotPulled(model_id) with hint
  "ollama pull <model_id>". Tightened beyond spec to survive
  Ollama localizing the error message (Korean / Japanese / etc.)
  while keeping the original English-substring fallback.
- Other 4xx/5xx → Stream(truncated body).
- Mid-stream JSON parse failure (after at least one valid line) →
  Malformed(line). Truncate all error bodies to 512 chars
  (chars-based, multibyte safe) so an nginx 500 page can't blow up
  the diagnostic.
- Trailing slash in endpoint stripped before formatting the URL —
  endpoint = "http://x:1234/" produces .../api/generate, not
  .../api//generate. Pinned by trailing-slash test.
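
Illustration of three of the rules above, as a std-only sketch. The
function names are hypothetical, the 404 heuristic is a paraphrase of
the description rather than the crate's code, and the over-cap marker
that the real truncate_body test name hints at is omitted here.

```rust
/// Cap for diagnostic bodies, counted in chars (not bytes).
const MAX_BODY_CHARS: usize = 512;

/// Truncate on char boundaries so multibyte text is never split.
fn truncate_body(body: &str) -> String {
    body.chars().take(MAX_BODY_CHARS).collect()
}

/// Strip trailing slashes before appending the path, so
/// "http://x:1234/" does not yield ".../api//generate".
fn generate_url(endpoint: &str) -> String {
    format!("{}/api/generate", endpoint.trim_end_matches('/'))
}

/// 404 plus either the model id (case-insensitive) or the English
/// "model" + "not found" substrings counts as ModelNotPulled.
fn is_model_not_pulled(status: u16, body: &str, model_id: &str) -> bool {
    if status != 404 {
        return false;
    }
    let body = body.to_lowercase();
    body.contains(&model_id.to_lowercase())
        || (body.contains("model") && body.contains("not found"))
}
```

Counting chars instead of bytes means a 600-character Korean body is
cut to exactly 512 characters and stays valid UTF-8.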

Tokio note: reqwest 0.12's blocking feature internally wraps a
private current-thread tokio runtime, so cargo tree --edges normal
shows tokio. The auditable invariant is "no top-level tokio dep +
no async surface exposed to callers" — verified: src/ has zero
async/await/tokio::*. default-features = false drops default-tls
(rustls only) but does NOT drop tokio. Documented honestly in
Cargo.toml + lib.rs. Switching to ureq would remove tokio
entirely; deferred since reqwest is the spec's allowed dep.

Tests (24 total: 23 default + 1 ignored):
- 7 unit in src/ollama.rs: prompt-build, options-build, finish-
  reason mapping, truncate_body bounds (under_cap / over_cap_marker
  / multibyte_chars_not_bytes), 404+model-id heuristic.
- 3 in tests/construction.rs: ModelRef shape, context_tokens
  passthrough, lazy-connect proven via port-1 pointing.
- 13 in tests/streaming.rs: streamed tokens then Done, multibyte
  chars within a line round-trip (renamed from "split across
  chunks" to honestly reflect what's tested), Unreachable-with-
  hint, 4xx→Stream, 404→ModelNotPulled, concat-equals-canned,
  done_reason length / abort, missing eval counts default to zero,
  missing done_reason defaults to Stop, determinism-by-mock,
  trailing-slash endpoint, non-NDJSON 200 body → Stream not
  Malformed.
- 1 #[ignore] in tests/integration.rs: real Ollama on
  localhost:11434 with the configured model. Opt-in via
  cargo test -p kb-llm-local -- --ignored after `ollama serve`
  + `ollama pull`.

Workspace: 288 passed / 25 ignored / 0 failed. cargo clippy
--workspace --all-targets -- -D warnings clean. No native-tls,
no openssl in the dep graph.

Allowed deps respected: kb-core, kb-config, kb-llm, reqwest 0.12
(default-features=false; blocking, json, rustls-tls), serde,
serde_json, tracing, thiserror plus anyhow (forced by trait return
type). wiremock + tokio in [dev-dependencies] only.

Out of scope: llama.cpp / candle adapters (P+), Ollama embed
endpoint (separate adapter inside kb-embed-local if requested),
cancellation / abort tokens (P+), connection-pool tuning.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 14:28:34 +00:00
27c669fbf9 feat(p4-1): kb-llm crate — LanguageModel trait re-export + MockLanguageModel
Establishes the kb-llm trait crate so concrete LLM adapters (p4-2
Ollama, future llama.cpp / candle) target a stable surface. Pure re-
export of kb_core::{LanguageModel, GenerateRequest, TokenChunk,
FinishReason, TokenUsage, ModelRef} plus a feature-gated deterministic
mock for downstream RAG tests (p4-3) that need an LLM trait object
without an Ollama dependency.

MockLanguageModel (cfg(feature = "mock"), default OFF):
- Holds canned_response + canned_finish + canned_usage + (model_id,
  provider, context_tokens). Pure in-memory; no I/O.
- generate_stream() honors GenerateRequest.stop: scans every non-empty
  stop string against the canned response, takes the earliest byte
  position (Iterator::min returns the first equal element on ties so
  declaration order in req.stop wins), truncates with a direct byte-
  slice (str::find returns a UTF-8 char boundary by contract).
- When a stop matches, finish_reason is overridden to Stop (matches
  OpenAI / Ollama real-world behaviour); otherwise the caller's
  canned_finish passes through verbatim.
- Emits one TokenChunk::Token per Unicode scalar value (char), NOT per
  grapheme cluster — Hangul jamo, emoji ZWJ sequences, combining
  marks split. Acceptable for trait-shape testing; real adapters MAY
  combine. Documented in module docs.
- Always terminates with TokenChunk::Done { finish_reason, usage } even
  if the canned response is empty. The returned iterator is a boxed
  Vec<TokenChunk>::into_iter().map(Ok), trivially Send.
- Real adapters MAY return Err from generate_stream itself (e.g.
  connection refused) before any chunk is yielded; the mock never does.
  Documented for the trait re-exporter consumer audience.
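
Illustration of the stop handling and per-char emission, as a
std-only sketch. `truncate_at_stop` and `tokens` are hypothetical
free functions; the mock does this inside generate_stream().

```rust
/// Return the canned text cut at the earliest stop match, plus a flag
/// saying whether any stop matched (which forces finish_reason = Stop).
fn truncate_at_stop<'a>(canned: &'a str, stops: &[&str]) -> (&'a str, bool) {
    let earliest = stops
        .iter()
        .filter(|s| !s.is_empty()) // empty stop strings are ignored
        .filter_map(|s| canned.find(*s)) // byte offset of each match
        .min(); // on ties, Iterator::min keeps the first element
    match earliest {
        // str::find returns a char boundary, so a byte slice is safe.
        Some(pos) => (&canned[..pos], true),
        None => (canned, false),
    }
}

/// One token per Unicode scalar value, not per grapheme cluster.
fn tokens(text: &str) -> Vec<String> {
    text.chars().map(|c| c.to_string()).collect()
}
```

For "hello STOP world END" with stops ["END", "STOP"], the cut lands
at "STOP" because it has the smaller byte offset, regardless of the
order the stops were declared in.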

Helpers:
- assert_finish_chunk(chunks) — asserts the last chunk is a Done.
  Useful for proptests asserting trait contract over random inputs.

Tests:
- cargo test -p kb-llm (no features): 2 reexport / dyn-dispatch tests.
- cargo test -p kb-llm --features mock: 9 tests including 100-case
  proptest over random Unicode strings asserting Done terminator,
  char-count == streamed Token chunks, concat == canned (truncated by
  stop), plus explicit cases for stop-string truncation, first-stop-
  match precedence, model_ref dimensions=None invariant, finish reason
  pass-through.
- All 271 workspace tests pass; clippy clean for both default and
  mock-on feature configurations.

Symbol gating verified:
- cargo build --release -p kb-llm (default): nm shows zero
  MockLanguageModel symbols.
- cargo build --release -p kb-llm --features mock: three trait-impl
  symbols present. Spec invariant "release builds MUST NOT include
  MockLanguageModel" enforced at the symbol level.

Allowed deps respected: only kb-core (path) and anyhow (workspace,
forced by trait return type). Dropped kb-config / serde / thiserror /
tracing from the spec's allowed list — they are listed as Allowed but
nothing in this skeleton crate references them, and dropping them
keeps the dependency graph slim for downstream consumers. p4-2/p4-3
will add what they need at their own dep sites.

Forbidden deps (reqwest, ureq, tokio, whisper-rs, kb-source-fs,
kb-parse-md, kb-normalize, kb-chunk, kb-store-*, kb-embed*, kb-search,
kb-rag, kb-tui, kb-desktop) all absent from cargo tree -p kb-llm.

Out of scope: real adapter (p4-2 Ollama), token counting against the
real tokenizer, server-side cancellation / abort signals (P+).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 13:37:46 +00:00
a08f61a242 fix(cli): honor --config flag + improve search output legibility
Two issues surfaced during the post-P3-5 manual smoke test against a
six-document workspace:

1. --config flag was silently ignored. kb-cli read cli.config only
   while building SourceScope inside the Ingest arm, then called
   kb_app::ingest(scope, summary_only) which internally re-loads
   Config::load(None) — falling back to ~/.config/kb/config.toml
   regardless of what the user passed. Same pattern in search,
   list, inspect, doctor. Users had to rely on KB_* env vars to
   point at a non-default config.

2. Search output collapsed RRF hybrid scores to "0.02" because
   `{:.2}` rounded the (0, 0.033]-bounded fused score, and
   chunks from the same document showed up as identical lines
   ("3. 0.02  arch/rag-architecture.md") since heading_path was
   never printed.

Fix:

- kb-app: doctor/ingest/search/list/inspect already had
  *_with_config(Config, ...) seams introduced for integration tests
  (#[doc(hidden)] pub). Repurpose them as the official "config-explicit"
  API — kb-cli now builds the Config once via
  Config::load(cli.config.as_deref()) at the top of every subcommand
  and threads it into the *_with_config variant. Module doc-comment
  updated to reflect three callers (CLI --config, integration tests,
  TUI session) instead of "test-only seam".
- kb-app: doctor() rewritten as doctor_with_config_path(Option<&Path>)
  that respects an explicit path. config_loaded probe now reports the
  actual path checked, returning a clear hard error if --config points
  at a non-existent or malformed file (defaults would silently mask
  user intent). data_dir_writable resolves storage.data_dir from the
  loaded config (with env overrides applied via Config::apply_env) so
  --config users see their custom paths reflected. Original doctor()
  signature kept as a None-passing wrapper.
- kb-cli: ingest/search/list/inspect/doctor each call the
  *_with_config* companion. Search printer switches to {:.4} score
  formatting (RRF hybrid range bounded by ~2/k_rrf ≈ 0.033 at k_rrf=60
  default) and appends `> head1 / head2` when heading_path is non-
  empty so chunks from the same document are visually distinguishable.
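
Illustration of the printer change, as a std-only sketch. `format_hit`
is a hypothetical helper; the layout follows the sample lines quoted
in this message, not the actual kb-cli source.

```rust
/// Render one search hit: rank, 4-decimal score, doc path, and the
/// heading path when there is one.
fn format_hit(rank: usize, score: f32, doc_path: &str, heading_path: &[&str]) -> String {
    let mut line = format!("{rank}. {score:.4}  {doc_path}");
    if !heading_path.is_empty() {
        line.push_str(" > ");
        line.push_str(&heading_path.join(" / "));
    }
    line
}
```

At k_rrf=60 the single-list scores 1/61 and 1/62 both print as "0.02"
under `{:.2}`, while `{:.4}` gives "0.0164" and "0.0161".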

Verified manually:
- `kb --config /tmp/kb-smoke/config.toml doctor` reports the
  custom config path + custom data_dir, not the XDG defaults.
- `kb --config /tmp/kb-smoke/config.toml search "..." --mode hybrid`
  returns hits with distinct 4-digit scores and heading paths
  ("rust/ownership.md > Rust 소유권 모델 / Borrow checker").

Workspace 269 passed / 24 ignored / 0 failed; cargo clippy
--workspace --all-targets -- -D warnings clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 12:46:37 +00:00
17d52461b2 feat(p3-5): wire kb-app facade — ingest / search / list / inspect
Replaces the P0 `bail!("not yet wired")` stubs in kb-app with real
bodies that compose the libraries shipped through P3-4. After this
commit, `cargo run -p kb-cli -- index` actually walks the workspace
and persists chunks (SQLite + optionally LanceDB), and
`cargo run -p kb-cli -- search --mode {lexical,vector,hybrid}` returns
real SearchHits with citations. `kb-app::ask` stays stubbed; P4-3
owns it.

App lifecycle (crates/kb-app/src/app.rs):
- Internal pub(crate) struct App holds the Config plus
  Arc<SqliteStore> eagerly, with embedder + LanceVectorStore behind
  OnceLock<Arc<...>> for memoization. First call pays the ~470MB
  fastembed init / Lance open; subsequent calls return the cached
  Arc::clone. OnceLock::set race losers fall back to get().cloned()
  so the lazy-init is concurrent-safe.
- One-shot CLI invocations pay the cost once at most. The P9 TUI
  (which holds an App for the session) gets memoization for free.
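
Illustration of the memoization pattern, as a std-only sketch. The
`Embedder` struct, the `loads` counter, and `load_embedder` are
hypothetical stand-ins; the real App caches the fastembed embedder
and the LanceVectorStore the same way.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, OnceLock};

/// Stand-in for an expensive-to-build component.
struct Embedder {
    dims: usize,
}

struct App {
    embedder: OnceLock<Arc<Embedder>>,
    loads: AtomicUsize, // counts how often the expensive init ran
}

impl App {
    fn new() -> Self {
        App { embedder: OnceLock::new(), loads: AtomicUsize::new(0) }
    }

    /// Fallible, expensive construction (placeholder body).
    fn load_embedder(&self) -> Result<Embedder, String> {
        self.loads.fetch_add(1, Ordering::SeqCst);
        Ok(Embedder { dims: 384 })
    }

    /// First call builds and caches; later calls clone the cached Arc.
    fn embedder(&self) -> Result<Arc<Embedder>, String> {
        if let Some(cached) = self.embedder.get() {
            return Ok(Arc::clone(cached));
        }
        let fresh = Arc::new(self.load_embedder()?);
        match self.embedder.set(Arc::clone(&fresh)) {
            Ok(()) => Ok(fresh),
            // Another thread won the race: use its value, drop ours.
            Err(_) => Ok(self
                .embedder
                .get()
                .cloned()
                .expect("set failed, so a value is present")),
        }
    }
}
```

The get/set pair is used instead of get_or_init because the
initializer is fallible and the error has to propagate to the caller.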

ingest pipeline (lib.rs):
- FsSourceConnector::scan(&scope) → per asset:
  parse_frontmatter → parse_blocks → build_canonical_document →
  MdHeadingV1Chunker.chunk → put_asset_with_bytes → put_document →
  put_blocks → put_chunks. One transaction per document per design
  §5.8 (kb-store-sqlite's put_* methods own the transactions).
- When provider != "none" and dimensions > 0: build embedder once,
  embed each doc's chunks as Document kind, ensure_table once at the
  top of the run, then upsert the VectorRecord batch. Lexical-only
  config (provider == "none") skips both — verified by
  ingest_provider_none_skips_lance test.
- Per-asset parse failures recorded as IngestItemKind::Error with
  the warning attached; the run continues. Only structural failures
  (DB unreachable etc.) abort.
- Aggregate counts (assets_scanned / new / updated / skipped /
  errors / chunks_indexed / embeddings_indexed / duration_ms) flow
  into both the JobRepo progress_json AND a dedicated ingest_runs
  row written via SqliteStore::record_ingest_run (new
  pub helper added to kb-store-sqlite, see below).
  summary_only=true writes items_json=NULL but still populates the
  count columns.

search dispatch:
- SearchMode::Lexical → LexicalRetriever directly.
- SearchMode::Vector → VectorRetriever with embedder + LanceVectorStore.
- SearchMode::Hybrid → HybridRetriever composing the two.
- Vector / Hybrid with provider=none returns a clear error naming the
  config key to flip ("models.embedding.provider").

list_docs / inspect_doc / inspect_chunk delegate straight to
DocumentStore trait methods. Returns Err with actionable message on
not-found.

Test seam: each public free function has a matching
#[doc(hidden)] pub fn *_with_config(cfg, ...) companion that
integration tests invoke directly (the public form internally calls
load_config()). pub(crate) would not reach across the integration-
tests crate boundary; #[doc(hidden)] keeps it out of rustdoc and the
function comment flags it as test-only.

kb-store-sqlite additions:
- pub struct IngestRunRow + pub fn record_ingest_run on SqliteStore
  for the kb-app aggregate-counts persistence path. Helper writes
  the ingest_runs row directly with all aggregate columns; jobs
  table still gets a JobRepo create/update_progress/finish trio in
  parallel.

Tests (11 default, 2 #[ignore] AVX-gated):
- ingest_lexical: round-trip, idempotent, summary_only_drops_items,
  provider_none_skips_lance (asserts no .lance dir on disk),
  records_ingest_runs_row_with_aggregate_counts, tags_any filter,
  inspect_doc_not_found, inspect_chunk_not_found.
- search_lexical: lexical hits with embedding_model=None,
  empty_query_returns_empty, vector_mode_with_provider_none returns
  clear error.
- search_vector: hybrid mode end-to-end (#[ignore], AVX), Vector
  mode embedding_model assertion (#[ignore], AVX). Both run on the
  AVX VM in ~21s combined (first run pays the model download).
- TestEnv pins workspace.root + storage.{data_dir,model_dir} to a
  TempDir so tests don't touch the user's $HOME/.local/share.
- Fixture workspace at crates/kb-app/tests/fixtures/workspace/ has
  three small markdown files with varied frontmatter (rust+cargo+
  python tags) so the tags_any filter test exercises a non-trivial
  predicate.

Workspace 269 passed / 24 ignored / 0 failed (was 261/22). cargo
clippy --workspace --all-targets -- -D warnings clean. CLI smoke
verified manually: `cargo run -p kb-cli -- index` returns a real
IngestReport JSON; `cargo run -p kb-cli -- search "..."` returns
hits with citations; `cargo run -p kb-cli -- list docs` lists the
indexed documents.

Allowed deps respected: kb-source-fs, kb-parse-md, kb-parse-types,
kb-normalize, kb-chunk, kb-store-sqlite, kb-search, kb-store-vector,
kb-embed, kb-embed-local plus existing tracing / anyhow / serde /
toml / dirs and now blake3 (run_id) + time. Forbidden (kb-llm*,
kb-rag, kb-tui, kb-desktop, kb-parse-{pdf,image,audio}) absent from
cargo tree -p kb-app.

Out of scope per spec: ask body (P4-3), --rebuild-fts wiring,
--resume checkpointing (P+), --watch (P+), TUI / desktop integration
(P9 consumes this facade).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 12:11:21 +00:00
ccd49ef546 feat(p3-4): hybrid-fusion — VectorRetriever + HybridRetriever (RRF)
Composes the existing LexicalRetriever (P2-2) with a new VectorRetriever
wrapper around LanceVectorStore (P3-3) into a single Retriever that
dispatches by SearchMode. For SearchMode::Hybrid, fuses lexical and
vector candidates via Reciprocal Rank Fusion and populates the full
RetrievalDetail per SearchHit so kb search --explain can attribute
scores back to each side.

Public surface (kb-search crate):
- pub struct VectorRetriever — Arc<dyn VectorStore + Send + Sync>,
  Arc<dyn Embedder>, Arc<SqliteStore>, IndexVersion at construction.
- pub struct HybridRetriever { lexical, vector, fusion, k }.
- pub enum FusionPolicy { Rrf { k_rrf: u32 } }.

VectorRetriever:
- Embeds query.text as EmbeddingKind::Query before delegating to
  VectorStore::search(query_vec, query.k * 2, &query.filters). Over-
  fetches by ×2 for filter losses; LanceVectorStore applies the
  filters internally so they propagate naturally.
- Hydrates each VectorHit into a full SearchHit by joining on
  chunk_id in a single IN-clause batch (no N+1): doc_path,
  section_label, chunker_version, source_spans for citation, plus
  embedding_model from embedder.model_id().
- Snippet trimmed to config.search.snippet_chars (vector mode lacks
  FTS5 highlighting; chunk text prefix is the next-best signal).
- Citation built from the chunk's first source span via the shared
  citation_helper module — extracted from lexical.rs so both
  retrievers compute citations identically (Byte/empty fallback to
  Line{1,1} preserved with tracing::warn).
- RetrievalDetail.method = Vector for standalone calls; both
  fusion_score and vector_score set to the LanceVectorStore-shifted
  cosine score; lexical_* None.

HybridRetriever:
- Lexical / Vector modes delegate 1:1 — no rebuild of RetrievalDetail.
- Hybrid mode runs both retrievers with k * 2 fanout, fuses with
  RRF (score(c) = Σ 1/(k_rrf + rank_m(c))), sorts fused-score DESC
  with deterministic tiebreaker (lex_rank ASC then chunk_id ASC),
  takes top query.k. Fusion math runs in f64 throughout; cast to
  f32 only at the SearchHit boundary where bounded magnitude (≤
  ~0.033 at k_rrf=60) makes f32 precision sufficient for ranking.
- Per-hit lexical preferred for snippet/citation/heading_path/
  chunker_version/embedding_model when the chunk appears in both
  retrievers — FTS5 highlighting is more user-relevant than vector's
  truncated text. Vector-only chunks fall through to vector hit data.
- index_version returns format!("hybrid:{}+{}", lex_iv, vec_iv) at
  construction; mismatched lex/vec versions trigger a tracing::warn
  so users notice stale indexes (spec line 143).
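
Illustration of the fusion step, as a std-only sketch over plain
chunk-id lists. `rrf_fuse` is a hypothetical function, ranks are
1-based, and treating a vector-only chunk as having the worst
possible lex_rank in the tiebreaker is an assumption.

```rust
use std::collections::HashMap;

/// Fuse two ranked id lists with score(c) = sum of 1/(k_rrf + rank).
/// Assumes each list holds distinct ids.
fn rrf_fuse(lexical: &[&str], vector: &[&str], k_rrf: u32, k: usize) -> Vec<(String, f32)> {
    // chunk_id -> (fused score in f64, lexical rank if present)
    let mut acc: HashMap<&str, (f64, Option<usize>)> = HashMap::new();
    for (i, id) in lexical.iter().enumerate() {
        let entry = acc.entry(*id).or_insert((0.0, None));
        entry.0 += 1.0 / (k_rrf as f64 + (i + 1) as f64);
        if entry.1.is_none() {
            entry.1 = Some(i + 1);
        }
    }
    for (i, id) in vector.iter().enumerate() {
        let entry = acc.entry(*id).or_insert((0.0, None));
        entry.0 += 1.0 / (k_rrf as f64 + (i + 1) as f64);
    }
    let mut fused: Vec<(&str, f64, usize)> = acc
        .into_iter()
        .map(|(id, (score, lex))| (id, score, lex.unwrap_or(usize::MAX)))
        .collect();
    // Score DESC, then lex_rank ASC, then chunk_id ASC.
    fused.sort_by(|a, b| {
        b.1.total_cmp(&a.1)
            .then(a.2.cmp(&b.2))
            .then(a.0.cmp(b.0))
    });
    // Cast to f32 only at the output boundary.
    fused
        .into_iter()
        .take(k)
        .map(|(id, score, _)| (id.to_string(), score as f32))
        .collect()
}
```

With lexical = [a, b] and vector = [b, c] at k_rrf=60, b scores
1/62 + 1/61 and ranks first, followed by a (1/61) and c (1/62).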

kb-search additions:
- citation_helper.rs — pub(crate) citation_from_first_span shared
  between lexical and vector retrievers. Extracted from lexical.rs;
  no behavior drift.

Tests (38 default + 3 ignored):
- 12 unit tests in hybrid.rs covering RRF math (1/61 + 1/62 within
  f32 epsilon × 10 tolerance), lexical/vector mode delegation, hybrid
  preserves single-side hits with the missing side's RetrievalDetail
  None, deterministic tiebreaker on identical fused scores, composite
  index_version, mismatched-version warn at construction.
- 2 unit tests in vector.rs covering the snippet-prefix and citation
  fallback paths.
- 11 unit tests in lexical.rs (unchanged from P2-2).
- 13 lexical integration tests (unchanged).
- 3 #[ignore] AVX-gated hybrid integration tests: disjoint-corpus
  recall (lex returns A,B; vec returns C,D; hybrid returns all 4),
  determinism over two queries, snapshot stability against
  tests/fixtures/search/hybrid/run-1.json. Snapshot fixture was
  regenerated against this branch on an AVX-enabled VM and contains
  4 real chunks (c1/c2 lex+vec, c3/c4 vec-only).
- KB_UPDATE_SNAPSHOTS=1 path now panics after writing instead of
  silently passing — matches the P3-2/P3-3 fail-loud-instead-of-
  silent-pass philosophy.

Allowed deps respected (kb-core, kb-config, kb-store-sqlite,
kb-store-vector, kb-embed, tracing, thiserror) plus pre-existing
kb-search deps from P2-2 (rusqlite, globset, serde_json, anyhow).
kb-embed-local does NOT appear — VectorRetriever takes Arc<dyn Embedder>
trait object; the concrete adapter is runtime-injected by kb-app.

Out of scope: reranker (P+), score calibration across modes (RRF is
rank-comparable so absolute calibration is P+), multimodal retrieval
(P6+).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 11:22:21 +00:00
dd42740cc0 test(p3-3): pin LanceDB snapshot fixture from AVX-capable host
Replaces the placeholder run-1.json with the captured Vec<VectorHit>
from `cargo test -p kb-store-vector --test snapshot -- --ignored` on
an AVX2-capable VM (host-passthrough CPU model). Verified by re-
running the same ignored lane and asserting against the pinned
fixture.

Full ignored lane on AVX hardware:
- upsert_search.rs: 8 / 8 pass (ensure_table idempotent, search-empty,
  upsert+search, dim-mismatch, tags filter, model isolation,
  determinism, crash-recovery promotes pending → committed).
- snapshot.rs: 1 / 1 pass against the pinned fixture.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 10:58:10 +00:00
3cd5117a7e feat(p3-3): kb-store-vector — LanceDB VectorStore + V003 embedding status
First VectorStore implementation. Per-model Lance tables under
config.storage.vector_dir, two-phase upsert (SQLite-pending → Lance
MergeInsert → SQLite-committed) with crash-safe retry, search via
cosine distance with the spec's score-shift (preserves negative
similarity ranking signal that clamping would crush).

V003 migration:
- Adds status (CHECK constraint pending|committed|tombstone, default
  pending) and vector_committed columns to embedding_records.
- BEFORE DELETE trigger on chunks flips dependent rows to tombstone.
  Currently overshadowed by V001's ON DELETE CASCADE FK; trigger UPDATE
  runs first then row vanishes via CASCADE. Spec-faithful tombstone
  preservation requires recreating embedding_records to drop the
  CASCADE — deferred to a P+ migration since no production rows exist
  yet (P3-3 is the first writer). V003 SQL comment explains.

LanceVectorStore:
- ensure_table is idempotent: opens existing or creates with the
  Arrow schema (chunk_id, doc_id, embedding FixedSizeList<Float32,
  dim>, model_id, embedding_version, text, heading_path, created_at).
- IndexId computed via id_for_index with collection="chunk_embeddings",
  index_kind="flat", params_hash = blake3(descriptor JSON). Schema
  bumps automatically rotate the IndexId.
- upsert: phase-1 INSERT OR REPLACE INTO embedding_records (status=
  'pending') in a single SQLite tx; phase-2 Lance MergeInsert keyed
  on chunk_id (idempotent re-run); phase-3 UPDATE status='committed',
  vector_committed=1. If phase-2 fails the rows stay 'pending' and
  the next upsert call retries idempotently.
- search joins embedding_records WHERE status='committed' so partial-
  write rows never surface. Cosine distance from Lance ∈ [0, 2] →
  similarity = 1 - distance ∈ [-1, 1] → score = (similarity + 1)/2 ∈
  [0, 1]. NaN coerced to 0 with tracing::warn. Filter by SearchFilters
  via SqliteStore::filter_chunks (added in this commit).
- Sync trait + async LanceDB bridged by an embedded current-thread
  tokio runtime. Doc-comment on the struct flags the "do NOT call
  from inside another tokio runtime" panic (block_on cannot nest).
  kb-app's job scheduler is sync today.
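
Illustration of the score shift, as a std-only sketch.
`score_from_distance` is a hypothetical name and the tracing::warn on
NaN is left out.

```rust
/// Map a Lance cosine distance in [0, 2] to a score in [0, 1].
fn score_from_distance(distance: f32) -> f32 {
    if distance.is_nan() {
        return 0.0; // NaN is coerced to 0
    }
    let similarity = 1.0 - distance; // in [-1, 1]
    (similarity + 1.0) / 2.0 // shifted into [0, 1]
}
```

Distances 1.2 and 1.8 (similarities -0.2 and -0.8) stay ordered after
the shift, whereas clamping similarity at 0 would map both to the
same score.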

kb-store-sqlite additions:
- pub fn put_embedding_records_pending(&[EmbeddingRecordRow]) — phase-1
  INSERT OR REPLACE (status='pending', vector_committed=0).
- pub fn mark_embedding_records_committed(&[EmbeddingId]) — phase-3
  single UPDATE … WHERE embedding_id IN (?, ?, …) via
  params_from_iter, guarded by WHERE status='pending' so tombstones
  don't get clobbered.
- pub fn filter_chunks(&[ChunkId], &SearchFilters) → Vec<ChunkId>
  consolidates the JOIN against documents/document_tags/
  embedding_records + path_glob via globset. Lets kb-store-vector
  honor SearchFilters without depending on rusqlite or globset
  directly. (kb-search's filter logic is structurally different —
  interleaved with the FTS5 SELECT — so it stays as-is for now;
  consolidation is a P+ refactor.)
- 4 new unit tests cover the phase-1 round-trip, empty batch,
  replay reset of pending rows, and the WHERE-status-pending guard.
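
Illustration of the status transitions, as an in-memory std-only
model. `EmbeddingRecords` and its HashMap are hypothetical; the real
helpers issue INSERT OR REPLACE and a guarded UPDATE against SQLite.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq)]
enum Status {
    Pending,
    Committed,
    Tombstone,
}

/// In-memory model of the embedding_records status column.
struct EmbeddingRecords {
    rows: HashMap<String, Status>,
}

impl EmbeddingRecords {
    fn new() -> Self {
        EmbeddingRecords { rows: HashMap::new() }
    }

    /// Phase 1: insert or replace as pending (a replay resets the row).
    fn put_pending(&mut self, ids: &[&str]) {
        for id in ids {
            self.rows.insert((*id).to_string(), Status::Pending);
        }
    }

    /// Phase 3: flip only rows that are still pending, so tombstones
    /// are never overwritten. Returns the number of rows updated.
    fn mark_committed(&mut self, ids: &[&str]) -> usize {
        let mut updated = 0;
        for id in ids {
            if let Some(status) = self.rows.get_mut(*id) {
                if *status == Status::Pending {
                    *status = Status::Committed;
                    updated += 1;
                }
            }
        }
        updated
    }

    /// Search only sees committed rows.
    fn searchable(&self, id: &str) -> bool {
        self.rows.get(id) == Some(&Status::Committed)
    }
}
```

If phase 2 (the Lance write) fails, mark_committed is never called,
the rows stay pending, and a later put_pending plus retry converges
to the same committed state.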

Tests:
- 9 lib unit tests in kb-store-vector covering paths/sanitization,
  arrow_batch dim validation + descriptor hash, bm25-style cosine
  score shift math.
- 4 new kb-store-sqlite unit tests on filter_chunks (committed-only,
  tags/lang/trust/path_glob, order preservation, empty input).
- 4 new kb-store-sqlite unit tests on the embedding_records helpers.
- 8 integration tests in upsert_search.rs and 1 snapshot test marked
  #[ignore = "requires AVX-capable hardware (LanceDB)"]. They invoke
  require_avx_or_panic() at the top of each body so a missing-AVX
  --ignored run fails loudly instead of silently passing. This dev
  host (qemu64 model) lacks AVX so these were NOT exercised end-to-
  end here — first CI lane on AVX hardware will validate them.
- Snapshot fixture tests/fixtures/vector/run-1.json is a placeholder
  with an _comment marker. Snapshot test panics until the placeholder
  is replaced via KB_UPDATE_SNAPSHOTS=1 on AVX hardware.
- Workspace 241 passed, 19 ignored, 0 failed; cargo clippy --workspace
  --all-targets -- -D warnings clean.

Allowed deps respected (kb-core, kb-config, kb-store-sqlite, lancedb,
arrow + arrow-array + arrow-schema, serde, serde_json, tracing,
thiserror) plus forced waivers — anyhow (trait return type), tokio
+ futures (LanceDB async-only API), blake3 (params_hash). rusqlite
and globset are NOT direct deps of kb-store-vector — confirmed via
cargo metadata --no-deps. rusqlite stays in [dev-dependencies] for
the test fixture seeder only.

Out of scope: IVF/PQ index tuning (P+), image vectors (P6), kb-app
embed_index orchestration (P3-5 facade).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 10:01:31 +00:00
bcbe2b8531 feat(p3-2): kb-embed-local crate — fastembed adapter for multilingual-e5-small
First real Embedder implementation. Wraps fastembed-rs (ONNX runtime)
with the e5 prefix convention, batching, and {data_dir}/${XDG_DATA_HOME}
template expansion so model files land under config.storage.model_dir/
fastembed/ without polluting kb-config's public API.

Public surface:
- pub struct FastembedEmbedder
- pub fn new(config: &Config) -> Result<Self>
- impl kb_core::Embedder (via kb-embed re-export)

Behavior:
- Default model multilingual-e5-small (384 dims). model_id and
  model_version come from config.models.embedding.{model,version}.
- Pre-load dim check via TextEmbedding::get_model_info: dim mismatch
  bails before paying the ~470MB ONNX init cost.
- e5 prefix applied BEFORE tokenization: "passage: " for
  EmbeddingKind::Document, "query: " for EmbeddingKind::Query. Pinned
  by prefix_input unit tests.
- Batches inputs into chunks of config.models.embedding.batch_size,
  concatenates results in input order.
- L2 normalization is performed by fastembed 4.9's default transformer
  pipeline (verified at fastembed/src/text_embedding/output.rs:43);
  we skip re-normalization. Integration test pins ‖v‖ ≈ 1.0 ± 1e-3 so
  a future fastembed bump that drops this invariant fails loudly.
- Synchronous (no async runtime). Mutex serializes calls into the
  underlying ONNX session — conservative; ORT Session is Send+Sync but
  callers (kb-app indexer) batch sequentially anyway. Revisit if
  profiling shows contention.
- First-run model download surfaces via tracing::info before/after
  TextEmbedding::try_new — users no longer stare at a silent 30-60s
  pause during the 470MB pull.
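
Illustration of the prefix and batching behavior, as a std-only
sketch. The `embed` closure stands in for the fastembed call,
`embed_batched` and `l2_norm` are hypothetical helpers, and
`EmbeddingKind` is redeclared locally instead of imported.

```rust
#[derive(Clone, Copy)]
enum EmbeddingKind {
    Document,
    Query,
}

/// Apply the e5 prefix before the text reaches the tokenizer.
fn prefix_input(kind: EmbeddingKind, text: &str) -> String {
    match kind {
        EmbeddingKind::Document => format!("passage: {text}"),
        EmbeddingKind::Query => format!("query: {text}"),
    }
}

/// Embed in batches of `batch_size`, concatenating in input order.
fn embed_batched<F>(inputs: &[String], batch_size: usize, mut embed: F) -> Vec<Vec<f32>>
where
    F: FnMut(&[String]) -> Vec<Vec<f32>>,
{
    let mut out = Vec::with_capacity(inputs.len());
    // chunks(0) would panic, so treat 0 as 1.
    for batch in inputs.chunks(batch_size.max(1)) {
        out.extend(embed(batch));
    }
    out
}

/// Euclidean norm, used to check the unit-length invariant.
fn l2_norm(v: &[f32]) -> f32 {
    v.iter().map(|x| x * x).sum::<f32>().sqrt()
}
```

Five inputs with batch_size 2 produce three calls into the embedder
(2 + 2 + 1), and the output vectors come back in the original order.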

Tests:
- 11 default-lane tests covering: check_dim match/mismatch (no model
  load), prefix_input Document/Query/empty, resolve_model
  known/unknown, expand_path substitution + no-op + XDG_DATA_HOME set
  + XDG_DATA_HOME unset (falls back to ~/.local/share with recursive
  ~ expansion). XDG tests serialize on a Mutex + RAII guard since
  edition 2024 makes set_var/remove_var unsafe.
- 7 #[ignore] integration tests covering: full construction with
  default config, dim-mismatch belt-and-braces, Document vs Query
  cosine differential, L2 unit norm, byte-equal determinism, batch-64
  performance under 5s, snapshot-hash stability over a 5-sentence
  multilingual fixture.
- Snapshot test fails LOUDLY when SNAPSHOT_HASH_BASELINE is 0 — prints
  the captured hash and panics with paste-back instructions, so first
  --ignored run forces the maintainer to pin the baseline rather than
  silently passing.
- Workspace: 222 tests pass (default lane); clippy clean.
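
The expand_path cases listed above (substitution, no-op, XDG_DATA_HOME set
and unset) could be exercised against a pure function like the one below.
This is a sketch under assumptions: the real expand_path signature is not
shown in this commit, so the parameters here are hypothetical, and the
environment values are passed in explicitly rather than read via std::env,
which sidesteps the set_var/remove_var serialization concern entirely.

```rust
/// Hypothetical signature: expand `{data_dir}`, `${XDG_DATA_HOME}`, and a
/// leading `~` in `template`. Environment values are arguments so the
/// function is pure and needs no env mutation in tests.
pub fn expand_path(
    template: &str,
    data_dir: &str,
    xdg_data_home: Option<&str>,
    home: &str,
) -> String {
    // Unset XDG_DATA_HOME falls back to ~/.local/share. Treating an empty
    // value like unset follows the XDG Base Directory spec (assumption:
    // the crate may or may not do the same).
    let xdg = match xdg_data_home {
        Some(value) if !value.is_empty() => value,
        _ => "~/.local/share",
    };
    let substituted = template
        .replace("{data_dir}", data_dir)
        .replace("${XDG_DATA_HOME}", xdg);
    // Tilde expansion runs last so a `~` introduced by the fallback (or by
    // `data_dir`) is expanded as well.
    expand_tilde(&substituted, home)
}

/// Expand a leading `~` or `~/` to `home`; anything else is returned as-is.
fn expand_tilde(path: &str, home: &str) -> String {
    if path == "~" {
        home.to_string()
    } else if let Some(rest) = path.strip_prefix("~/") {
        format!("{}/{}", home.trim_end_matches('/'), rest)
    } else {
        path.to_string()
    }
}
```

Passing the environment in as arguments is a design choice of this sketch;
the commit instead serializes the XDG tests on a Mutex with an RAII guard.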

Allowed deps respected: kb-config, kb-embed (re-exports kb-core
trait surface), fastembed = "4.9", tracing, anyhow. tokenizers and
ort enter transitively through fastembed; reqwest/hyper/hf-hub also
transitive (model download is fastembed's responsibility per spec
carve-out). No direct kb-core dep needed — re-exports cover it.

Pinned to fastembed 4.x rather than the recent 5.x to limit blast
radius; consider bumping once p3-3 (lancedb-store) consumes the
embedder output shape.

Out of scope: reranker (P+), Ollama embedding endpoint, candle
adapter, image embeddings (P6).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 08:39:38 +00:00
2e3eb8f437 feat(p3-1): kb-embed crate — Embedder trait re-export + MockEmbedder
Establishes the kb-embed trait crate so concrete embedding adapters
(p3-2 fastembed, future ollama-embed/candle) target a stable surface.
Pure re-export of kb_core::{Embedder, EmbeddingInput, EmbeddingKind,
EmbeddingModelId, EmbeddingVersion} plus a feature-gated deterministic
mock for downstream tests.

MockEmbedder (cfg(feature = "mock"), default OFF):
- Per-component hash recipe: blake3(seed_le8 || kind_byte ||
  text_len_le8 || text || i_le8). Length-prefixed text avoids the
  domain-separation ambiguity where two (text, i) pairs could shift
  bytes between text tail and the i field.
- Document = 0u8, Query = 1u8: the same text with a different kind
  yields different vectors (mirrors e5 prefix behaviour).
- Per component: blake3 first 8 bytes → u64 → reinterpret as i64 →
  f64/i64::MAX → f32. i64::MAX as f64 rounds up to 2^63, so i64::MIN
  maps to exactly -1.0; range [-1, 1] holds.
- L2 unit-normalised. The squared norm is accumulated in f64 (to avoid
  precision loss in f32) before the cast back to f32. A zero-norm
  guard skips the divide.
- with_seed(...) constructor lets two embedders share identity but
  produce different vectors — useful for downstream parametric tests.
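
The recipe above, minus the blake3 call itself, can be sketched with std
only. Assumptions are labelled: the function names are hypothetical, the
digest bytes are read as little-endian (the commit does not state the byte
order), and the caller is expected to supply the first 8 bytes of
blake3(preimage).

```rust
/// Build the per-component hash preimage:
/// seed_le8 || kind_byte || text_len_le8 || text || i_le8.
pub fn component_preimage(seed: u64, kind_byte: u8, text: &str, i: u64) -> Vec<u8> {
    let mut buf = Vec::with_capacity(8 + 1 + 8 + text.len() + 8);
    buf.extend_from_slice(&seed.to_le_bytes());
    buf.push(kind_byte); // 0u8 = Document, 1u8 = Query
    // Length-prefix the text so the boundary between `text` and the
    // trailing `i` field is explicit.
    buf.extend_from_slice(&(text.len() as u64).to_le_bytes());
    buf.extend_from_slice(text.as_bytes());
    buf.extend_from_slice(&i.to_le_bytes());
    buf
}

/// Map the first 8 digest bytes to a component in [-1, 1].
/// Assumption: little-endian interpretation of the bytes.
pub fn component_from_digest(first8: [u8; 8]) -> f32 {
    // The `as i64` cast reinterprets the bit pattern, it does not saturate.
    let raw = u64::from_le_bytes(first8) as i64;
    // `i64::MAX as f64` rounds to 2^63, so i64::MIN divides to exactly -1.0.
    (raw as f64 / i64::MAX as f64) as f32
}

/// L2-normalise in place, accumulating the squared norm in f64.
pub fn l2_normalize(v: &mut [f32]) {
    let norm = v
        .iter()
        .map(|&x| f64::from(x) * f64::from(x))
        .sum::<f64>()
        .sqrt();
    // Zero-norm guard: leave the vector untouched instead of dividing by 0.
    if norm == 0.0 {
        return;
    }
    for x in v.iter_mut() {
        *x = (f64::from(*x) / norm) as f32;
    }
}
```

A full mock would hash each preimage for i in 0..dims, feed the first 8
digest bytes to component_from_digest, and normalise the resulting vector.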

Helpers:
- assert_vector_shape(vecs, dims) — len + finite check.
- assert_unit_norm(vecs, tolerance) — caller-supplied tolerance;
  5e-4 documented as safe for dims=384 under f32 epsilon × √dims.
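
A rough sketch of what these two helpers might look like. Only the names
and the (vecs, dims) / (vecs, tolerance) argument order come from the commit
message; the concrete parameter types and panic messages are assumptions.

```rust
/// Panic unless every vector has exactly `dims` finite components.
/// Parameter types are assumed; the commit only names the arguments.
pub fn assert_vector_shape(vecs: &[Vec<f32>], dims: usize) {
    for (idx, v) in vecs.iter().enumerate() {
        assert_eq!(
            v.len(),
            dims,
            "vector {idx}: expected {dims} dims, got {}",
            v.len()
        );
        assert!(
            v.iter().all(|x| x.is_finite()),
            "vector {idx} contains NaN or infinity"
        );
    }
}

/// Panic unless every vector has an L2 norm within `tolerance` of 1.0.
/// The norm is accumulated in f64 so the check itself adds little error.
pub fn assert_unit_norm(vecs: &[Vec<f32>], tolerance: f32) {
    for (idx, v) in vecs.iter().enumerate() {
        let norm = v
            .iter()
            .map(|&x| f64::from(x) * f64::from(x))
            .sum::<f64>()
            .sqrt();
        assert!(
            (norm - 1.0).abs() <= f64::from(tolerance),
            "vector {idx}: norm {norm} is not within {tolerance} of 1.0"
        );
    }
}
```

A downstream test would call assert_vector_shape(&vecs, 384) followed by
assert_unit_norm(&vecs, 5e-4) on the embedder output.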

Tests:
- cargo test -p kb-embed (no features): 2 reexport/dyn-dispatch tests.
- cargo test -p kb-embed --features mock: 7 tests including 100-case
  proptest asserting len == dims, all finite, ‖v‖ ≈ 1.0 within
  tolerance, Doc(text) byte-equal Doc(text), Doc(text) ≠ Query(text),
  Doc(text1) ≠ Doc(text2).
- All 220 workspace tests pass; clippy clean for both default and
  mock-on feature configurations.

Symbol gating: nm on the release rlib confirms zero MockEmbedder
symbols under default features; three trait impl symbols under
--features mock. Spec invariant "release builds MUST NOT include
MockEmbedder" verified at the symbol level.

Allowed deps respected: kb-core, kb-config, serde, thiserror, tracing,
plus anyhow (forced by trait return type) and blake3 (justified by
the determinism contract; already in workspace lockfile via kb-core).
No fastembed/ort/tokenizers anywhere.

Out of scope: real adapter (p3-2), reranker traits (P+).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 08:15:44 +00:00