V007 → V009 업그레이드 시 기존 chunks 의 tokenized_korean_text 가
NULL — 첫 App::open_with_config 호출 시 자동으로 lindera ko-dic
으로 분해 후 UPDATE. chunks_au trigger 가 chunks_fts 를 자동 재-index.
사용자 재-ingest 불필요.
- crates/kebab-store-sqlite/src/store.rs:
backfill_tokenized_korean_text(progress_cb, tokenize) API. 1000 row 마다
commit + progress 콜백. idempotent (IS NULL 필터로 partial
completion 재실행 안전). tokenizer 를 파라미터로 받아 §8 dep 경계 유지.
- crates/kebab-app/src/app.rs::open_with_config: run_migrations 직후
backfill 호출. 실패 시 warn log 만 (App open 은 성공 — vector/hybrid
mode 계속 가능). 500 row 마다 info log progress.
- crates/kebab-store-sqlite/tests/fts.rs:
backfill_tokenized_korean_text_populates_nullable_rows 단위 test
(idempotency 포함).
- clippy pre-existing 오류 수정 (redundant_closure, map_unwrap_or,
cast_lossless, uninlined_format_args — kebab-app/ingest_log.rs,
pdf_ocr_apply.rs, app.rs, tests/ocr_inspect_smoke.rs).
Spec: docs/superpowers/specs/2026-05-28-v0.20.x-korean-morphological-tokenizer-spec.md §8.1, §8.2
Plan: docs/superpowers/plans/2026-05-28-v0.20.x-korean-morphological-tokenizer-plan.md (S4)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Two new wire schemas land as additive minor: ocr_stats.v1 (corpus-wide
aggregate — total_events, success_rate, p50/p90/p99/max_ms, by_engine,
top-10 by_doc by failure count) and ocr_failures.v1 (per-doc or
corpus-wide recent failures, with --doc-id + --limit). Both ship via
new CLI subcommands `kebab inspect ocr-stats` / `inspect ocr-failures`.
App gains four facade methods: inspect_ocr_stats /
inspect_ocr_failures plus their *_with_config companions — required by
CLAUDE.md "the facade rule" so `--config <path>` is honored. The CLI
dispatch arms thread cfg explicitly into the _with_config form.
Runtime introspection emit (WIRE_SCHEMAS in schema.rs) gains two
entries; the meta JSON Schema (schema.schema.json) is untouched
because its wire.schemas is pattern-based, not enum-based.
ingest_log::percentiles extended to (p50, p90, p99, max). p99 surfaces
only via inspect ocr-stats; IngestSummary (round 1) stays 3-percentile.
SKILL.md synced with the two new schemas (AC-13).
Closure r2 G2 (facade *_with_config pair) + G3 (runtime emit, not
meta schema file) + closure r1 F4 (p99) resolved.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>