Final commit. Bulk update of the word `kb` across all .md files.

- 19 crate names (`kb-core`, `kb-app`, …) → `kebab-*` (including the Rust module path form `kb_*` → `kebab_*`).
- Future components (`kb-tui`, `kb-desktop`, `kb-asr-whisper`, `kb-ocr`, `kb-mcp`, `kb-vlm`, `kb-rerank`, `kb-vision-ocr`, `kb-index`, `kb-smoke`, `kb-architecture`) → `kebab-*` (the same prefix will be used when P6+ starts).
- CLI command examples: `kb ingest` / `kb search` / `kb ask` / `kb init` / `kb doctor` / `kb inspect` / `kb list` / `kb eval` → `kebab <verb>`, in both fenced code blocks and inline backticks.
- XDG paths, env vars, and binary paths (`target/release/kb` → `target/release/kebab`) synchronized.
- All references unified across the design doc, initial report, SMOKE, HOTFIXES, phase epics, and task specs.
- The `git -c user.name=kb` in task-decomposition.md records author information from past git history, so it is left as-is (the author in the actual git history cannot be changed).
- `tasks/phase-5-evaluation.md`: `status: planned` → `completed` as well (not yet reflected after the P5-1 + P5-2 PRs were merged).

## Verification

- `grep -rEn "\bkb-[a-z]|\bkb_[a-z]|\.config/kb\b|kb\.sqlite|\bKB_[A-Z]" --include="*.md"` returns 0 hits (excluding the git author in task-decomposition.md).
- All file path references are live (renamed files are all updated to their new paths).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
101 lines
4.0 KiB
Markdown
---
phase: P2
component: kebab-store-sqlite (FTS5 migration)
task_id: p2-1
title: "FTS5 virtual table + triggers (V002 migration)"
status: completed
depends_on: [p1-6]
unblocks: [p2-2]
contract_source: ../../docs/superpowers/specs/2026-04-27-kebab-final-form-design.md
contract_sections: [§5.5 chunks_fts + triggers, §9 versioning]
---

# p2-1 — FTS5 virtual table + triggers

## Goal

Add the `chunks_fts` virtual table and three sync triggers via migration `V002__fts.sql`. Backfill existing chunks, if any.

## Why now / why this size

`chunks_fts` is the lexical index for `kebab-search`. Splitting it from p1-6 keeps P1 focused on relational data; shipping it as migration `V002` lets users upgrade an existing P1 DB without re-ingesting.

## Allowed dependencies

- `kebab-core`
- `kebab-config`
- `kebab-store-sqlite` (extends migrations)
- `rusqlite`
- `refinery`

## Forbidden dependencies

- `kebab-source-fs`, `kebab-parse-md`, `kebab-normalize`, `kebab-chunk`, `kebab-store-vector`, `kebab-embed*`, `kebab-search` (consumer is p2-2), `kebab-llm*`, `kebab-rag`, `kebab-tui`, `kebab-desktop`

## Inputs

| input | type | source |
|-------|------|--------|
| existing `chunks` rows | SQLite | from p1-6 |
| migration runner | `refinery` | from p1-6 |

## Outputs

| output | type | downstream |
|--------|------|------------|
| `chunks_fts` virtual table populated | SQLite | p2-2 lexical retriever |
| three triggers synced with `chunks` | SQLite | every later chunk write |

## Public surface (signatures only — no new types)

```rust
pub fn rebuild_chunks_fts(conn: &rusqlite::Connection) -> anyhow::Result<()>;
```

(Used by `kebab index --rebuild-fts`. Runs `DELETE FROM chunks_fts;` and then re-runs `INSERT INTO chunks_fts SELECT ... FROM chunks`.)

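The delete-then-reinsert behaviour described above can be sketched as follows. This is a Python stand-in (stdlib `sqlite3`, assuming the underlying SQLite build includes FTS5) for the Rust function, not its implementation. The `chunks` columns are inferred from the backfill statement in this spec, and the FTS5 table is assumed to store its own content, because a table declared with `content=''` does not accept a plain `DELETE` by default. Design §5.5 remains the authoritative schema.

```python
import sqlite3

# Hypothetical minimal schema: column names are inferred from the backfill
# statement in this spec; design section 5.5 is authoritative.
# No triggers here, so the index can drift and the rebuild has work to do.
SCHEMA = """
CREATE TABLE chunks (
    chunk_id          TEXT PRIMARY KEY,
    doc_id            TEXT NOT NULL,
    heading_path_json TEXT NOT NULL,
    text              TEXT NOT NULL
);
CREATE VIRTUAL TABLE chunks_fts USING fts5(
    chunk_id UNINDEXED,
    doc_id UNINDEXED,
    heading_path,
    text,
    tokenize = 'unicode61 remove_diacritics 2'
);
"""


def rebuild_chunks_fts(conn):
    """Python analogue of the Rust signature: full delete, then re-insert."""
    with conn:  # both statements commit (or roll back) together
        conn.execute("DELETE FROM chunks_fts")
        conn.execute(
            "INSERT INTO chunks_fts(chunk_id, doc_id, heading_path, text) "
            "SELECT chunk_id, doc_id, heading_path_json, text FROM chunks"
        )


def fts_rows(conn):
    """Return the indexed rows in a stable order for comparison."""
    return conn.execute(
        "SELECT chunk_id, doc_id, heading_path, text "
        "FROM chunks_fts ORDER BY chunk_id"
    ).fetchall()


def demo():
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO chunks VALUES (?, ?, ?, ?)",
        [
            ("c1", "d1", '["Intro"]', "first chunk"),
            ("c2", "d1", '["Intro", "Details"]', "second chunk"),
        ],
    )
    # Simulate drift: a stale index row with no matching `chunks` row.
    conn.execute(
        "INSERT INTO chunks_fts(chunk_id, doc_id, heading_path, text) "
        "VALUES ('stale', 'd0', '[]', 'orphaned row')"
    )
    conn.commit()

    rebuild_chunks_fts(conn)
    first = fts_rows(conn)
    rebuild_chunks_fts(conn)  # running it again must not change anything
    second = fts_rows(conn)
    return first, second


if __name__ == "__main__":
    print(demo())
```

Running the rebuild twice and comparing the two snapshots is one way to check the idempotency requirement without depending on FTS5 internals.
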
## Behavior contract

- Migration file `migrations/V002__fts.sql` ships exactly the SQL in design §5.5 (FTS5 virtual table with `unicode61 remove_diacritics 2` tokenizer + `chunks_ai` / `chunks_ad` / `chunks_au` triggers).
- On migration apply, backfill: `INSERT INTO chunks_fts(chunk_id, doc_id, heading_path, text) SELECT chunk_id, doc_id, heading_path_json, text FROM chunks;`.
- `rebuild_chunks_fts` is idempotent: full delete then re-insert from `chunks`.
- Triggers ensure that every future `INSERT`/`UPDATE`/`DELETE` on `chunks` keeps `chunks_fts` in sync within the same transaction.
- `chunks_fts` row count must equal `chunks` row count after any successful migration / rebuild.

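The contract above can be exercised end to end with a small script. The SQL below is only an assumed shape for `V002__fts.sql`: the trigger names, tokenizer, and backfill statement come from this spec, but the trigger bodies and the `chunks` column types are illustrative guesses, and the sketch uses a plain FTS5 table that stores its own content. Design §5.5 holds the authoritative SQL. Python's stdlib `sqlite3` is used so the sketch runs without the crate, assuming the SQLite build includes FTS5.

```python
import sqlite3

# Hypothetical stand-in for the P1 `chunks` table; the real schema comes from p1-6.
P1_SCHEMA = """
CREATE TABLE chunks (
    chunk_id          TEXT PRIMARY KEY,
    doc_id            TEXT NOT NULL,
    heading_path_json TEXT NOT NULL,
    text              TEXT NOT NULL
);
"""

# Assumed shape of V002__fts.sql; trigger bodies are illustrative guesses.
V002_SKETCH = """
CREATE VIRTUAL TABLE chunks_fts USING fts5(
    chunk_id UNINDEXED,
    doc_id UNINDEXED,
    heading_path,
    text,
    tokenize = 'unicode61 remove_diacritics 2'
);

CREATE TRIGGER chunks_ai AFTER INSERT ON chunks BEGIN
    INSERT INTO chunks_fts(chunk_id, doc_id, heading_path, text)
    VALUES (new.chunk_id, new.doc_id, new.heading_path_json, new.text);
END;

CREATE TRIGGER chunks_ad AFTER DELETE ON chunks BEGIN
    DELETE FROM chunks_fts WHERE chunk_id = old.chunk_id;
END;

CREATE TRIGGER chunks_au AFTER UPDATE ON chunks BEGIN
    DELETE FROM chunks_fts WHERE chunk_id = old.chunk_id;
    INSERT INTO chunks_fts(chunk_id, doc_id, heading_path, text)
    VALUES (new.chunk_id, new.doc_id, new.heading_path_json, new.text);
END;

-- Backfill rows that existed before the migration.
INSERT INTO chunks_fts(chunk_id, doc_id, heading_path, text)
SELECT chunk_id, doc_id, heading_path_json, text FROM chunks;
"""


def counts(conn):
    """Return (chunks row count, chunks_fts row count)."""
    chunks = conn.execute("SELECT count(*) FROM chunks").fetchone()[0]
    fts = conn.execute("SELECT count(*) FROM chunks_fts").fetchone()[0]
    return chunks, fts


def demo():
    conn = sqlite3.connect(":memory:")
    conn.executescript(P1_SCHEMA)
    # Seed a "P1" database before the FTS migration exists.
    conn.executemany(
        "INSERT INTO chunks VALUES (?, ?, ?, ?)",
        [
            ("c1", "d1", '["Intro"]', "first chunk about triggers"),
            ("c2", "d1", '["Intro"]', "second chunk about migrations"),
        ],
    )
    conn.commit()

    conn.executescript(V002_SKETCH)
    after_migration = counts(conn)

    conn.execute("INSERT INTO chunks VALUES ('c3', 'd2', '[]', 'third chunk')")
    after_insert = counts(conn)

    conn.execute("UPDATE chunks SET text = 'renamed text' WHERE chunk_id = 'c1'")
    hits = [
        row[0]
        for row in conn.execute(
            "SELECT chunk_id FROM chunks_fts WHERE chunks_fts MATCH 'renamed'"
        )
    ]

    conn.execute("DELETE FROM chunks WHERE chunk_id = 'c2'")
    after_delete = counts(conn)
    return after_migration, after_insert, hits, after_delete


if __name__ == "__main__":
    print(demo())
```

Each step returns the pair of row counts, so the "row count must equal" invariant can be checked after the backfill and after every trigger-driven change.
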
## Storage / wire effects

- Writes: `chunks_fts` virtual table inside `kebab.sqlite`.
- Reads: existing `chunks` rows for backfill.

## Test plan

| kind | description | fixture / data |
|------|-------------|----------------|
| migration | apply `V002` to a DB seeded with N chunks; `chunks_fts` contains exactly N rows | tmp DB seeded |
| trigger | INSERT into `chunks` propagates to `chunks_fts` | tmp DB |
| trigger | DELETE from `chunks` removes the corresponding `chunks_fts` row | tmp DB |
| trigger | UPDATE of `chunks.text` updates `chunks_fts` text | tmp DB |
| function | `rebuild_chunks_fts` produces deterministic content equal to fresh backfill | tmp DB |
| migration | running `V002` twice is a no-op (refinery handles idempotency) | tmp DB |

All tests run under `cargo test -p kebab-store-sqlite fts`.

## Definition of Done

- [ ] `cargo check -p kebab-store-sqlite` passes
- [ ] `cargo test -p kebab-store-sqlite fts` passes
- [ ] `migrations/V002__fts.sql` matches design §5.5 verbatim (CI diff check)
- [ ] No imports outside Allowed dependencies
- [ ] PR links design §5.5

## Out of scope

- Search query implementation (p2-2).
- Vector / hybrid search (P3).
- Korean morphological tokenizer (kept as P+ note; default `unicode61 remove_diacritics 2`).

## Risks / notes

- FTS5 triggers run inside the same transaction as their host `chunks` mutation; bulk ingest may need batching later to keep performance acceptable.
- `chunks_fts` is a **content-less** FTS5 table per §5.5 (with UNINDEXED `chunk_id`/`doc_id`). Tests should assert only on the ranking order produced by `bm25(chunks_fts)`, not on raw score values.

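Both notes can be illustrated with a short sketch: a rolled-back `chunks` insert leaves no trace in the index, and a ranking test compares the order of results rather than score values. The schema is a hypothetical reduction (only an assumed `chunks_ai` trigger, and a plain FTS5 table that stores its own content); design §5.5 is authoritative. It uses Python's stdlib `sqlite3`, assuming the SQLite build includes FTS5.

```python
import sqlite3

# Hypothetical reduced schema; only the insert trigger is needed here.
SCHEMA = """
CREATE TABLE chunks (
    chunk_id          TEXT PRIMARY KEY,
    doc_id            TEXT NOT NULL,
    heading_path_json TEXT NOT NULL,
    text              TEXT NOT NULL
);
CREATE VIRTUAL TABLE chunks_fts USING fts5(
    chunk_id UNINDEXED,
    doc_id UNINDEXED,
    heading_path,
    text,
    tokenize = 'unicode61 remove_diacritics 2'
);
CREATE TRIGGER chunks_ai AFTER INSERT ON chunks BEGIN
    INSERT INTO chunks_fts(chunk_id, doc_id, heading_path, text)
    VALUES (new.chunk_id, new.doc_id, new.heading_path_json, new.text);
END;
"""


def open_db():
    # isolation_level=None: transactions are controlled explicitly below.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.executescript(SCHEMA)
    return conn


def fts_count(conn):
    return conn.execute("SELECT count(*) FROM chunks_fts").fetchone()[0]


def rollback_demo():
    """The trigger's write shares the host transaction, so ROLLBACK undoes it."""
    conn = open_db()
    conn.execute("BEGIN")
    conn.execute("INSERT INTO chunks VALUES ('c1', 'd1', '[]', 'rolled back text')")
    inside = fts_count(conn)
    conn.execute("ROLLBACK")
    after = fts_count(conn)
    return inside, after


def ranked_ids(conn, query):
    """Chunk ids ordered best-first; bm25() returns lower values for better matches."""
    rows = conn.execute(
        "SELECT chunk_id FROM chunks_fts WHERE chunks_fts MATCH ? "
        "ORDER BY bm25(chunks_fts)",
        (query,),
    )
    return [row[0] for row in rows]


def ranking_demo():
    conn = open_db()
    conn.executemany(
        "INSERT INTO chunks VALUES (?, ?, '[]', ?)",
        [
            ("c1", "d1", "sqlite fts5 sqlite index sqlite"),
            ("c2", "d1", "a long note about pasta rice and soup that mentions sqlite once"),
            ("c3", "d2", "tokenizers split text into terms"),
            ("c4", "d2", "vector search is out of scope here"),
            ("c5", "d3", "migrations run in version order"),
        ],
    )
    # Assert on the order only, never on the numeric scores.
    return ranked_ids(conn, "sqlite")


if __name__ == "__main__":
    print(rollback_demo())
    print(ranking_demo())
```

The short chunk that repeats the query term ranks ahead of the long chunk that mentions it once, which is the kind of relative assertion that stays stable if scoring constants change.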