docs(rename): kb → kebab — README, tasks/, docs/, design doc, report

Final commit. Bulk-update every occurrence of the word `kb` in all .md files.

- 19 crate names (`kb-core`, `kb-app`, …) → `kebab-*` (including the
  Rust module path notation `kb_*` → `kebab_*`).
- Future components (`kb-tui`, `kb-desktop`, `kb-asr-whisper`, `kb-ocr`,
  `kb-mcp`, `kb-vlm`, `kb-rerank`, `kb-vision-ocr`, `kb-index`,
  `kb-smoke`, `kb-architecture`) → `kebab-*` (the same prefix applies
  once P6+ starts).
- CLI command examples: `kb ingest` / `kb search` / `kb ask` / `kb init` /
  `kb doctor` / `kb inspect` / `kb list` / `kb eval` →
  `kebab <verb>`. Both fenced code blocks and inline backticks.
- XDG paths + env vars + binary path (`target/release/kb` →
  `target/release/kebab`) synchronized.
- design doc / initial report / SMOKE / HOTFIXES / phase epic / task
  spec: all references unified.
- `git -c user.name=kb` in task-decomposition.md is author info that
  records past git history, so it is kept as-is (the author in the
  actual git history cannot be changed).
- `status: planned` → `completed` in `tasks/phase-5-evaluation.md` is
  included as well (left unreflected after the P5-1 + P5-2 PRs merged).

## Verification

- `grep -rEn "\bkb-[a-z]|\bkb_[a-z]|\.config/kb\b|kb\.sqlite|\bKB_[A-Z]"
   --include="*.md"` 0 hits (excluding the git author in
  task-decomposition.md).
- All file path references are live (every renamed file is updated to
  its new path).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 04:01:55 +00:00
parent f1a448d6dc
commit f9714aa5cb
56 changed files with 1324 additions and 1324 deletions

@@ -1,12 +1,12 @@
---
phase: P6
component: kb-parse-image (image extractor + EXIF)
component: kebab-parse-image (image extractor + EXIF)
task_id: p6-1
title: "Image Extractor producing single-block CanonicalDocument + EXIF metadata"
status: planned
depends_on: [p0-1, p1-6]
unblocks: [p6-2, p6-3]
contract_source: ../../docs/superpowers/specs/2026-04-27-kb-final-form-design.md
contract_source: ../../docs/superpowers/specs/2026-04-27-kebab-final-form-design.md
contract_sections: [§3.4 Block::ImageRef + ImageRefBlock, §3.7a OcrText/ModelCaption stubs, §9.1 image extraction policy, §9 versioning]
---
@@ -22,8 +22,8 @@ Establishes the image-as-document contract and decouples extraction (asset → I
## Allowed dependencies
- `kb-core`
- `kb-config`
- `kebab-core`
- `kebab-config`
- `image = "0.25"` (decoding for size + format detect)
- `kamadak-exif` for EXIF
- `serde`, `serde_json`
@@ -33,31 +33,31 @@ Establishes the image-as-document contract and decouples extraction (asset → I
## Forbidden dependencies
- `kb-source-fs`, `kb-parse-md`, `kb-normalize`, `kb-chunk`, `kb-store-*`, `kb-embed*`, `kb-search`, `kb-llm*`, `kb-rag`, `kb-tui`, `kb-desktop`, OCR libs, LLM libs
- `kebab-source-fs`, `kebab-parse-md`, `kebab-normalize`, `kebab-chunk`, `kebab-store-*`, `kebab-embed*`, `kebab-search`, `kebab-llm*`, `kebab-rag`, `kebab-tui`, `kebab-desktop`, OCR libs, LLM libs
## Inputs
| input | type | source |
|-------|------|--------|
| `RawAsset` | `kb_core::RawAsset` | from `kb-source-fs` |
| `RawAsset` | `kebab_core::RawAsset` | from `kebab-source-fs` |
| image bytes | `&[u8]` | filesystem |
| `parser_version` | `kb_core::ParserVersion` | constant in this crate (`"image-meta-v1"`) |
| `parser_version` | `kebab_core::ParserVersion` | constant in this crate (`"image-meta-v1"`) |
## Outputs
| output | type | downstream |
|--------|------|------------|
| `CanonicalDocument` | `kb_core::CanonicalDocument` | `kb-chunk` (image-region chunker) → `kb-store-sqlite` |
| `CanonicalDocument` | `kebab_core::CanonicalDocument` | `kebab-chunk` (image-region chunker) → `kebab-store-sqlite` |
## Public surface (signatures only — no new types)
```rust
pub struct ImageExtractor;
impl kb_core::Extractor for ImageExtractor {
fn supports(&self, m: &kb_core::MediaType) -> bool { matches!(m, kb_core::MediaType::Image(_)) }
fn parser_version(&self) -> kb_core::ParserVersion { kb_core::ParserVersion("image-meta-v1".into()) }
fn extract(&self, ctx: &kb_core::ExtractContext, bytes: &[u8]) -> anyhow::Result<kb_core::CanonicalDocument>;
impl kebab_core::Extractor for ImageExtractor {
fn supports(&self, m: &kebab_core::MediaType) -> bool { matches!(m, kebab_core::MediaType::Image(_)) }
fn parser_version(&self) -> kebab_core::ParserVersion { kebab_core::ParserVersion("image-meta-v1".into()) }
fn extract(&self, ctx: &kebab_core::ExtractContext, bytes: &[u8]) -> anyhow::Result<kebab_core::CanonicalDocument>;
}
```
@@ -69,7 +69,7 @@ impl kb_core::Extractor for ImageExtractor {
- `metadata.source_type = SourceType::Reference` (per design enum); `trust_level = TrustLevel::Primary`; `tags`/`aliases` empty.
- `metadata.user["exif"]` = JSON object with whitelisted EXIF tags (DateTimeOriginal, GPS lat/lon, Make, Model, Orientation, Software). Missing tags omitted.
- `metadata.user["dimensions"] = { "w": <u32>, "h": <u32>, "format": "<png|jpeg|...>" }`.
- `provenance` includes `Discovered`, `Parsed` events (no Normalized — ID assignment happens here directly per §3.4 stub from p1-4 logic, OR pipe through `kb-normalize` if available; this task's choice: emit a fully formed CanonicalDocument with deterministic IDs by calling `kb_core::id_for_doc` and `kb_core::id_for_block` directly).
- `provenance` includes `Discovered`, `Parsed` events (no Normalized — ID assignment happens here directly per §3.4 stub from p1-4 logic, OR pipe through `kebab-normalize` if available; this task's choice: emit a fully formed CanonicalDocument with deterministic IDs by calling `kebab_core::id_for_doc` and `kebab_core::id_for_block` directly).
- Failure modes:
- Truncated/corrupt image → still emits a CanonicalDocument with `dimensions = null`, EXIF empty, `Provenance` warning event with the decoder error message.
- Unsupported format → `anyhow::Error` (caller skips).
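The EXIF whitelist rule above (keep only listed tags, omit missing ones) can be sketched with the standard library alone. This is a minimal illustration, not the crate's implementation: `filter_exif` and `EXIF_WHITELIST` are hypothetical names, the input is assumed to be already decoded into tag-name/value strings, and the GPS tag names `GPSLatitude`/`GPSLongitude` are an assumed spelling of the spec's "GPS lat/lon".

```rust
use std::collections::BTreeMap;

/// Assumed tag names for the whitelist in the spec; the GPS names are a guess
/// at how "GPS lat/lon" would be keyed.
const EXIF_WHITELIST: [&str; 7] = [
    "DateTimeOriginal",
    "GPSLatitude",
    "GPSLongitude",
    "Make",
    "Model",
    "Orientation",
    "Software",
];

/// Keep only whitelisted tags. Tags absent from `raw` are simply omitted,
/// so the result never contains null or placeholder entries.
pub fn filter_exif(raw: &BTreeMap<String, String>) -> BTreeMap<String, String> {
    EXIF_WHITELIST
        .iter()
        .filter_map(|tag| raw.get(*tag).map(|v| (tag.to_string(), v.clone())))
        .collect()
}
```

A `BTreeMap` is used here because its sorted iteration order keeps the serialized output stable, which fits the snapshot and determinism rows of the test plan.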
@@ -77,7 +77,7 @@ impl kb_core::Extractor for ImageExtractor {
## Storage / wire effects
- None directly (the caller persists via `kb-store-sqlite`).
- None directly (the caller persists via `kebab-store-sqlite`).
## Test plan
@@ -90,12 +90,12 @@ impl kb_core::Extractor for ImageExtractor {
| determinism | identical bytes → identical `doc_id`, `block_id` across two runs | inline |
| snapshot | `CanonicalDocument` JSON stable for fixture | `fixtures/image/red-100x50.png` |
All tests under `cargo test -p kb-parse-image`.
All tests under `cargo test -p kebab-parse-image`.
## Definition of Done
- [ ] `cargo check -p kb-parse-image` passes
- [ ] `cargo test -p kb-parse-image` passes
- [ ] `cargo check -p kebab-parse-image` passes
- [ ] `cargo test -p kebab-parse-image` passes
- [ ] No OCR/caption/embedding code present
- [ ] No imports outside Allowed dependencies
- [ ] PR links design §3.4, §9.1

@@ -1,12 +1,12 @@
---
phase: P6
component: kb-parse-image (OCR adapter)
component: kebab-parse-image (OCR adapter)
task_id: p6-2
title: "OcrEngine trait + Tesseract adapter (Apple Vision feature-gated)"
status: planned
depends_on: [p6-1]
unblocks: [p6-3]
contract_source: ../../docs/superpowers/specs/2026-04-27-kb-final-form-design.md
contract_source: ../../docs/superpowers/specs/2026-04-27-kebab-final-form-design.md
contract_sections: [§3.4 ImageRefBlock.ocr, §3.7a OcrText/OcrRegion, §9.1 OCR vs caption provenance]
---
@@ -22,9 +22,9 @@ Strict separation of OCR (observed text) from caption (model-generated). Confini
## Allowed dependencies
- `kb-core`
- `kb-config`
- `kb-parse-image` (consumes its types)
- `kebab-core`
- `kebab-config`
- `kebab-parse-image` (consumes its types)
- `tesseract = "0.13"` (feature `tesseract`, default ON)
- For feature `apple-vision`: `std::process::Command` only (sidecar binary, not a Rust dep)
- `serde`, `serde_json`
@@ -34,21 +34,21 @@ Strict separation of OCR (observed text) from caption (model-generated). Confini
## Forbidden dependencies
- `kb-source-fs`, `kb-parse-md`, `kb-normalize`, `kb-chunk`, `kb-store-*`, `kb-embed*`, `kb-search`, `kb-llm*`, `kb-rag`, `kb-tui`, `kb-desktop`
- `kebab-source-fs`, `kebab-parse-md`, `kebab-normalize`, `kebab-chunk`, `kebab-store-*`, `kebab-embed*`, `kebab-search`, `kebab-llm*`, `kebab-rag`, `kebab-tui`, `kebab-desktop`
## Inputs
| input | type | source |
|-------|------|--------|
| image bytes | `&[u8]` | from extractor |
| optional language hint | `kb_core::Lang` | metadata |
| `kb-config` OCR settings | engine name, languages | runtime |
| optional language hint | `kebab_core::Lang` | metadata |
| `kebab-config` OCR settings | engine name, languages | runtime |
## Outputs
| output | type | downstream |
|--------|------|------------|
| `OcrText` | `kb_core::OcrText` | merged into `ImageRefBlock.ocr` |
| `OcrText` | `kebab_core::OcrText` | merged into `ImageRefBlock.ocr` |
## Public surface (signatures only — no new types)
@@ -56,11 +56,11 @@ Strict separation of OCR (observed text) from caption (model-generated). Confini
pub trait OcrEngine: Send + Sync {
fn engine_name(&self) -> &'static str;
fn engine_version(&self) -> String;
fn recognize(&self, image_bytes: &[u8], lang_hint: Option<&kb_core::Lang>) -> anyhow::Result<kb_core::OcrText>;
fn recognize(&self, image_bytes: &[u8], lang_hint: Option<&kebab_core::Lang>) -> anyhow::Result<kebab_core::OcrText>;
}
pub struct TesseractOcr { /* internal: lazy api handle */ }
impl TesseractOcr { pub fn new(config: &kb_config::Config) -> anyhow::Result<Self>; }
impl TesseractOcr { pub fn new(config: &kebab_config::Config) -> anyhow::Result<Self>; }
impl OcrEngine for TesseractOcr { /* per trait */ }
#[cfg(feature = "apple-vision")]
@@ -71,8 +71,8 @@ impl OcrEngine for AppleVisionOcr { /* per trait */ }
pub fn apply_ocr(
engine: &dyn OcrEngine,
image_bytes: &[u8],
block: &mut kb_core::ImageRefBlock,
lang_hint: Option<&kb_core::Lang>,
block: &mut kebab_core::ImageRefBlock,
lang_hint: Option<&kebab_core::Lang>,
) -> anyhow::Result<()>;
```
@@ -85,7 +85,7 @@ pub fn apply_ocr(
- `joined` = `regions.iter().map(|r| r.text).join(" ")` (no smart layout reconstruction in v1).
- `engine = "tesseract"`, `engine_version = <tesseract version string>`. The `tesseract` crate (0.13+) does NOT expose a stable Rust `version()` accessor. Use one of: (a) call libtesseract's `TessVersion()` via the bundled FFI surface, OR (b) at adapter construction, shell-out `tesseract --version` once and cache the parsed `"5.3.4"`-style string. Both are deterministic for a fixed install. Pin the chosen approach in the implementation PR.
- Apple Vision sidecar (feature `apple-vision`):
- Spawn a small Swift binary `kb-vision-ocr` (path from `config.ocr.apple_vision_binary`) feeding the image via stdin and reading JSON `{ regions: [{x,y,w,h,text,confidence}, ...] }` from stdout.
- Spawn a small Swift binary `kebab-vision-ocr` (path from `config.ocr.apple_vision_binary`) feeding the image via stdin and reading JSON `{ regions: [{x,y,w,h,text,confidence}, ...] }` from stdout.
- Same threshold and `joined` rules as Tesseract. `engine = "apple-vision"`, `engine_version = sidecar's --version`.
- This subagent task does NOT write the Swift sidecar; it only wires the Rust side. Document the expected sidecar interface in `docs/spec/sidecar-vision.md` (separate doc spec stub, optional).
- `apply_ocr` calls `engine.recognize`, sets `block.ocr = Some(text)`, and appends a `Provenance::OcrApplied` event in the caller's CanonicalDocument (caller responsibility — this task exposes a helper).
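Two of the rules above lend themselves to a short std-only sketch: the `joined` rule and option (b) for `engine_version`. Note that std iterators have no `join`, so the sketch collects into a `Vec` first. `Region`, `join_regions`, and `parse_tesseract_version` are hypothetical names, `Region` only stands in for the real `kebab_core` type, and the parser assumes the first line of `tesseract --version` output looks like `tesseract 5.3.4`.

```rust
/// Hypothetical stand-in for the OCR region type defined in `kebab_core`.
pub struct Region {
    pub text: String,
    pub confidence: f32,
}

/// v1 `joined` rule: region texts in order, separated by a single space.
pub fn join_regions(regions: &[Region]) -> String {
    regions
        .iter()
        .map(|r| r.text.as_str())
        .collect::<Vec<_>>()
        .join(" ")
}

/// Extract the version from captured `tesseract --version` output.
/// Returns `None` when the first line does not start with "tesseract".
pub fn parse_tesseract_version(output: &str) -> Option<String> {
    let first = output.lines().next()?;
    let rest = first.trim().strip_prefix("tesseract")?.trim();
    // Some builds print a leading 'v' (assumption); drop it if present.
    let rest = rest.strip_prefix('v').unwrap_or(rest);
    if rest.is_empty() {
        None
    } else {
        Some(rest.to_string())
    }
}
```

Keeping the parser a pure function over the captured output means the shell-out happens once at adapter construction, while the parsing stays testable without a Tesseract install.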
@@ -109,12 +109,12 @@ pub fn apply_ocr(
| determinism | two runs of recognize on same input → identical OcrText | fixture |
| `#[cfg(feature = "apple-vision")]` smoke | sidecar invocation captured (mock binary echoes fixed JSON) | inline mock |
All tests under `cargo test -p kb-parse-image ocr`. Tesseract install required on CI host.
All tests under `cargo test -p kebab-parse-image ocr`. Tesseract install required on CI host.
## Definition of Done
- [ ] `cargo check -p kb-parse-image --features tesseract` passes
- [ ] `cargo test -p kb-parse-image ocr` passes
- [ ] `cargo check -p kebab-parse-image --features tesseract` passes
- [ ] `cargo test -p kebab-parse-image ocr` passes
- [ ] `apple-vision` feature compiles on macOS and gracefully no-ops on Linux
- [ ] No imports outside Allowed dependencies
- [ ] PR links design §3.4, §3.7a, §9.1
@@ -129,5 +129,5 @@ All tests under `cargo test -p kb-parse-image ocr`. Tesseract install required o
## Risks / notes
- Tesseract performance varies wildly with image quality; document `min_confidence` and default page-segmentation mode.
- Apple Vision sidecar requires code signing for distribution; for v1 dev builds, accept unsigned binary from `~/.local/bin/kb-vision-ocr`.
- Apple Vision sidecar requires code signing for distribution; for v1 dev builds, accept unsigned binary from `~/.local/bin/kebab-vision-ocr`.
- Large image downscale loses small-text recognition; expose `config.ocr.max_pixels` so power users can tune.

@@ -1,12 +1,12 @@
---
phase: P6
component: kb-parse-image (caption adapter)
component: kebab-parse-image (caption adapter)
task_id: p6-3
title: "ModelCaption adapter (LanguageModel-driven, feature-gated)"
status: planned
depends_on: [p6-1, p4-2]
unblocks: []
contract_source: ../../docs/superpowers/specs/2026-04-27-kb-final-form-design.md
contract_source: ../../docs/superpowers/specs/2026-04-27-kebab-final-form-design.md
contract_sections: [§3.4 ImageRefBlock.caption, §3.7a ModelCaption, §9.1 caption (model-generated, low trust)]
---
@@ -22,10 +22,10 @@ Captioning closes the multimodal loop. Strict separation from OCR keeps trust le
## Allowed dependencies
- `kb-core`
- `kb-config`
- `kb-parse-image`
- `kb-llm` (LanguageModel trait)
- `kebab-core`
- `kebab-config`
- `kebab-parse-image`
- `kebab-llm` (LanguageModel trait)
- `base64`
- `serde`, `serde_json`
- `image` (resize for prompt cost control)
@@ -34,7 +34,7 @@ Captioning closes the multimodal loop. Strict separation from OCR keeps trust le
## Forbidden dependencies
- `kb-source-fs`, `kb-parse-md`, `kb-normalize`, `kb-chunk`, `kb-store-*`, `kb-embed*`, `kb-search`, `kb-rag`, `kb-llm-local` (only via trait), `kb-tui`, `kb-desktop`
- `kebab-source-fs`, `kebab-parse-md`, `kebab-normalize`, `kebab-chunk`, `kebab-store-*`, `kebab-embed*`, `kebab-search`, `kebab-rag`, `kebab-llm-local` (only via trait), `kebab-tui`, `kebab-desktop`
## Inputs
@@ -42,28 +42,28 @@ Captioning closes the multimodal loop. Strict separation from OCR keeps trust le
|-------|------|--------|
| image bytes | `&[u8]` | extractor |
| `dyn LanguageModel` (vision-capable) | runtime | injected |
| `kb-config.image.caption` | `{ enabled, max_pixels, prompt_template_version }` | runtime |
| `kebab-config.image.caption` | `{ enabled, max_pixels, prompt_template_version }` | runtime |
## Outputs
| output | type | downstream |
|--------|------|------------|
| `ModelCaption` | `kb_core::ModelCaption` | merged into `ImageRefBlock.caption` |
| `ModelCaption` | `kebab_core::ModelCaption` | merged into `ImageRefBlock.caption` |
## Public surface (signatures only — no new types)
```rust
pub fn caption_image(
llm: &dyn kb_core::LanguageModel,
llm: &dyn kebab_core::LanguageModel,
image_bytes: &[u8],
cfg: &kb_config::Config,
) -> anyhow::Result<kb_core::ModelCaption>;
cfg: &kebab_config::Config,
) -> anyhow::Result<kebab_core::ModelCaption>;
pub fn apply_caption(
llm: &dyn kb_core::LanguageModel,
llm: &dyn kebab_core::LanguageModel,
image_bytes: &[u8],
block: &mut kb_core::ImageRefBlock,
cfg: &kb_config::Config,
block: &mut kebab_core::ImageRefBlock,
cfg: &kebab_config::Config,
) -> anyhow::Result<()>;
```
@@ -86,7 +86,7 @@ pub fn apply_caption(
## Storage / wire effects
- None directly. Caller persists via `kb-store-sqlite`.
- None directly. Caller persists via `kebab-store-sqlite`.
## Test plan
@@ -99,12 +99,12 @@ pub fn apply_caption(
| unit | downscale honors `max_pixels` (resulting bytes < some threshold) | fixture large image |
| determinism | identical input + temperature=0 + seed=0 → identical caption (mock) | inline |
All tests under `cargo test -p kb-parse-image caption` with mock LM only.
All tests under `cargo test -p kebab-parse-image caption` with mock LM only.
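The `max_pixels` downscale checked in the test plan can be illustrated as a target-size computation. This is a sketch under assumptions: `fit_within` is a hypothetical helper, and the spec does not say how the limit is enforced, so aspect-ratio-preserving scaling is one plausible reading. The resulting dimensions would then be handed to whatever resize routine the crate uses.

```rust
/// Compute dimensions that preserve aspect ratio and keep the pixel count
/// at or below `max_pixels`. Images already within budget are returned as-is.
pub fn fit_within(w: u32, h: u32, max_pixels: u64) -> (u32, u32) {
    let pixels = w as u64 * h as u64;
    if pixels == 0 || pixels <= max_pixels {
        return (w, h);
    }
    // Scaling both sides by sqrt(max/pixels) scales the area by max/pixels.
    let scale = (max_pixels as f64 / pixels as f64).sqrt();
    // Flooring keeps the product within budget; each side is clamped to at
    // least 1 px, which can exceed the budget for extreme aspect ratios.
    let new_w = ((w as f64 * scale).floor() as u32).max(1);
    let new_h = ((h as f64 * scale).floor() as u32).max(1);
    (new_w, new_h)
}
```

For example, a 4000×3000 image with a budget of 3,000,000 pixels maps to 2000×1500, since the area ratio of 0.25 gives a per-side scale of 0.5.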
## Definition of Done
- [ ] `cargo check -p kb-parse-image --features caption` passes
- [ ] `cargo test -p kb-parse-image caption` passes
- [ ] `cargo check -p kebab-parse-image --features caption` passes
- [ ] `cargo test -p kebab-parse-image caption` passes
- [ ] No imports outside Allowed dependencies
- [ ] Feature default OFF; only on when user opts in via config
- [ ] PR links design §3.4 ImageRefBlock.caption, §9.1