kebab/tasks/p0/p0-1-skeleton.md
altair823 f9714aa5cb docs(rename): kb → kebab — README, tasks/, docs/, design doc, report
Final commit. Bulk-updates every occurrence of the word `kb` in all .md files.

- 19 crate names (`kb-core`, `kb-app`, …) → `kebab-*` (including the
  Rust module path form `kb_*` → `kebab_*`).
- Future components (`kb-tui`, `kb-desktop`, `kb-asr-whisper`, `kb-ocr`,
  `kb-mcp`, `kb-vlm`, `kb-rerank`, `kb-vision-ocr`, `kb-index`,
  `kb-smoke`, `kb-architecture`) → `kebab-*` (the same prefix applies
  when P6+ starts).
- CLI command examples: `kb ingest` / `kb search` / `kb ask` / `kb init` /
  `kb doctor` / `kb inspect` / `kb list` / `kb eval` →
  `kebab <verb>`, in both fenced code blocks and inline backticks.
- XDG paths, env vars, and the binary path (`target/release/kb` →
  `target/release/kebab`) are synced.
- All references are unified across the design doc, initial report,
  SMOKE, HOTFIXES, phase epics, and task specs.
- `git -c user.name=kb` in task-decomposition.md is left as is because
  it records author information from past git history (the author in
  the actual git history cannot be changed).
- `status: planned` → `completed` in `tasks/phase-5-evaluation.md` is
  included as well (left unreflected after the P5-1 + P5-2 PRs merged).

## Verification

- `grep -rEn "\bkb-[a-z]|\bkb_[a-z]|\.config/kb\b|kb\.sqlite|\bKB_[A-Z]"
   --include="*.md"` 0 hits (excluding the git author in
  task-decomposition.md).
- All file path references resolve (every renamed file is updated to
  its new path).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 04:01:55 +00:00


---
phase: P0
component: workspace + kebab-core + kebab-config + kebab-app + kebab-cli
task_id: p0-1
title: "Workspace skeleton + frozen domain types/traits + ID recipe + facade"
status: completed
depends_on: []
unblocks: [p1-1, p1-2, p1-3, p1-4, p1-5, p1-6, p2-1, p2-2, p3-1, p3-2, p3-3, p3-4, p4-1, p4-2, p4-3, p5-1, p5-2, p6-1, p6-2, p6-3, p7-1, p7-2, p8-1, p8-2, p9-1, p9-2, p9-3, p9-4, p9-5]
contract_source: ../../docs/superpowers/specs/2026-04-27-kebab-final-form-design.md
contract_sections: [§3 (all), §4, §5.1 schema_meta+migrations, §6 (config + XDG), §7 (all traits), §8 module boundaries, §9 versioning, §10 errors+exit codes, §2.8 wire schema_version]
---
# p0-1 — Workspace skeleton + frozen contracts
## Goal
Stand up the Cargo workspace (Rust 2024, resolver=3) with `kebab-core`, `kebab-parse-types`, `kebab-config`, `kebab-app`, `kebab-cli` crates. Freeze every domain type, trait, ID recipe, error type, and CLI entry shape per the frozen design doc so that all subsequent component tasks compile against stable contracts.
## Why now / why this size
Every other task imports `kebab-core`. If types or trait signatures wobble after this point, every downstream task spec drifts. This task is large but indivisible: types + traits + ID recipe + facade + CLI skeleton + wire schema stubs must land together so the rest of the workspace can compile against them.
## Allowed dependencies
- workspace `[workspace.dependencies]`: `anyhow = "1"`, `thiserror = "2"`, `serde = { version = "1", features = ["derive"] }`, `serde_json = "1"`, `time = { version = "0.3", features = ["serde", "macros"] }`, `uuid = { version = "1", features = ["v7", "serde"] }`, `blake3 = "1"`, `tracing = "0.1"`
- per crate:
  - `kebab-core`: workspace deps + `serde_json::Map`, `serde-json-canonicalizer`, `unicode-normalization`
  - `kebab-parse-types`: workspace deps + `kebab-core` ONLY (no parsers, no stores, no normalize). Defines parser intermediate representations per design §3.7b.
  - `kebab-config`: workspace deps + `toml = "0.8"`, `dirs = "5"` (XDG paths)
  - `kebab-app`: workspace deps + `kebab-core`, `kebab-config`, `tracing-subscriber`, `tracing-appender`
  - `kebab-cli`: workspace deps + `kebab-core`, `kebab-config`, `kebab-app`, `clap = { version = "4", features = ["derive"] }`
## Forbidden dependencies
- `kebab-core` MUST NOT depend on any other `kebab-*` crate.
- `kebab-parse-types` MUST depend ONLY on `kebab-core`. No parser libraries (`pulldown-cmark`, `pdf-extract`, `image`, `whisper-rs`, …), no other `kebab-*` crate.
- `kebab-config` MUST NOT depend on `kebab-app`, `kebab-cli`, parsers, stores, embedders, search, llm, rag, tui, desktop.
- `kebab-app` MUST NOT yet depend on parsers/stores/embedders/search/llm/rag. Those crates do not exist yet; facade methods are stubs that return an error via `anyhow::bail!("not yet wired (Pn-i)")` rather than panicking with `unimplemented!()`, so the CLI can exit with code 2 (see Risks / notes).
- `kebab-cli` MUST NOT call any non-`kebab-app` crate directly.
## Inputs
| input | type | source |
|-------|------|--------|
| frozen design doc | Markdown | `docs/superpowers/specs/2026-04-27-kebab-final-form-design.md` |
| user `kebab` invocation | command-line args | end user |
## Outputs
| output | type | downstream consumer |
|--------|------|---------------------|
| compiling workspace | Rust crates | every later task |
| `kebab-core` types/traits | Rust API | every other crate |
| `kebab-core` ID functions | Rust API | parsers, normalize, chunkers, embedders, search, rag |
| `kebab-config::Config` | Rust struct | every other crate |
| `kebab-app` facade methods (stubs) | Rust API | `kebab-cli`, future TUI/desktop |
| `kebab` binary | executable | end user |
| `docs/wire-schema/v1/*.schema.json` stubs | JSON Schema files | future wire emitters and consumers |
| `docs/spec/*.md` stubs (link to frozen design) | Markdown | future contributors |
## Public surface (signatures only — no new types)
All types/traits below are defined in `kebab-core` exactly per design §3 and §7 (no additions, no renames). Subagent must copy field-for-field.
```rust
// ── kebab-core ─────────────────────────────────────────────────────────────────
// Newtype IDs (design §3.1) — Display + FromStr implemented.
pub struct AssetId(pub String);
pub struct DocumentId(pub String);
pub struct BlockId(pub String);
pub struct ChunkId(pub String);
pub struct EmbeddingId(pub String);
pub struct IndexId(pub String);
// Versions / labels (§3.2)
pub struct ParserVersion(pub String);
pub struct ChunkerVersion(pub String);
pub struct EmbeddingModelId(pub String);
pub struct EmbeddingVersion(pub String);
pub struct IndexVersion(pub String);
pub struct PromptTemplateVersion(pub String);
pub struct SchemaVersion(pub &'static str);
// Forward-declared (§3.7a)
pub struct OcrText { /* per §3.7a */ }
pub struct OcrRegion { /* per §3.7a */ }
pub struct ModelCaption { /* per §3.7a */ }
pub struct Transcript { /* per §3.7a */ }
pub struct TranscriptSegment { /* per §3.7a */ }
pub struct Checksum(pub String);
pub struct Lang(pub String);
pub enum ImageType { Png, Jpeg, Webp, Gif, Tiff, Other(String) }
pub enum AudioType { M4a, Mp3, Wav, Flac, Ogg, Other(String) }
// RawAsset (§3.3)
pub struct RawAsset { /* per §3.3 */ }
pub enum SourceUri { File(std::path::PathBuf), Kb(String) }
pub struct WorkspacePath(pub String);
pub enum MediaType { Markdown, Pdf, Image(ImageType), Audio(AudioType), Other(String) }
pub enum AssetStorage { Copied { path: std::path::PathBuf }, Reference { path: std::path::PathBuf, sha: Checksum } }
// CanonicalDocument + Block + SourceSpan + Inline (§3.4)
pub struct CanonicalDocument { /* per §3.4 */ }
pub enum Block { /* per §3.4 */ }
pub struct CommonBlock { /* per §3.4 */ }
pub struct HeadingBlock { /* per §3.4 */ }
pub struct TextBlock { /* per §3.4 */ }
pub struct ListBlock { /* per §3.4 */ }
pub struct CodeBlock { /* per §3.4 */ }
pub struct TableBlock { /* per §3.4 */ }
pub struct ImageRefBlock { /* per §3.4 */ }
pub struct AudioRefBlock { /* per §3.4 */ }
pub enum Inline { /* per §3.4 */ }
pub enum SourceSpan { /* per §3.4 */ }
// (ParsedBlock + parser intermediates live in kebab-parse-types per design §3.7b — NOT in kebab-core.)
// Chunk + Citation (§3.5)
pub struct Chunk { /* per §3.5 */ }
pub enum Citation { /* 5 variants per §3.5 */ }
impl Citation {
    pub fn path(&self) -> &WorkspacePath;
    pub fn to_uri(&self) -> String; // W3C Media Fragments per §0 Q3
    pub fn parse(s: &str) -> anyhow::Result<Self>;
}
// Metadata + Provenance (§3.6)
pub struct Metadata { /* per §3.6 */ }
pub enum SourceType { Markdown, Note, Paper, Reference, Inbox }
pub enum TrustLevel { Primary, Secondary, Generated }
pub struct Provenance { /* per §3.6 */ }
pub struct ProvenanceEvent { /* per §3.6 */ }
pub enum ProvenanceKind { Discovered, Parsed, Normalized, Chunked, OcrApplied, CaptionApplied, Transcribed, Embedded, Indexed, Warning, Error }
// Search types (§3.7)
pub enum SearchMode { Lexical, Vector, Hybrid }
pub struct SearchQuery { /* per §3.7 */ }
pub struct SearchFilters { /* per §3.7 */ }
pub struct SearchHit { /* per §3.7 */ }
pub struct RetrievalDetail { /* per §3.7 */ }
pub struct DocFilter { /* tags_any/lang/path_glob/trust_min */ }
pub struct DocSummary { /* per §2.5 wire — mirrored internally */ }
// Answer / RAG (§3.8)
pub struct Answer { /* per §3.8 */ }
pub struct AnswerCitation { /* per §3.8 */ }
pub enum RefusalReason { ScoreGate, LlmSelfJudge, NoIndex, NoChunks }
pub struct ModelRef { /* per §3.8 */ }
pub struct AnswerRetrievalSummary { /* per §3.8 */ }
pub struct TokenUsage { /* per §3.8 */ }
pub struct TraceId(pub String);
// IngestReport (mirrored from wire §2.4 for facade return)
pub struct IngestReport { /* per §2.4 */ }
pub struct IngestItem { /* per §2.4 items */ }
// JobRepo support types (forward-declared; full shapes can land here)
pub enum JobKind { Ingest, Chunk, Embed, Ocr, Transcribe, Reindex, Doctor }
pub enum JobStatus { Pending, Running, Succeeded, Failed, Canceled }
pub struct JobId(pub String);
pub struct JobFilter { /* status/kind */ }
pub struct JobRow { /* row mirror */ }
// Vector (forward-declared per §7.2)
pub struct VectorRecord { /* chunk_id, embedding_id, vector, doc_id, text, heading_path, model_id, model_version, dimensions */ }
pub struct VectorHit { /* chunk_id, score, payload */ }
// Errors (§10)
#[derive(Debug, thiserror::Error)]
pub enum CoreError {
    #[error("invalid id: {0}")] InvalidId(String),
    #[error("invalid citation: {0}")] InvalidCitation(String),
    #[error("invalid source span: {0}")] InvalidSpan(String),
    #[error("malformed input: {0}")] Malformed(String),
}
// ── Traits (§7.2) ───────────────────────────────────────────────────────────
pub trait SourceConnector { fn scan(&self, scope: &SourceScope) -> anyhow::Result<Vec<RawAsset>>; }
pub trait Extractor: Send + Sync {
    fn supports(&self, media_type: &MediaType) -> bool;
    fn parser_version(&self) -> ParserVersion;
    fn extract(&self, ctx: &ExtractContext, bytes: &[u8]) -> anyhow::Result<CanonicalDocument>;
}
pub trait Chunker: Send + Sync {
    fn chunker_version(&self) -> ChunkerVersion;
    fn policy_hash(&self, policy: &ChunkPolicy) -> String;
    fn chunk(&self, doc: &CanonicalDocument, policy: &ChunkPolicy) -> anyhow::Result<Vec<Chunk>>;
}
pub trait Embedder: Send + Sync {
    fn model_id(&self) -> EmbeddingModelId;
    fn model_version(&self) -> EmbeddingVersion;
    fn dimensions(&self) -> usize;
    fn embed(&self, inputs: &[EmbeddingInput<'_>]) -> anyhow::Result<Vec<Vec<f32>>>;
}
pub trait Retriever: Send + Sync {
    fn search(&self, query: &SearchQuery) -> anyhow::Result<Vec<SearchHit>>;
    fn index_version(&self) -> IndexVersion;
}
pub trait LanguageModel: Send + Sync {
    fn model_ref(&self) -> ModelRef;
    fn context_tokens(&self) -> usize;
    fn generate_stream(&self, req: GenerateRequest)
        -> anyhow::Result<Box<dyn Iterator<Item = anyhow::Result<TokenChunk>> + Send>>;
}
pub trait DocumentStore { /* full set per §7.2 */ }
pub trait VectorStore { /* full set per §7.2 */ }
pub trait JobRepo { /* full set per §7.2 */ }
// Helper input types (§7.1)
pub struct SourceScope { pub root: std::path::PathBuf, pub include: Vec<String>, pub exclude: Vec<String> }
pub struct ExtractContext<'a> { /* per §7.1 */ }
pub struct ExtractConfig { /* TBD by extractors; carry path-only for now */ }
pub struct ChunkPolicy { /* per §7.1 */ }
pub enum EmbeddingKind { Document, Query }
pub struct EmbeddingInput<'a> { pub text: &'a str, pub kind: EmbeddingKind }
pub struct GenerateRequest { /* per §7.1 */ }
pub enum TokenChunk { Token(String), Done { finish_reason: FinishReason, usage: TokenUsage } }
pub enum FinishReason { Stop, Length, Aborted, Error(String) }
// ── ID functions (§4.2) ─────────────────────────────────────────────────────
pub fn id_from<T: serde::Serialize>(tuple: T) -> String; // hex prefix 32
pub fn id_for_asset(asset_blake3_full_hex: &str) -> AssetId;
pub fn id_for_doc(workspace_path: &WorkspacePath, asset: &AssetId, parser_version: &ParserVersion) -> DocumentId;
pub fn id_for_block(doc: &DocumentId, block_kind: &str, heading_path: &[String], ordinal: u32, span: &SourceSpan) -> BlockId;
pub fn id_for_chunk(doc: &DocumentId, chunker_version: &ChunkerVersion, block_ids: &[BlockId], policy_hash: &str) -> ChunkId;
pub fn id_for_embedding(chunk: &ChunkId, model: &EmbeddingModelId, version: &EmbeddingVersion, dims: usize) -> EmbeddingId;
pub fn id_for_index(collection: &str, model: &EmbeddingModelId, dims: usize, version: &IndexVersion, kind: &str, params_hash: &str) -> IndexId;
pub fn to_posix(path: &std::path::Path) -> WorkspacePath; // §6.6
pub fn nfc(input: &str) -> String; // §4.1
```
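The newtype IDs above are declared with `Display` and `FromStr`, and the Behavior contract requires `FromStr` to validate a 32-character hex string. A minimal std-only sketch of that pair for one ID type follows. `InvalidId` is a hypothetical stand-in for `CoreError::InvalidId`, and rejecting uppercase hex is an assumption (blake3 hex output is lowercase), not something the design doc is quoted as requiring.

```rust
use std::fmt;
use std::str::FromStr;

/// One of the newtype IDs; the other ID types would follow the same pattern.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct DocumentId(pub String);

/// Hypothetical stand-in for `CoreError::InvalidId` (keeps the sketch std-only).
#[derive(Debug, PartialEq, Eq)]
pub struct InvalidId(pub String);

impl fmt::Display for DocumentId {
    // Display returns the inner string unchanged.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(&self.0)
    }
}

impl FromStr for DocumentId {
    type Err = InvalidId;

    // Accept exactly 32 lowercase hex characters (assumption: uppercase is rejected).
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        let is_hex32 = s.len() == 32
            && s.bytes().all(|b| matches!(b, b'0'..=b'9' | b'a'..=b'f'));
        if is_hex32 {
            Ok(Self(s.to_string()))
        } else {
            Err(InvalidId(s.to_string()))
        }
    }
}
```

With this shape, `"0123456789abcdef0123456789abcdef".parse::<DocumentId>()` succeeds and prints back unchanged, while a short or non-hex string returns `Err`.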
```rust
// ── kebab-parse-types ──────────────────────────────────────────────────────────
// Per design §3.7b. Defines parser intermediate representations consumed by
// kebab-normalize. Depends on kebab-core only — never on parser libraries.
pub struct ParsedBlock {
    pub kind: ParsedBlockKind,
    pub heading_path: Vec<String>,
    pub source_span: kebab_core::SourceSpan,
    pub payload: ParsedPayload,
}
pub enum ParsedBlockKind { Heading, Paragraph, List, Code, Table, Quote, ImageRef, AudioRef }
pub enum ParsedPayload {
    Heading { level: u8, text: String },
    Paragraph { text: String, inlines: Vec<kebab_core::Inline> },
    List { ordered: bool, items: Vec<Vec<kebab_core::Inline>> },
    Code { lang: Option<String>, code: String },
    Table { headers: Vec<String>, rows: Vec<Vec<String>> },
    Quote { text: String, inlines: Vec<kebab_core::Inline> },
    ImageRef { src: String, alt: String },
    AudioRef { src: String },
}
// `Inline` itself lives in kebab-core (§3.4) — parse-types references it, never duplicates it.
pub struct Warning { pub kind: WarningKind, pub note: String }
pub enum WarningKind { MalformedFrontmatter, MalformedTable, EncodingFallback, ExtractFailed }
// Forward-ref for P6/P7/P8 — defined when those phases land.
pub struct ParsedImageRegion;
pub struct ParsedPdfPage;
pub struct ParsedAudioSegment;
```
```rust
// ── kebab-config ───────────────────────────────────────────────────────────────
pub struct Config { /* full schema per §6.4 */ }
impl Config {
    pub fn load(path: Option<&std::path::Path>) -> anyhow::Result<Self>;
    pub fn from_file(path: &std::path::Path) -> anyhow::Result<Self>;
    pub fn defaults() -> Self;
    pub fn apply_env(self, env: &std::collections::HashMap<String, String>) -> Self;
    pub fn xdg_config_path() -> std::path::PathBuf; // ~/.config/kebab/config.toml
    pub fn xdg_data_dir() -> std::path::PathBuf;    // ~/.local/share/kebab
    pub fn xdg_cache_dir() -> std::path::PathBuf;
    pub fn xdg_state_dir() -> std::path::PathBuf;
}
```
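The layer order stated under Behavior contract (defaults → file → env → CLI flag) can be illustrated with a std-only sketch. Everything here is hypothetical: `ConfigSketch`, `Overrides`, the two fields, their default values, and `resolve` are made up for illustration (the real schema is §6.4), and the env prefix is passed in as a parameter instead of being hard-coded.

```rust
use std::collections::HashMap;

/// Hypothetical two-field stand-in for `kebab_config::Config`.
#[derive(Debug, Clone, PartialEq)]
pub struct ConfigSketch {
    pub search_k: usize,
    pub llm_model: String,
}

/// Values one layer (config file or CLI flags) may supply; `None` means "not set".
#[derive(Debug, Default, Clone)]
pub struct Overrides {
    pub search_k: Option<usize>,
    pub llm_model: Option<String>,
}

impl ConfigSketch {
    /// Made-up defaults, for illustration only.
    pub fn defaults() -> Self {
        Self { search_k: 8, llm_model: "example-model".to_string() }
    }

    /// Overwrite only the fields this layer actually sets.
    pub fn apply(mut self, layer: &Overrides) -> Self {
        if let Some(k) = layer.search_k {
            self.search_k = k;
        }
        if let Some(m) = &layer.llm_model {
            self.llm_model = m.clone();
        }
        self
    }

    /// Env layer: keys shaped `<PREFIX>_<SECTION>_<KEY>`; unparsable values are ignored.
    pub fn apply_env(mut self, prefix: &str, env: &HashMap<String, String>) -> Self {
        let key_k = format!("{prefix}_SEARCH_K");
        if let Some(k) = env.get(&key_k).and_then(|v| v.parse::<usize>().ok()) {
            self.search_k = k;
        }
        if let Some(m) = env.get(&format!("{prefix}_LLM_MODEL")) {
            self.llm_model = m.clone();
        }
        self
    }
}

/// Later layers win: defaults, then file, then env, then CLI flags.
pub fn resolve(
    file: &Overrides,
    prefix: &str,
    env: &HashMap<String, String>,
    cli: &Overrides,
) -> ConfigSketch {
    ConfigSketch::defaults().apply(file).apply_env(prefix, env).apply(cli)
}
```

Taking the environment as a map, as the `apply_env` signature above does, lets unit tests exercise the merge without mutating the process environment.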
```rust
// ── kebab-app ──────────────────────────────────────────────────────────────────
pub fn init_workspace(force: bool) -> anyhow::Result<()>;
pub fn ingest(scope: kebab_core::SourceScope, summary_only: bool) -> anyhow::Result<kebab_core::IngestReport>;
pub fn list_docs(filter: kebab_core::DocFilter) -> anyhow::Result<Vec<kebab_core::DocSummary>>;
pub fn inspect_doc(id: &kebab_core::DocumentId) -> anyhow::Result<kebab_core::CanonicalDocument>;
pub fn inspect_chunk(id: &kebab_core::ChunkId) -> anyhow::Result<kebab_core::Chunk>;
pub fn search(query: kebab_core::SearchQuery) -> anyhow::Result<Vec<kebab_core::SearchHit>>;
pub fn ask(query: &str, opts: AskOpts) -> anyhow::Result<kebab_core::Answer>;
pub fn doctor() -> anyhow::Result<DoctorReport>;
pub struct AskOpts { pub k: usize, pub explain: bool, pub mode: kebab_core::SearchMode, pub temperature: Option<f32>, pub seed: Option<u64> }
pub struct DoctorReport { pub ok: bool, pub checks: Vec<DoctorCheck> }
pub struct DoctorCheck { pub name: String, pub ok: bool, pub detail: String, pub hint: Option<String> }
```
P0 facade implementations call `anyhow::bail!("not yet wired (P<n>-<i>)")`; later phases replace bodies but never change signatures.
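As a rough illustration of a stub that reports an error instead of panicking, the sketch below uses only std. The real facade returns `anyhow::Result` and uses `anyhow::bail!`; `NotWired` and `search_stub` are hypothetical names, and the return type is simplified to `Vec<String>`.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical error carrying the task id that will wire the function up.
#[derive(Debug)]
pub struct NotWired {
    pub task: &'static str,
}

impl fmt::Display for NotWired {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "not yet wired ({})", self.task)
    }
}

impl Error for NotWired {}

/// Stand-in for a P0 facade function: the signature stays fixed, and the body
/// returns an error the caller can map to an exit code (no panic).
pub fn search_stub(_query: &str) -> Result<Vec<String>, Box<dyn Error>> {
    Err(Box::new(NotWired { task: "P<n>-<i>" }))
}
```

Because the stub returns `Err` rather than unwinding, a caller such as the CLI can print the message and choose its exit code in one place.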
```rust
// ── kebab-cli ──────────────────────────────────────────────────────────────────
// clap subcommands: init | ingest | list (docs) | inspect (doc|chunk) | search | ask | doctor | eval (subcommand placeholder)
// Each maps 1:1 to a kebab_app function. Exit code mapping per §10.
```
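The exit code mapping mentioned in the comment above (0 success, 1 no-hit/refusal, 2 error, 3 doctor unhealthy, as listed under Behavior contract) could be centralized in one function. This is a sketch under assumptions: `Outcome` and `exit_code` are hypothetical names, and how the real CLI derives an outcome from facade results is not specified here.

```rust
/// Hypothetical summary of what a subcommand produced.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Outcome {
    Success,
    NoHitOrRefusal,
    DoctorUnhealthy,
}

/// Map a facade result to the process exit code; any `Err` becomes 2.
pub fn exit_code<E>(result: &Result<Outcome, E>) -> u8 {
    match result {
        Ok(Outcome::Success) => 0,
        Ok(Outcome::NoHitOrRefusal) => 1,
        Err(_) => 2,
        Ok(Outcome::DoctorUnhealthy) => 3,
    }
}
```

A `main` that returns `std::process::ExitCode` can then finish with `ExitCode::from(exit_code(&result))`, so every subcommand goes through the same mapping.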
## Behavior contract
- Workspace `Cargo.toml` sets `resolver = "3"`, `[workspace.package] edition = "2024"`, `rust-version = "1.85"`.
- Every newtype ID implements `Display` (returns inner) and `FromStr` (validates hex length 32).
- `id_from` uses `serde-json-canonicalizer` exactly as design §4.2 specifies and truncates blake3 to 32 hex chars.
- `Citation::to_uri` emits W3C Media Fragments URIs per §0 Q3 (`#L<a>-L<b>`, `#p=<n>`, `#xywh=…`, `#caption`, `#t=hh:mm:ss,hh:mm:ss[&speaker=…]`).
- `Citation::parse` is the strict inverse (round-trip property).
- `kebab-config` resolves XDG paths via `dirs` crate; respects `XDG_CONFIG_HOME`, `XDG_DATA_HOME`, `XDG_CACHE_HOME`, `XDG_STATE_HOME` if set.
- Config layer order: defaults → file → env (`KEBAB_<SECTION>_<KEY>`) → CLI flag. The CLI override is applied by `kebab-cli` after `Config::load`.
- `kebab-cli` global flags: `--config <path>`, `--verbose`, `--debug`, `--json`, `--explain` (where applicable). On `--json`, output conforms to wire schema v1.
- `kebab-cli` exit codes: 0 success, 1 no-hit/refusal, 2 error, 3 doctor unhealthy (per §10).
- All facade-returned wire objects emit `schema_version` per §2 (e.g., `"answer.v1"`, `"search_hit.v1"`).
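The `to_uri` / `parse` round-trip property can be sketched for a reduced citation type. This is an illustration only: `CitationSketch`, its variant and field names, and the use of a bare path before `#` are assumptions; only three of the five fragment forms are covered (`#L<a>-L<b>`, `#p=<n>`, `#caption`), and errors are plain strings instead of `anyhow` errors.

```rust
/// Hypothetical, reduced stand-in for `kebab_core::Citation`.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum CitationSketch {
    Lines { path: String, start: u32, end: u32 },
    Page { path: String, page: u32 },
    Caption { path: String },
}

/// Parse a non-empty, digits-only decimal number.
fn num(s: &str) -> Result<u32, String> {
    if s.is_empty() || !s.bytes().all(|b| b.is_ascii_digit()) {
        return Err(format!("invalid citation: bad number {s:?}"));
    }
    s.parse::<u32>().map_err(|e| format!("invalid citation: {e}"))
}

impl CitationSketch {
    /// Emit `<path>#<fragment>`.
    pub fn to_uri(&self) -> String {
        match self {
            Self::Lines { path, start, end } => format!("{path}#L{start}-L{end}"),
            Self::Page { path, page } => format!("{path}#p={page}"),
            Self::Caption { path } => format!("{path}#caption"),
        }
    }

    /// Inverse of `to_uri` for the three fragment forms above.
    pub fn parse(s: &str) -> Result<Self, String> {
        let (path, frag) = s
            .rsplit_once('#')
            .ok_or_else(|| format!("invalid citation: no fragment in {s:?}"))?;
        if path.is_empty() {
            return Err(format!("invalid citation: empty path in {s:?}"));
        }
        let path = path.to_string();
        if frag == "caption" {
            Ok(Self::Caption { path })
        } else if let Some(n) = frag.strip_prefix("p=") {
            Ok(Self::Page { path, page: num(n)? })
        } else if let Some(rest) = frag.strip_prefix('L') {
            let (a, b) = rest
                .split_once("-L")
                .ok_or_else(|| format!("invalid citation: bad line range {frag:?}"))?;
            Ok(Self::Lines { path, start: num(a)?, end: num(b)? })
        } else {
            Err(format!("invalid citation: unknown fragment {frag:?}"))
        }
    }
}
```

A round-trip test then asserts `CitationSketch::parse(&c.to_uri()) == Ok(c)` for one value per variant. This sketch still accepts leading zeros, so a strict inverse would need a further check.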
## Storage / wire effects
- Filesystem: creates `~/.config/kebab/`, `~/.local/share/kebab/`, `~/KnowledgeBase/` only when `kebab init` runs; never on `Config::load`.
- Wire schemas: ships `docs/wire-schema/v1/{citation,search_hit,answer,ingest_report,doc_summary,chunk_inspection,doctor}.schema.json` as **stubs** declaring the top-level `schema_version` and required fields per §2. Full property validation can land later.
- DB: the workspace ships `migrations/V001__init.sql` containing **only** the §5.1 `schema_meta` and `migrations` tables. There are two options for the rest: the full schema lands in p1-6's migration file, or p0-1 pre-stages the empty migrations directory. Choose the former to keep this task within `kebab-core`/`kebab-config`/`kebab-app`/`kebab-cli` scope.
- Logging: `tracing` initialized in `kebab-cli`; daily-rolling file in `~/.local/state/kebab/logs/`.
## Test plan
| kind | description | fixture / data |
|------|-------------|----------------|
| unit | `id_from` deterministic across 1000 runs for fixed inputs | inline |
| unit | each `id_for_*` recipe matches design §4.2 byte-for-byte (verify against fixed expected hex) | inline |
| unit | `to_posix` collapses `./a//b.md` → `a/b.md` and NFC-normalizes Korean | inline |
| unit | `Citation::to_uri` and `parse` round-trip for all 5 variants | inline |
| unit | newtype `Display`/`FromStr` rejects invalid lengths/chars | inline |
| unit | `Config::defaults` + env override + CLI override produces expected merged config | inline |
| snapshot | `Config::defaults` JSON serde stable | inline (round-trip) |
| smoke | `kebab --help`, `kebab init`, `kebab doctor` run; doctor reports config_loaded ✓ data_dir_writable ✓ even with no DB present (downstream checks may fail with hint) | tmp `XDG_*` env |
| build | `cargo check --workspace` and `cargo test --workspace` pass | repo |
All tests must run with no network, no Ollama, no models.
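The `to_posix` row in the table above can be exercised against a std-only sketch of the path-collapsing half. This is not the contract implementation: `to_posix_sketch` is a hypothetical name, it returns a plain `String` instead of `WorkspacePath`, and it omits NFC normalization (which needs the `unicode-normalization` crate). It also silently drops `..` and root components, which a real implementation would have to handle deliberately.

```rust
use std::path::{Component, Path};

/// Join the normal components of `path` with `/`, dropping `.` segments and
/// repeated separators. Assumption: `..`, root, and prefix components are
/// discarded here only to keep the sketch short.
pub fn to_posix_sketch(path: &Path) -> String {
    let parts: Vec<String> = path
        .components()
        .filter_map(|c| match c {
            Component::Normal(s) => Some(s.to_string_lossy().into_owned()),
            _ => None,
        })
        .collect();
    parts.join("/")
}
```

`Path::components` already ignores repeated separators, so the only extra work is filtering out the leading `.` component.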
## Definition of Done
- [ ] `Cargo.toml` workspace lists `kebab-core`, `kebab-parse-types`, `kebab-config`, `kebab-app`, `kebab-cli` and resolver=3, edition 2024
- [ ] `cargo check --workspace` passes
- [ ] `cargo test --workspace` passes
- [ ] `kebab-parse-types` `cargo tree` shows ONLY `kebab-core` + `serde`/`thiserror` style deps (no parser libs, no other `kebab-*`)
- [ ] `kebab --help` prints subcommands
- [ ] `kebab init` creates XDG dirs idempotently and writes `config.toml`
- [ ] `kebab doctor` returns wire JSON conforming to `doctor.v1` (in `--json` mode)
- [ ] `docs/wire-schema/v1/*.schema.json` stubs exist (7 files: citation, search_hit, answer, ingest_report, doc_summary, chunk_inspection, doctor)
- [ ] `docs/spec/` stubs exist linking to the frozen design (one file per: domain-model, ids, canonical-document, chunk-policy, citation-policy, module-boundaries, ai-generation-guidelines)
- [ ] `fixtures/` root directory created with all subdirectories that downstream tasks reference: `fixtures/markdown/`, `fixtures/source-fs/`, `fixtures/search/lexical/`, `fixtures/search/hybrid/`, `fixtures/embed/`, `fixtures/vector/`, `fixtures/rag/`, `fixtures/eval/`, `fixtures/image/`, `fixtures/pdf/`, `fixtures/audio/`. Each subdir gets a `.gitkeep` so it tracks. P1 ships at minimum `fixtures/markdown/{simple-note,nested-headings,code-and-table}.md` (per epic phase-0); other dirs stay empty until their phase lands.
- [ ] No imports outside Allowed dependencies (CI deny check)
- [ ] PR body links design §3, §3.7b, §4, §6, §7, §8, §9, §10
## Out of scope
- Any parser / store / embedder / search / llm / rag / tui / desktop logic (downstream phases).
- Full schema migrations (most DDL lands in p1-6 / p2-1 / p3-3).
- Wire schema deep validation (only required fields + `schema_version` checked here).
- Real `kebab-app` business logic (functions are stubs that return an error via `bail!`).
## Risks / notes
- ID recipe is the contract that every later record depends on. Any change after this task lands forces a `parser_version` / `chunker_version` / `embedding_version` cascade per §9. Treat changes as schema migrations and update the design doc first.
- Newtype IDs use `String` (not `[u8; 16]`) to keep serde simple; tests must still enforce 32-char hex constraint on `FromStr`.
- `kebab-app` stubs must use `bail!` not `panic!` so the CLI exits with code 2 cleanly per §10.
- `clap` v4 derive: subcommand `inspect` has nested `doc` / `chunk` variants; ensure exit code 0/1/2 mapping wraps the facade call uniformly.
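- XDG path discovery on macOS: the spec uses XDG (not `Application Support`) per §6.1. The `dirs` crate reads the `XDG_*` env vars only on Linux; on macOS `dirs::config_dir()` returns `~/Library/Application Support`, so `kebab-config` needs an explicit `XDG_*` lookup with `~/.config`-style fallbacks there. Tests must set the `XDG_*` vars explicitly.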
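
One way to make XDG resolution independent of platform defaults is to read the `XDG_*` variables directly and fall back to the XDG default locations under `$HOME`. The sketch below is std-only and takes the environment as a map so tests can set it explicitly. The function names are hypothetical, and ignoring relative values follows the XDG Base Directory Specification rather than anything stated in this task.

```rust
use std::collections::HashMap;
use std::path::PathBuf;

/// Resolve one XDG base dir for kebab: use `var` when it holds an absolute
/// path, otherwise fall back to `$HOME/<fallback>`. Returns `None` when the
/// fallback is needed and `HOME` is missing.
pub fn xdg_dir(env: &HashMap<String, String>, var: &str, fallback: &str) -> Option<PathBuf> {
    let base = match env.get(var).map(|s| PathBuf::from(s.as_str())) {
        Some(p) if p.is_absolute() => p,
        _ => PathBuf::from(env.get("HOME")?.as_str()).join(fallback),
    };
    Some(base.join("kebab"))
}

/// `<config home>/kebab/config.toml`
pub fn config_path(env: &HashMap<String, String>) -> Option<PathBuf> {
    xdg_dir(env, "XDG_CONFIG_HOME", ".config").map(|d| d.join("config.toml"))
}

/// `<data home>/kebab`
pub fn data_dir(env: &HashMap<String, String>) -> Option<PathBuf> {
    xdg_dir(env, "XDG_DATA_HOME", ".local/share")
}

/// `<state home>/kebab`
pub fn state_dir(env: &HashMap<String, String>) -> Option<PathBuf> {
    xdg_dir(env, "XDG_STATE_HOME", ".local/state")
}

/// `<cache home>/kebab`
pub fn cache_dir(env: &HashMap<String, String>) -> Option<PathBuf> {
    xdg_dir(env, "XDG_CACHE_HOME", ".cache")
}
```

In production the map would be built from `std::env::vars()`. In tests it is built by hand, which avoids mutating the process environment.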