PR #1 review left a design-debt note: ParsedBlock landing in kb-core would
(a) force every crate to recompile on parser-internal changes, and
(b) cause namespace pollution when P6/P7/P8 parsers add their own variants.
Resolution: a new thin crate kb-parse-types sits between kb-core and parsers.
It owns ParsedBlock, ParsedPayload, Warning, and forward references for the
image/pdf/audio parser intermediates. It depends on kb-core only (for
SourceSpan / Inline).
Updates:
- design §3.7b: add new section defining kb-parse-types
- design §8: add kb-parse-types to module-boundary diagram + forbidden list
- design §3.4: Inline stays in kb-core; kb-parse-types references it (no duplication)
- p0-1 skeleton: workspace + Cargo deps + public surface block
- p1-3 parse-md-blocks: outputs Vec<kb_parse_types::ParsedBlock> directly
- p1-4 normalize: Allowed gains kb-parse-types, drops cross-coupling note
- INDEX + phase-0 epic: list kb-parse-types in P0 deliverables
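
To make the layering concrete, here is a minimal sketch of what the split could look like. Every field and variant below is an assumption for illustration; only the type names ParsedBlock, ParsedPayload, Warning, SourceSpan, and Inline come from the note above, and parse_paragraphs is a hypothetical stand-in for a real parser crate. The three layers are flattened into one file so the sketch compiles with std alone.

```rust
// --- kb-core (stand-ins; the real definitions live in design §3.4) ---
#[derive(Debug, Clone, PartialEq)]
pub struct SourceSpan {
    pub start: usize, // byte offset, inclusive (assumed)
    pub end: usize,   // byte offset, exclusive (assumed)
}

#[derive(Debug, Clone, PartialEq)]
pub enum Inline {
    Text(String),
    Code(String),
}

// --- kb-parse-types (hypothetical public surface) ---
#[derive(Debug, Clone, PartialEq)]
pub struct Warning {
    pub span: SourceSpan,
    pub message: String,
}

#[derive(Debug, Clone, PartialEq)]
pub enum ParsedPayload {
    Paragraph(Vec<Inline>),
    Heading { level: u8, content: Vec<Inline> },
    // Later parsers (image/pdf/audio) would add variants here,
    // without touching kb-core.
}

#[derive(Debug, Clone, PartialEq)]
pub struct ParsedBlock {
    pub span: SourceSpan,
    pub payload: ParsedPayload,
    pub warnings: Vec<Warning>,
}

// --- a parser crate (hypothetical): emits ParsedBlock values directly ---
pub fn parse_paragraphs(src: &str) -> Vec<ParsedBlock> {
    let mut blocks = Vec::new();
    let mut offset = 0;
    for chunk in src.split("\n\n") {
        let text = chunk.trim();
        if !text.is_empty() {
            blocks.push(ParsedBlock {
                span: SourceSpan { start: offset, end: offset + chunk.len() },
                payload: ParsedPayload::Paragraph(vec![Inline::Text(text.to_string())]),
                warnings: Vec::new(),
            });
        }
        // Skip the chunk plus the two-byte "\n\n" separator.
        offset += chunk.len() + 2;
    }
    blocks
}
```

With this shape, a change to ParsedPayload recompiles kb-parse-types and the parsers, but not kb-core or the crates that depend only on it.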
Stand up the Cargo workspace (Rust 2024, resolver=3) with kb-core, kb-parse-types, kb-config, kb-app, kb-cli crates. Freeze every domain type, trait, ID recipe, error type, and CLI entry shape per the frozen design doc so that all subsequent component tasks compile against stable contracts.
Why now / why this size
Every other task imports kb-core. If types or trait signatures wobble after this point, every downstream task spec drifts. This task is large but indivisible: types + traits + ID recipe + facade + CLI skeleton + wire schema stubs must land together so the rest of the workspace can compile against them.
Allowed dependencies
workspace [workspace.dependencies]: anyhow = "1", thiserror = "2", serde = { version = "1", features = ["derive"] }, serde_json = "1", time = { version = "0.3", features = ["serde", "macros"] }, uuid = { version = "1", features = ["v7", "serde"] }, blake3 = "1", tracing = "0.1"
kb-cli: workspace deps + kb-core, kb-config, kb-app, clap = { version = "4", features = ["derive"] }
Forbidden dependencies
kb-core MUST NOT depend on any other kb-* crate.
kb-parse-types MUST depend ONLY on kb-core. No parser libraries (pulldown-cmark, pdf-extract, image, whisper-rs, …), no other kb-* crate.
kb-config MUST NOT depend on kb-app, kb-cli, parsers, stores, embedders, search, llm, rag, tui, desktop.
kb-app MUST NOT yet depend on parsers/stores/embedders/search/llm/rag. Those crates do not exist yet, so facade methods are stubs that fail with anyhow::bail!("not yet wired (Pn-i)") (preferred; see Risks / notes) or unimplemented!().
kb-cli MUST NOT call any non-kb-app crate directly.
All facade-returned wire objects emit schema_version per §2 (e.g., "answer.v1", "search_hit.v1").
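
The stubbed-facade rule can be sketched as follows. This is an illustration under stated assumptions, not the project's code: App::search, run, and NotWired are hypothetical names, and NotWired stands in for anyhow::Error so the sketch needs only std. The point it shows is that a stub returning Err lets the CLI map the failure to a clean exit code (the spec's Risks section asks for 2) instead of unwinding from a panic.

```rust
use std::fmt;

/// Stand-in for the error a stub would raise via anyhow::bail!.
#[derive(Debug)]
pub struct NotWired(pub &'static str);

impl fmt::Display for NotWired {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "not yet wired ({})", self.0)
    }
}

impl std::error::Error for NotWired {}

/// Hypothetical facade type living in kb-app.
pub struct App;

impl App {
    /// Stub: the search crates do not exist yet, so report an error
    /// rather than panic. "Pn-i" is the placeholder task ID from the spec.
    pub fn search(&self, _query: &str) -> Result<Vec<String>, NotWired> {
        Err(NotWired("Pn-i"))
    }
}

/// What kb-cli might do: call only the facade and translate the
/// outcome into a process exit code.
pub fn run(app: &App, query: &str) -> u8 {
    match app.search(query) {
        Ok(hits) => {
            for hit in hits {
                println!("{hit}");
            }
            0
        }
        Err(err) => {
            eprintln!("error: {err}");
            2
        }
    }
}
```

A binary's main could then return std::process::ExitCode::from(run(&App, "query")).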
Storage / wire effects
Filesystem: creates ~/.config/kb/, ~/.local/share/kb/, ~/KnowledgeBase/ only when kb init runs; never on Config::load.
Wire schemas: ships docs/wire-schema/v1/{citation,search_hit,answer,ingest_report,doc_summary,chunk_inspection,doctor}.schema.json as stubs declaring the top-level schema_version and required fields per §2. Full property validation can land later.
DB: the workspace ships migrations/V001__init.sql containing only the §5.1 schema_meta and migrations tables. The full schema lands in p1-6's migration file. The alternative, having p0-1 pre-stage an empty migrations directory, is not chosen, which keeps this task within kb-core/kb-config/kb-app/kb-cli scope.
Logging: tracing initialized in kb-cli; daily-rolling file in ~/.local/state/kb/logs/.
Test plan
| kind | description | fixture / data |
|------|-------------|----------------|
| unit | id_from deterministic across 1000 runs for fixed inputs | inline |
| unit | each id_for_* recipe matches design §4.2 byte-for-byte (verify against fixed expected hex) | inline |
| unit | to_posix collapses ./a//b.md → a/b.md and NFC-normalizes Korean | inline |
| unit | Citation::to_uri and parse round-trip for all 5 variants | |
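
As an illustration of the to_posix row above, here is a minimal sketch of the path-collapsing half only. It assumes relative paths with '/' separators, and it deliberately omits NFC normalization, which needs a Unicode library outside std (the unicode-normalization crate is one option). The signature is an assumption; the real one is whatever the design doc freezes.

```rust
/// Collapse a relative path into canonical POSIX form by dropping
/// empty segments (from "//") and "." segments.
/// Assumption: the input is relative; a leading "/" would be lost.
/// NFC normalization is NOT performed in this sketch.
pub fn to_posix(path: &str) -> String {
    path.split('/')
        .filter(|seg| !seg.is_empty() && *seg != ".")
        .collect::<Vec<_>>()
        .join("/")
}
```

For the input in the test plan, to_posix("./a//b.md") yields "a/b.md". A ".." segment is passed through untouched here; whether the real function resolves or rejects it is left to the design doc.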
Real kb-app business logic (functions stub with unimplemented!() or explicit bail!).
Risks / notes
ID recipe is the contract that every later record depends on. Any change after this task lands forces a parser_version / chunker_version / embedding_version cascade per §9. Treat changes as schema migrations and update the design doc first.
Newtype IDs use String (not [u8; 16]) to keep serde simple; tests must still enforce the 32-char hex constraint in FromStr.
kb-app stubs must use bail!, not panic!, so the CLI exits with code 2 cleanly per §10.
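
The String-backed newtype note above can be sketched like this. DocId and IdParseError are hypothetical names, and accepting lowercase hex only is an assumption of this sketch; the source states only that the ID is 32 hex characters.

```rust
use std::str::FromStr;

/// Hypothetical ID newtype: a String that is always 32 lowercase hex chars.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct DocId(String);

#[derive(Debug, PartialEq, Eq)]
pub enum IdParseError {
    /// Input had this many characters instead of 32.
    WrongLength(usize),
    /// First character that is not in 0-9 or a-f.
    NonHex(char),
}

impl FromStr for DocId {
    type Err = IdParseError;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        let len = s.chars().count();
        if len != 32 {
            return Err(IdParseError::WrongLength(len));
        }
        // Assumption: uppercase hex is rejected so each ID has one spelling.
        if let Some(bad) = s.chars().find(|c| !matches!(c, '0'..='9' | 'a'..='f')) {
            return Err(IdParseError::NonHex(bad));
        }
        Ok(DocId(s.to_owned()))
    }
}

impl DocId {
    pub fn as_str(&self) -> &str {
        &self.0
    }
}
```

Keeping the inner field private means the only way to build a DocId from text is through FromStr, so the constraint holds everywhere the type appears. A serde Deserialize impl would need to route through the same check.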