Address 8 issues found in spec audit (post PR #2):
1. §refs label: distinguish design vs report sections in p3-1 / p3-2 / p4-2 /
p9-1 / p9-5 contract_sections (e.g., "report §11.2 Ollama" not "§11.2").
2. mock feature gate: gate MockEmbedder (p3-1) and MockLanguageModel (p4-1)
behind `mock` cargo feature, default OFF; add CI symbol-scan as DoD item.
3. Warning type unification: p1-2 frontmatter now emits
`kb_parse_types::Warning` (matches p1-3 / p1-4); drops crate-internal type.
4. p4-3 streaming thread: explicitly single-threaded inside RagPipeline::ask;
collection + sink.send share the calling thread, no race. UI concurrency
is the caller's responsibility (TUI worker thread pattern in p9-3).
5. p6-2 tesseract version: noted that `tesseract` 0.13 has no stable Rust
`version()` accessor; use TessVersion FFI or shell-out + cache approach.
6. p9-* App struct extensions: introduce `kb_tui::{Library,Search,Ask,Inspect}State`
slots in p9-1 forward-decl form; p9-2/3/4 fill bodies in their own crate
without editing `App`. Parallel-safety contract added.
7. p3-3 cosine score: shift `(sim+1)/2` instead of clamp; preserve ranking
signal between unrelated and opposite vectors. Clamp reserved for NaN.
8. fixtures/ root: p0-1 DoD now creates all fixture subdirs with .gitkeep so
downstream tasks have a stable target path.
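
A minimal sketch of the scoring change in item 7, assuming plain `f32` slices. The function names `cosine_similarity`, `shifted_score`, and `clamped_score` are hypothetical, and mapping NaN to 0.0 is one reading of "clamp reserved for NaN", not necessarily what p3-3 does.

```rust
/// Cosine similarity of two equal-length vectors.
/// Returns NaN when either vector has zero norm (0.0 / 0.0).
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "vectors must have the same dimension");
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (norm_a * norm_b)
}

/// Shift a similarity in [-1, 1] onto [0, 1] with (sim + 1) / 2.
/// Assumption: a NaN similarity is pinned to the lowest score, 0.0.
fn shifted_score(sim: f32) -> f32 {
    if sim.is_nan() {
        return 0.0;
    }
    (sim + 1.0) / 2.0
}

/// The previous behaviour, kept only for contrast: clamping to [0, 1]
/// maps every non-positive similarity to the same score.
fn clamped_score(sim: f32) -> f32 {
    sim.clamp(0.0, 1.0)
}
```

With the shift, an unrelated pair (similarity 0.0) scores 0.5 and an opposite pair (similarity -1.0) scores 0.0, so the two stay ordered. With the clamp, both collapse to 0.0.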
Stand up the Cargo workspace (Rust 2024, resolver=3) with kb-core, kb-parse-types, kb-config, kb-app, kb-cli crates. Freeze every domain type, trait, ID recipe, error type, and CLI entry shape per the frozen design doc so that all subsequent component tasks compile against stable contracts.
Why now / why this size
Every other task imports kb-core. If types or trait signatures wobble after this point, every downstream task spec drifts. This task is large but indivisible: types + traits + ID recipe + facade + CLI skeleton + wire schema stubs must land together so the rest of the workspace can compile against them.
Allowed dependencies
workspace [workspace.dependencies]: anyhow = "1", thiserror = "2", serde = { version = "1", features = ["derive"] }, serde_json = "1", time = { version = "0.3", features = ["serde", "macros"] }, uuid = { version = "1", features = ["v7", "serde"] }, blake3 = "1", tracing = "0.1"
kb-cli: workspace deps + kb-core, kb-config, kb-app, clap = { version = "4", features = ["derive"] }
Forbidden dependencies
kb-core MUST NOT depend on any other kb-* crate.
kb-parse-types MUST depend ONLY on kb-core. No parser libraries (pulldown-cmark, pdf-extract, image, whisper-rs, …), no other kb-* crate.
kb-config MUST NOT depend on kb-app, kb-cli, parsers, stores, embedders, search, llm, rag, tui, desktop.
kb-app MUST NOT yet depend on parsers/stores/embedders/search/llm/rag (those crates do not exist yet — facade methods stub out and return unimplemented!() or anyhow::bail!("not yet wired (Pn-i)")).
kb-cli MUST NOT call any non-kb-app crate directly.
All facade-returned wire objects emit schema_version per §2 (e.g., "answer.v1", "search_hit.v1").
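
A rough sketch of the stub pattern described for kb-app above: the facade returns an error instead of panicking, and the CLI turns that error into a process exit code. It uses only the standard library as a stand-in for `anyhow::bail!`. The names `NotWired`, `ask_stub`, `exit_code_for`, and `run` are hypothetical. The exit code 2 follows the note on §10 under Risks / notes.

```rust
use std::process::ExitCode;

/// Hypothetical error type standing in for the anyhow error that
/// `bail!("not yet wired (Pn-i)")` would produce.
#[derive(Debug)]
struct NotWired(&'static str);

/// A facade method that is not implemented yet. It returns an error
/// rather than panicking, so the caller decides how to exit.
fn ask_stub() -> Result<String, NotWired> {
    Err(NotWired("not yet wired (Pn-i)"))
}

/// Map a facade result to a numeric exit status: 0 on success, 2 on error.
fn exit_code_for<T>(result: &Result<T, NotWired>) -> u8 {
    match result {
        Ok(_) => 0,
        Err(_) => 2,
    }
}

/// What the CLI entry point could do: report the error on stderr and
/// return the mapped exit code, with no panic and no backtrace.
fn run() -> ExitCode {
    let result = ask_stub();
    if let Err(NotWired(msg)) = &result {
        eprintln!("error: {msg}");
    }
    ExitCode::from(exit_code_for(&result))
}
```

A `fn main() -> ExitCode { run() }` wrapper would be enough to observe the exit status from a shell.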
Storage / wire effects
Filesystem: creates ~/.config/kb/, ~/.local/share/kb/, ~/KnowledgeBase/ only when kb init runs; never on Config::load.
Wire schemas: ships docs/wire-schema/v1/{citation,search_hit,answer,ingest_report,doc_summary,chunk_inspection,doctor}.schema.json as stubs declaring the top-level schema_version and required fields per §2. Full property validation can land later.
DB: the workspace ships migrations/V001__init.sql containing only the §5.1 schema_meta and migrations tables. Two options exist for the rest: the full schema lands in p1-6's migration file, or p0-1 pre-stages the empty migrations directory. Choose the former to keep this task within kb-core/kb-config/kb-app/kb-cli scope.
Logging: tracing initialized in kb-cli; daily-rolling file in ~/.local/state/kb/logs/.
Test plan
| kind | description | fixture / data |
|---|---|---|
| unit | id_from deterministic across 1000 runs for fixed inputs | inline |
| unit | each id_for_* recipe matches design §4.2 byte-for-byte (verify against fixed expected hex) | inline |
| unit | to_posix collapses ./a//b.md → a/b.md and NFC-normalizes Korean | inline |
| unit | Citation::to_uri and parse round-trip for all 5 variants | |
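
To illustrate the to_posix case in the test plan, here is a sketch of the separator-collapsing half only. The function name `collapse_posix` is hypothetical and the sketch assumes relative, `/`-separated input. The NFC normalization step is deliberately left out because it needs a Unicode normalization library outside the standard library.

```rust
/// Collapse a relative path: drop empty segments (from "//") and
/// "." segments, then rejoin with single "/" separators.
/// Assumption: ".." segments and absolute paths are out of scope here
/// and pass through without special handling.
fn collapse_posix(path: &str) -> String {
    let mut parts: Vec<&str> = Vec::new();
    for segment in path.split('/') {
        match segment {
            "" | "." => {} // skip duplicate separators and current-dir markers
            other => parts.push(other),
        }
    }
    parts.join("/")
}
```

For the input named in the test plan, `collapse_posix("./a//b.md")` yields `a/b.md`.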
docs/spec/ stubs exist linking to the frozen design (one file per: domain-model, ids, canonical-document, chunk-policy, citation-policy, module-boundaries, ai-generation-guidelines)
fixtures/ root directory created with all subdirectories that downstream tasks reference: fixtures/markdown/, fixtures/source-fs/, fixtures/search/lexical/, fixtures/search/hybrid/, fixtures/embed/, fixtures/vector/, fixtures/rag/, fixtures/eval/, fixtures/image/, fixtures/pdf/, fixtures/audio/. Each subdir gets a .gitkeep so that git tracks it while empty. P1 ships at minimum fixtures/markdown/{simple-note,nested-headings,code-and-table}.md (per epic phase-0); the other dirs stay empty until their phase lands.
No imports outside Allowed dependencies (CI deny check)
Real kb-app business logic (functions stub with unimplemented!() or explicit bail!).
Risks / notes
ID recipe is the contract that every later record depends on. Any change after this task lands forces a parser_version / chunker_version / embedding_version cascade per §9. Treat changes as schema migrations and update the design doc first.
Newtype IDs use String (not [u8; 16]) to keep serde simple; tests must still enforce the 32-char hex constraint in FromStr.
kb-app stubs must use bail!, not panic! (and not unimplemented!(), which panics), so that the CLI exits cleanly with code 2 per §10.
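
A minimal sketch of the String-backed newtype with the 32-char hex check, using only the standard library. The names `DocId` and `IdParseError` are hypothetical, and accepting lowercase hex only is an assumption; the design doc may allow uppercase.

```rust
use std::fmt;
use std::str::FromStr;

/// Hypothetical ID newtype: a String that is guaranteed to hold
/// exactly 32 lowercase hex characters once constructed via FromStr.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct DocId(String);

/// Reasons a string is rejected as an ID.
#[derive(Debug, PartialEq, Eq)]
pub enum IdParseError {
    WrongLength(usize),
    NonHexChar(char),
}

impl FromStr for DocId {
    type Err = IdParseError;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        // Count characters, not bytes, so multi-byte input is reported sensibly.
        let len = s.chars().count();
        if len != 32 {
            return Err(IdParseError::WrongLength(len));
        }
        // Assumption: only 0-9 and a-f are valid; uppercase is rejected.
        if let Some(bad) = s.chars().find(|c| !matches!(c, '0'..='9' | 'a'..='f')) {
            return Err(IdParseError::NonHexChar(bad));
        }
        Ok(DocId(s.to_owned()))
    }
}

impl fmt::Display for DocId {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(&self.0)
    }
}
```

Keeping the inner field private means the only way to obtain a `DocId` from text is through the validated `FromStr` path, which is what lets the type stay a plain String for serde while still carrying the constraint.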