kb bc1b3147cd refactor(spec): cleanup pass over component specs

Address 8 issues found in spec audit (post PR #2):

1. §refs label: distinguish design vs report sections in p3-1 / p3-2 / p4-2 /
   p9-1 / p9-5 contract_sections (e.g., "report §11.2 Ollama" not "§11.2").
2. mock feature gate: gate MockEmbedder (p3-1) and MockLanguageModel (p4-1)
   behind `mock` cargo feature, default OFF; add CI symbol-scan as DoD item.
3. Warning type unification: p1-2 frontmatter now emits
   `kb_parse_types::Warning` (matches p1-3 / p1-4); drops crate-internal type.
4. p4-3 streaming thread: explicitly single-threaded inside RagPipeline::ask;
   collection + sink.send share the calling thread, no race. UI concurrency
   is the caller's responsibility (TUI worker thread pattern in p9-3).
5. p6-2 tesseract version: noted that `tesseract` 0.13 has no stable Rust
   `version()` accessor; use TessVersion FFI or shell-out + cache approach.
6. p9-* App struct extensions: introduce `kb_tui::{Library,Search,Ask,Inspect}State`
   slots in p9-1 forward-decl form; p9-2/3/4 fill bodies in their own crate
   without editing `App`. Parallel-safety contract added.
7. p3-3 cosine score: shift `(sim+1)/2` instead of clamp; preserve ranking
   signal between unrelated and opposite vectors. Clamp reserved for NaN.
8. fixtures/ root: p0-1 DoD now creates all fixture subdirs with .gitkeep so
   downstream tasks have a stable target path.

2026-04-27 23:38:13 +00:00

field	value
phase	
component	kb-llm (trait crate)
task_id	p4-1
title	LanguageModel trait + GenerateRequest/TokenChunk
status	planned
depends_on	p0-1
unblocks	p4-2, p4-3
contract_source	../../docs/superpowers/specs/2026-04-27-kb-final-form-design.md
contract_sections	§7.1 GenerateRequest/TokenChunk; §7.2 LanguageModel; §0 Q5 streaming; §3.8 ModelRef

p4-1 — LanguageModel trait crate

Goal

Provide the kb-llm crate that re-exports the LanguageModel trait and helper types (GenerateRequest, TokenChunk, FinishReason, TokenUsage, ModelRef), plus a MockLanguageModel for downstream tests.

Why now / why this size

kb-rag (p4-3) consumes a LanguageModel trait object. Owning the trait + a deterministic mock here lets RAG tests run with no Ollama dependency. Real adapters (Ollama, llama.cpp, candle) live in p4-2 and beyond.

Allowed dependencies

kb-core
kb-config
serde
thiserror
tracing
[features] mock = [] — opt-in feature flag exposing MockLanguageModel. Default OFF. Release builds compile mock out entirely.

Forbidden dependencies

reqwest, ureq, tokio, whisper-rs, kb-source-fs, kb-parse-md, kb-normalize, kb-chunk, kb-store-*, kb-embed*, kb-search, kb-rag, kb-tui, kb-desktop

Inputs

input	type	source
`GenerateRequest`	`kb_core::GenerateRequest`	RAG pipeline
concrete adapter at runtime	`dyn LanguageModel`	p4-2+

Outputs

output	type	downstream
streaming `TokenChunk` iterator	`Box<dyn Iterator<Item=anyhow::Result<TokenChunk>> + Send>`	RAG pipeline
`ModelRef` identity	`kb_core::ModelRef`	Answer.model

Public surface (signatures only — no new types)

pub use kb_core::{LanguageModel, GenerateRequest, TokenChunk, FinishReason, TokenUsage, ModelRef};

/// Test-only deterministic mock. Compiled only when `mock` feature is on.
#[cfg(feature = "mock")]
pub struct MockLanguageModel {
    pub model_id: String,
    pub provider: String,
    pub context_tokens: usize,
    pub canned_response: String,                 // emitted token-by-token
    pub canned_finish: kb_core::FinishReason,
    pub canned_usage:  kb_core::TokenUsage,
}

#[cfg(feature = "mock")]
impl kb_core::LanguageModel for MockLanguageModel { /* per §7.2 */ }

Behavior contract

MockLanguageModel::generate_stream produces a Box<dyn Iterator> that yields the canned response one Unicode character at a time as TokenChunk::Token, then a final TokenChunk::Done { finish_reason, usage }.
The mock honors GenerateRequest.stop: if any stop string appears in the canned response, the response is truncated at that stop string before any token is emitted.
model_ref() returns ModelRef { id, provider, dimensions: None }.
The mock must NOT touch the network or filesystem.
Real adapters (p4-2+) MUST NOT live in this crate.
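
The streaming and stop-string rules above can be sketched as follows. This is a minimal illustration and not the crate's implementation: the types below are local stand-ins for the `kb_core` definitions (whose exact fields are not shown in this spec), the error type is simplified to `String` instead of `anyhow::Error`, and cutting at the earliest stop-string match is an assumed policy.

```rust
// Stand-in types: the real ones live in kb_core and may differ in shape.
#[derive(Debug, Clone, PartialEq)]
pub enum FinishReason { Stop, Length, Error(String) }

#[derive(Debug, Clone, PartialEq, Default)]
pub struct TokenUsage { pub prompt_tokens: usize, pub completion_tokens: usize }

#[derive(Debug, Clone, PartialEq)]
pub enum TokenChunk {
    Token(String),
    Done { finish_reason: FinishReason, usage: TokenUsage },
}

// Hypothetical request shape; only `stop` matters for this sketch.
pub struct GenerateRequest { pub prompt: String, pub stop: Vec<String> }

pub struct MockLanguageModel {
    pub canned_response: String,
    pub canned_finish: FinishReason,
    pub canned_usage: TokenUsage,
}

/// Cut `text` at the earliest occurrence of any non-empty stop string.
/// (Assumed policy: the stop string itself is not emitted.)
fn truncate_at_stop<'a>(text: &'a str, stops: &[String]) -> &'a str {
    let cut = stops
        .iter()
        .filter(|s| !s.is_empty())
        .filter_map(|s| text.find(s.as_str()))
        .min()
        .unwrap_or(text.len());
    // `find` returns a byte offset on a char boundary, so slicing is safe.
    &text[..cut]
}

impl MockLanguageModel {
    pub fn generate_stream(
        &self,
        req: &GenerateRequest,
    ) -> Box<dyn Iterator<Item = Result<TokenChunk, String>> + Send> {
        // Truncate first, then emit one Unicode scalar value per token.
        let kept = truncate_at_stop(&self.canned_response, &req.stop);
        let tokens: Vec<Result<TokenChunk, String>> = kept
            .chars()
            .map(|c| Ok(TokenChunk::Token(c.to_string())))
            .collect();
        // The final chunk always carries the canned finish reason and usage.
        let done = TokenChunk::Done {
            finish_reason: self.canned_finish.clone(),
            usage: self.canned_usage.clone(),
        };
        Box::new(tokens.into_iter().chain(std::iter::once(Ok(done))))
    }
}
```

Collecting the tokens into a `Vec` up front keeps the returned iterator free of borrows, so it is `Send` and `'static` without any extra lifetime plumbing; for a mock with short canned text the allocation is negligible.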

Storage / wire effects

None.

Test plan

kind	description	fixture / data
unit	mock with a 5-character canned response streams 5 tokens, then `Done`	inline
unit	mock honors stop strings	inline
unit	trait dyn dispatch via `Box<dyn LanguageModel>` works	inline
unit	concatenation of streamed `TokenChunk::Token` equals canned text (truncated by stop strings)	inline
contract	`model_ref()` populates `provider` and leaves `dimensions = None`	inline

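All tests run under cargo test -p kb-llm. Tests that use MockLanguageModel additionally need --features mock, because the feature is off by default.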
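
The dyn-dispatch and concatenation rows of the test plan could be exercised roughly as below. This is a sketch under assumptions: the trait, `ModelRef`, and the `Fixed` model are simplified local stand-ins (the real trait is defined in `kb_core` per §7.2 and takes a `GenerateRequest`), `dimensions` is assumed to be `Option<usize>`, and `collect_answer` is a hypothetical helper standing in for what a consumer such as the RAG pipeline would do.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum FinishReason { Stop, Error(String) }

#[derive(Debug, Clone, PartialEq)]
pub enum TokenChunk { Token(String), Done { finish_reason: FinishReason } }

#[derive(Debug, Clone, PartialEq)]
pub struct ModelRef { pub id: String, pub provider: String, pub dimensions: Option<usize> }

pub type Stream = Box<dyn Iterator<Item = Result<TokenChunk, String>> + Send>;

// Simplified stand-in for kb_core::LanguageModel.
pub trait LanguageModel {
    fn model_ref(&self) -> ModelRef;
    fn generate_stream(&self, prompt: &str) -> Stream;
}

/// Hypothetical fixed-output model standing in for MockLanguageModel.
pub struct Fixed { pub text: String }

impl LanguageModel for Fixed {
    fn model_ref(&self) -> ModelRef {
        // Language models have no embedding dimensionality, hence None.
        ModelRef { id: "fixed".into(), provider: "mock".into(), dimensions: None }
    }

    fn generate_stream(&self, _prompt: &str) -> Stream {
        let mut chunks: Vec<Result<TokenChunk, String>> = self
            .text
            .chars()
            .map(|c| Ok(TokenChunk::Token(c.to_string())))
            .collect();
        chunks.push(Ok(TokenChunk::Done { finish_reason: FinishReason::Stop }));
        Box::new(chunks.into_iter())
    }
}

/// Drain a stream through the trait object, concatenating tokens until Done.
pub fn collect_answer(
    model: &dyn LanguageModel,
    prompt: &str,
) -> Result<(String, FinishReason), String> {
    let mut text = String::new();
    for chunk in model.generate_stream(prompt) {
        match chunk? {
            TokenChunk::Token(t) => text.push_str(&t),
            TokenChunk::Done { finish_reason } => return Ok((text, finish_reason)),
        }
    }
    // A well-behaved model always ends with Done; treat its absence as an error.
    Err("stream ended without Done".to_string())
}
```

A test would box the model as `Box<dyn LanguageModel>`, call `collect_answer(&*model, "...")`, and compare the returned text with the canned response.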

Definition of Done

cargo check -p kb-llm passes
cargo test -p kb-llm passes
No HTTP / async runtime deps present
PR links design §7.2 LanguageModel, §0 Q5

Out of scope

Real adapter (p4-2).
Token counting against the actual tokenizer (best-effort via usage.prompt_tokens reported by the adapter).
Server-side cancellation / abort signals (P+).

Risks / notes

Real adapters may receive byte sequences from their backend that split a UTF-8 character mid-stream; because the trait emits TokenChunk::Token(String), adapters must buffer across UTF-8 boundaries internally.
TokenChunk::Done { usage } must always fire, even on error — adapters convert errors into FinishReason::Error(msg) and a final Done.
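
Both notes can be illustrated with one adapter-side sketch. It is not part of this crate (real adapters live in p4-2+), and everything here is an assumption for illustration: the stand-in types, the hypothetical `adapt` function, counting one completion token per emitted chunk, and returning a `Vec` instead of a lazy iterator. It also assumes the backend's output is valid UTF-8 overall, so only split characters need buffering.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum FinishReason { Stop, Error(String) }

#[derive(Debug, Clone, PartialEq, Default)]
pub struct TokenUsage { pub completion_tokens: usize }

#[derive(Debug, Clone, PartialEq)]
pub enum TokenChunk {
    Token(String),
    Done { finish_reason: FinishReason, usage: TokenUsage },
}

/// Remove and return the longest valid UTF-8 prefix of `buf`,
/// leaving any incomplete trailing bytes in the buffer.
fn take_valid_prefix(buf: &mut Vec<u8>) -> String {
    let valid_len = match std::str::from_utf8(buf.as_slice()) {
        Ok(s) => s.len(),
        Err(e) => e.valid_up_to(),
    };
    let rest = buf.split_off(valid_len);
    let head = std::mem::replace(buf, rest);
    String::from_utf8(head).expect("prefix was validated above")
}

/// Turn raw backend byte pieces into TokenChunks, always ending with Done.
pub fn adapt<I>(raw: I) -> Vec<TokenChunk>
where
    I: IntoIterator<Item = Result<Vec<u8>, String>>,
{
    let mut buf: Vec<u8> = Vec::new();
    let mut out = Vec::new();
    let mut usage = TokenUsage::default();
    let mut finish = FinishReason::Stop;

    for piece in raw {
        match piece {
            Ok(bytes) => {
                buf.extend_from_slice(&bytes);
                let text = take_valid_prefix(&mut buf);
                if !text.is_empty() {
                    usage.completion_tokens += 1;
                    out.push(TokenChunk::Token(text));
                }
            }
            Err(msg) => {
                // Errors become a finish reason rather than aborting the stream.
                finish = FinishReason::Error(msg);
                break;
            }
        }
    }
    if finish == FinishReason::Stop && !buf.is_empty() {
        finish = FinishReason::Error("stream ended inside a UTF-8 sequence".to_string());
    }
    // Done fires on every path, including the error paths above.
    out.push(TokenChunk::Done { finish_reason: finish, usage });
    out
}
```

For example, feeding the pieces `[b'h', 0xC3]` and `[0xA9, b'!']` yields the tokens `"h"` and `"é!"`: the lone `0xC3` lead byte is held back until its continuation byte arrives.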