feat(p4-1): llm-trait - kb-llm crate + MockLanguageModel #21

Merged

altair823 merged 1 commit from feat/p4-1-llm-trait into main

2026-05-01 13:45:19 +00:00


Author: altair823
SHA1: 27c669fbf9

feat(p4-1): kb-llm crate — LanguageModel trait re-export + MockLanguageModel

Establishes the kb-llm trait crate so concrete LLM adapters (p4-2
Ollama, future llama.cpp / candle) target a stable surface. Pure
re-export of kb_core::{LanguageModel, GenerateRequest, TokenChunk,
FinishReason, TokenUsage, ModelRef} plus a feature-gated deterministic
mock for downstream RAG tests (p4-3) that need an LLM trait object
without an Ollama dependency.

MockLanguageModel (cfg(feature = "mock"), default OFF):
- Holds canned_response + canned_finish + canned_usage + (model_id,
  provider, context_tokens). Pure in-memory; no I/O.
- generate_stream() honors GenerateRequest.stop: searches the canned
  response for every non-empty stop string, takes the earliest byte
  position (Iterator::min returns the first element on ties, so
  declaration order in req.stop wins), and truncates there with a
  direct byte slice (str::find returns a UTF-8 char boundary by
  contract, so the slice cannot panic).
- When a stop matches, finish_reason is overridden to Stop (matches
  OpenAI / Ollama real-world behaviour); otherwise the caller's
  canned_finish passes through verbatim.
- Emits one TokenChunk::Token per Unicode scalar value (char), NOT per
  grapheme cluster: Hangul jamo, emoji ZWJ sequences, and combining
  marks are split across chunks. Acceptable for trait-shape testing;
  real adapters MAY combine. Documented in module docs.
- Always terminates with TokenChunk::Done { finish_reason, usage } even
  if the canned response is empty. The returned iterator is a boxed
  Vec<TokenChunk>::into_iter().map(Ok), trivially Send.
- Real adapters MAY return Err from generate_stream itself (e.g.
  connection refused) before any chunk is yielded; the mock never does.
  Documented for consumers of the re-exported trait.
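
The stop-string handling described above can be sketched as follows.
This is a minimal std-only illustration, not the code in this commit:
apply_stops is a hypothetical helper name, and FinishReason here is a
local stand-in whose Length variant is an assumption (only Stop is
named in this message).

```rust
/// Local stand-in for kb_core::FinishReason (variants assumed).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum FinishReason {
    Stop,
    Length,
}

/// Truncates `canned` at the earliest match of any non-empty stop string.
/// Returns the (possibly truncated) text and the finish reason to report.
fn apply_stops<'a>(
    canned: &'a str,
    stops: &[&str],
    canned_finish: FinishReason,
) -> (&'a str, FinishReason) {
    let earliest = stops
        .iter()
        .filter(|s| !s.is_empty()) // empty stop strings are ignored
        .filter_map(|s| canned.find(*s)) // byte offset of the first match, if any
        .min(); // earliest offset; on ties the first candidate is kept

    match earliest {
        // str::find returns the start of a match, which is always a
        // char boundary, so this byte slice cannot panic.
        Some(pos) => (&canned[..pos], FinishReason::Stop),
        // No stop matched: the configured finish reason passes through.
        None => (canned, canned_finish),
    }
}
```

Calling apply_stops("hello world", &["wor", "lo"], FinishReason::Length)
yields ("hel", FinishReason::Stop), because "lo" matches at byte 3,
before "wor" at byte 6.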

Helpers:
- assert_finish_chunk(chunks) — asserts the last chunk is a Done.
  Useful for proptests asserting trait contract over random inputs.
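
As a rough sketch of the chunk stream and the helper above: the types
below are local stand-ins, not the kb_core definitions. The String
payload of Token, the TokenUsage fields, the String error type (the
real trait uses anyhow), the slice signature of assert_finish_chunk,
and the names stream_chars / into_stream are all assumptions made for
illustration.

```rust
/// Local stand-ins for the kb_core types (field names assumed).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum FinishReason {
    Stop,
    Length,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct TokenUsage {
    prompt_tokens: u32,
    completion_tokens: u32,
}

#[derive(Debug, Clone, PartialEq)]
enum TokenChunk {
    Token(String),
    Done { finish_reason: FinishReason, usage: TokenUsage },
}

/// Emits one Token per Unicode scalar value, then exactly one Done.
fn stream_chars(text: &str, finish_reason: FinishReason, usage: TokenUsage) -> Vec<TokenChunk> {
    let mut chunks: Vec<TokenChunk> = text
        .chars()
        .map(|c| TokenChunk::Token(c.to_string()))
        .collect();
    // Done is pushed unconditionally, so an empty text still terminates.
    chunks.push(TokenChunk::Done { finish_reason, usage });
    chunks
}

/// Boxed, Send iterator over Ok(chunk) items, mirroring the shape
/// described above with String standing in for the error type.
type ChunkStream = Box<dyn Iterator<Item = Result<TokenChunk, String>> + Send>;

fn into_stream(chunks: Vec<TokenChunk>) -> ChunkStream {
    Box::new(chunks.into_iter().map(Ok::<TokenChunk, String>))
}

/// Panics unless the last chunk is a Done.
fn assert_finish_chunk(chunks: &[TokenChunk]) {
    assert!(
        matches!(chunks.last(), Some(TokenChunk::Done { .. })),
        "stream must end with TokenChunk::Done"
    );
}
```

Note that "e\u{301}" (e plus a combining acute accent) is one grapheme
cluster but two chars, so this sketch emits two Token chunks for it.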

Tests:
- cargo test -p kb-llm (no features): 2 reexport / dyn-dispatch tests.
- cargo test -p kb-llm --features mock: 9 tests, including a 100-case
  proptest over random Unicode strings asserting the Done terminator,
  char count == number of streamed Token chunks, and concat == canned
  (truncated by stop), plus explicit cases for stop-string truncation,
  first-stop-match precedence, the model_ref dimensions=None
  invariant, and finish-reason pass-through.
- All 271 workspace tests pass; clippy clean for both default and
  mock-on feature configurations.
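
The char-count and concat invariants from the proptest above can be
approximated without the proptest crate. The sketch below swaps in a
hand-rolled xorshift generator purely to stay std-only; XorShift,
random_unicode_string, split_per_char and check_stream_invariants are
hypothetical names, and the real tests use proptest rather than this
loop.

```rust
/// Tiny xorshift generator so the sketch needs no external crates.
/// The seed must be non-zero.
struct XorShift(u64);

impl XorShift {
    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Builds a string of up to `max_len` random Unicode scalar values.
fn random_unicode_string(rng: &mut XorShift, max_len: usize) -> String {
    let len = (rng.next_u64() as usize) % (max_len + 1);
    let mut out = String::new();
    let mut pushed = 0;
    while pushed < len {
        let code = (rng.next_u64() % 0x11_0000) as u32;
        // char::from_u32 rejects surrogate code points; retry on None.
        if let Some(c) = char::from_u32(code) {
            out.push(c);
            pushed += 1;
        }
    }
    out
}

/// One token per Unicode scalar value, as the mock is described to do.
fn split_per_char(text: &str) -> Vec<String> {
    text.chars().map(|c| c.to_string()).collect()
}

/// Runs `cases` random checks and returns how many were executed.
fn check_stream_invariants(cases: usize, seed: u64) -> usize {
    let mut rng = XorShift(seed);
    let mut checked = 0;
    for _ in 0..cases {
        let canned = random_unicode_string(&mut rng, 32);
        let tokens = split_per_char(&canned);
        // char count == number of streamed tokens
        assert_eq!(tokens.len(), canned.chars().count());
        // concatenating the tokens reproduces the canned text
        assert_eq!(tokens.concat(), canned);
        checked += 1;
    }
    checked
}
```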

Symbol gating verified:
- cargo build --release -p kb-llm (default): nm shows zero
  MockLanguageModel symbols.
- cargo build --release -p kb-llm --features mock: three trait-impl
  symbols present. Spec invariant "release builds MUST NOT include
  MockLanguageModel" enforced at the symbol level.
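
A minimal sketch of the kind of gating that keeps the mock out of
default builds, assuming a Cargo.toml with an empty default feature set
and a `mock = []` feature. The module layout, the single field shown,
and the mock_compiled_in helper are hypothetical, not taken from this
commit.

```rust
// Hypothetical lib.rs excerpt: the whole module disappears from the
// build unless the crate is compiled with `--features mock`.
#[cfg(feature = "mock")]
pub mod mock {
    /// Stand-in for the real mock; only one field is shown here.
    pub struct MockLanguageModel {
        pub canned_response: String,
    }
}

/// Reports at compile time whether the mock feature was enabled.
pub fn mock_compiled_in() -> bool {
    cfg!(feature = "mock")
}
```

Because the item is removed before code generation, a default build has
nothing for nm to find, which is consistent with the symbol check
described above.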

Allowed deps respected: only kb-core (path) and anyhow (workspace,
forced by trait return type). kb-config / serde / thiserror / tracing
are on the spec's allowed list but were left out of this crate:
nothing in this skeleton references them, and omitting them keeps the
dependency graph slim for downstream consumers. p4-2/p4-3 will add
what they need at their own dep sites.

Forbidden deps (reqwest, ureq, tokio, whisper-rs, kb-source-fs,
kb-parse-md, kb-normalize, kb-chunk, kb-store-*, kb-embed*, kb-search,
kb-rag, kb-tui, kb-desktop) all absent from cargo tree -p kb-llm.

Out of scope: real adapter (p4-2 Ollama), token counting against the
real tokenizer, server-side cancellation / abort signals (P+).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Date: 2026-05-01 13:37:46 +00:00
