kebab

Author	SHA1	Message	Date
altair823	e35b06d0d0	feat(p4-3): kb-rag crate, full RAG pipeline + kb-app::ask wired

P4 terminal task. Implements the user-facing payoff: retrieve → score gate → pack → render → generate → cite-validate → persist. After this commit, `kb ask` actually works against an Ollama backend; the pipeline grounds the answer in retrieved chunks and refuses cleanly when the gate trips or the model self-judges.

New crate kb-rag:
- pub struct RagPipeline { retriever, llm, docs, config }: all Arc<dyn Trait + Send + Sync>, so the pipeline itself is Send + Sync.
- pub fn ask(query, opts) -> Result<Answer> drives the nine-stage flow per spec §1.
- pub struct AskOpts { k, explain, mode, temperature, seed, stream_sink: Option<mpsc::Sender<String>> }. k acts as a floor over config.search.default_k so a low-k caller can't starve retrieval (documented in field doc).

Pipeline stages:
1. Retrieve via the injected dyn Retriever.
2. Score gate: empty hits → NoChunks refusal (no LLM call); top-1 < config.rag.score_gate → ScoreGate refusal (no LLM call) with top-3 candidates listed in the synthesized answer text.
3. Pack: budget = config.rag.max_context_tokens.saturating_sub(prompt overhead). Per-hit `[#n] doc=… heading=… span=…\n<text>` with deterministic enumeration. If every hit's chunk is unfetchable from the store (deleted between search and pack), fall back to NoChunks refusal with a tracing::warn rather than feeding an empty [근거] (evidence) block to the LLM.
4. Render rag-v1 prompt with the spec's verbatim Korean system string + `[질문]/[근거]` (question/evidence) user template.
5. Generate via dyn LanguageModel. Single-thread token loop owns the iterator; tokens optionally forward to opts.stream_sink (a `mpsc::Sender<String>`). SendError is silently dropped, so caller cancellation never panics the pipeline. After Done the loop reads (acc, finish_reason, usage) in lockstep with no race.
   max_completion = llm.context_tokens().saturating_sub(used_for_input).max(64), explicitly NOT capped by config.rag.max_context_tokens (that's the packing budget for [근거], not the LM completion ceiling).
6. Citation extract via STRICT regex `\[#(\d{1,3})\]` (compiled once via OnceLock). Loose forms `[1]`, `[ #1 ]`, `[#foo]`, `[#1234]`, `vec![1]` are all rejected to prevent prose false-positives.
7. Citation validate covers five cases:
   - unknown marker (e.g. `[#7]` when only 3 packed) → LlmSelfJudge refusal.
   - empty answer with hits → LlmSelfJudge.
   - non-empty + no marker + matches `근거 (가|이) 부족` ("evidence is insufficient") regex → LlmSelfJudge (model self-refused with the canonical phrase; phrase match logged via tracing::debug for observability).
   - non-empty + no marker + no refusal phrase → LlmSelfJudge (silent ungrounded answers are still refusals).
   - non-empty + ≥1 valid marker → grounded = true.
8. Build Answer per kb_core::Answer shape:
   - citations: filter packed list to exactly the markers cited. Wire format `marker: Some("[1]")` (square-bracketed bare index) per design §2.3, distinct from the prompt-side `[#n]` grammar.
   - embedding ModelRef: read from config.models.embedding for Vector/Hybrid; None for Lexical. Documented deviation since the Retriever trait doesn't expose the embedder. For ScoreGate/NoChunks refusals on Vector/Hybrid the embedding model is still recorded, because the vector retriever WAS consulted even when the gate tripped.
   - TraceId minted as `ret_<8-hex>` from blake3(query, top_score, model_id, ns).
   - retrieval AnswerRetrievalSummary populated.
   - usage from the final Done chunk; latency_ms wall-clock fallback when the LLM reports zero.
   - created_at OffsetDateTime::now_utc().
9. Persist via SqliteStore::put_answer (new inherent method on SqliteStore, not on the DocumentStore trait: answers aren't documents and adding to kb-core was forbidden). Always inserts, refusals included. packed_chunks_json is null unless opts.explain == true.
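
Stages 6 and 7 above can be sketched as follows. This is a minimal, std-only illustration, not the crate's code: the commit uses the `regex` crate with `\[#(\d{1,3})\]`, whereas this sketch swaps in a hand-rolled scanner for the same grammar (ASCII digits only). The names `extract_markers`, `validate`, and `Verdict` are hypothetical, and two details are assumptions: markers are treated as 1-based, and any out-of-range marker refuses the whole answer even when other markers are valid.

```rust
/// Outcome of citation validation (illustrative type, not kb-rag's API).
#[derive(Debug, PartialEq)]
enum Verdict {
    Grounded(Vec<u32>),
    Refused(&'static str),
}

/// Hand-rolled stand-in for the strict pattern `\[#(\d{1,3})\]`.
/// Returns the cited indices in order of appearance.
fn extract_markers(text: &str) -> Vec<u32> {
    let b = text.as_bytes();
    let mut out = Vec::new();
    let mut i = 0;
    while i + 1 < b.len() {
        if b[i] == b'[' && b[i + 1] == b'#' {
            let start = i + 2;
            let mut end = start;
            // Consume at most three ASCII digits.
            while end < b.len() && end - start < 3 && b[end].is_ascii_digit() {
                end += 1;
            }
            // Require at least one digit and an immediate closing bracket.
            if end > start && end < b.len() && b[end] == b']' {
                // Cannot fail: the slice holds 1 to 3 ASCII digits.
                out.push(text[start..end].parse().unwrap());
                i = end + 1;
                continue;
            }
        }
        i += 1;
    }
    out
}

/// Validates an answer against `packed`, the number of chunks in the prompt.
/// Assumption: markers are 1-based, so `[#0]` counts as unknown.
fn validate(answer: &str, packed: u32) -> Verdict {
    if answer.trim().is_empty() {
        return Verdict::Refused("empty answer");
    }
    let markers = extract_markers(answer);
    if markers.is_empty() {
        // Covers both the canonical refusal phrase and silent ungrounded prose.
        return Verdict::Refused("no citation marker");
    }
    if markers.iter().any(|&m| m == 0 || m > packed) {
        return Verdict::Refused("unknown marker");
    }
    Verdict::Grounded(markers)
}
```

With three packed chunks, `validate("see [#2]", 3)` is grounded, while `validate("see [#7]", 3)` and `validate("see [1]", 3)` are both refusals, mirroring the strictness described in stage 6.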

kb-store-sqlite extension:
- pub fn put_answer(&Answer, query, packed_chunks_json) -> Result<AnswerId>. Maps all 22 fields of the answers table per V001 schema in a single INSERT under a transaction.

kb-app::ask wired:
- bail!("not yet wired (P4-3)") replaced with a real body that builds the retriever per opts.mode (Lexical | Vector | Hybrid), instantiates OllamaLanguageModel from config, constructs RagPipeline, calls pipeline.ask. AskOpts moves to kb-rag and is re-exported via `pub use kb_rag::AskOpts` so kb-cli's `use kb_app::AskOpts` keeps working.
- kb-app/Cargo.toml gains kb-rag, kb-llm, kb-llm-local. P3-5's forbids on these are lifted by P4-3 spec: kb-app is the orchestrator and ask requires both the trait crate and the Ollama adapter.
- kb-cli/main.rs's AskOpts literal updated with stream_sink: None for the CLI path (TUI in P9 will plumb a real sink).

Tests (kb-rag: 18; kb-app: 1 ignored):
- 3 unit in src/pipeline.rs: marker regex strictness (rejects all loose forms with byte-equal expectations), Send+Sync compile check, embedding_ref_for behavior across modes.
- 15 integration in tests/pipeline.rs covering every spec test row + the new "all chunks unfetchable falls back to NoChunks" guard: empty-hits, score-gate, grounded happy path, unknown-marker, prose-`[1]` rejection, `vec![1]` rejection, refusal-phrase, packing-budget overflow, streaming-forwards-to-mpsc, dropped-receiver-no-panic, usage-from-final-Done, answers-row-inserted-for-each-refusal-kind, determinism temp=0 seed=0, Answer JSON shape, unfetchable-chunks-fall-back-to-no-chunks (the new M3 test).
- kb-app/tests/ask_smoke.rs: 1 #[ignore]'d real-Ollama smoke that drives the wired ask end-to-end against `localhost:11434`.

Workspace: 319 passed / 26 ignored / 0 failed. cargo clippy --workspace --all-targets -- -D warnings clean.

Allowed deps respected (kb-core, kb-config, kb-search, kb-llm, kb-store-sqlite, serde, serde_json, regex, time, tracing, thiserror) plus forced waivers anyhow (Retriever / LanguageModel trait return types) and blake3 (TraceId minting). Forbidden (kb-source-fs, kb-parse-md, kb-normalize, kb-chunk, kb-store-vector direct, kb-embed* direct, kb-llm-local direct, kb-tui, kb-desktop) all absent from `cargo tree -p kb-rag`: concrete adapters reach the pipeline only through trait objects.

Out of scope: reranker between retrieve and pack (P+), multi-turn chat memory (P+), LLM-as-judge eval (P5 uses rule-based must_contain), --json streaming (buffers per §0 Q5 hybrid).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-01 15:06:10 +00:00
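
The score gate (stage 2) and the two token budgets (stages 3 and 5) described in the commit above reduce to a few lines of arithmetic. The sketch below is an illustration under stated assumptions, not the kb-rag source: the `Gate` enum, the function names, and the `f32` score type are hypothetical, and scores are assumed to arrive sorted best-first.

```rust
/// Result of the pre-LLM gate (illustrative type, not kb-rag's API).
#[derive(Debug, PartialEq)]
enum Gate {
    /// Retrieval returned nothing: refuse without calling the LLM.
    NoChunks,
    /// Best hit scored below the threshold: refuse and list up to three candidates.
    ScoreGate { top_candidates: Vec<f32> },
    /// Proceed to packing.
    Pass,
}

/// `scores` is assumed to be sorted best-first.
fn score_gate(scores: &[f32], threshold: f32) -> Gate {
    match scores.first() {
        None => Gate::NoChunks,
        Some(&top) if top < threshold => Gate::ScoreGate {
            top_candidates: scores.iter().take(3).copied().collect(),
        },
        Some(_) => Gate::Pass,
    }
}

/// Packing budget for the evidence block: context allowance minus prompt overhead.
fn pack_budget(max_context_tokens: usize, prompt_overhead: usize) -> usize {
    max_context_tokens.saturating_sub(prompt_overhead)
}

/// Completion ceiling: whatever the model window has left, but never below 64.
fn max_completion(context_tokens: usize, used_for_input: usize) -> usize {
    context_tokens.saturating_sub(used_for_input).max(64)
}
```

`saturating_sub` keeps both budgets at zero instead of panicking on underflow when the overhead exceeds the window, and `.max(64)` is what keeps a nearly full context from yielding a zero-token completion.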
altair823	a08f61a242	fix(cli): honor --config flag + improve search output legibility

Two issues surfaced during the post-P3-5 manual smoke test against a six-document workspace:

1. --config flag was silently ignored. kb-cli read cli.config only while building SourceScope inside the Ingest arm, then called kb_app::ingest(scope, summary_only), which internally re-loads Config::load(None), falling back to ~/.config/kb/config.toml regardless of what the user passed. Same pattern in search, list, inspect, doctor. Users had to rely on KB_* env vars to point at a non-default config.
2. Search output collapsed RRF hybrid scores to "0.02" because `{:.2}` rounded the (0, 0.033]-bounded fused score to two decimals, and chunks from the same document showed up as identical lines ("3. 0.02 arch/rag-architecture.md") since heading_path was never printed.

Fix:
- kb-app: doctor/ingest/search/list/inspect already had _with_config(Config, ...) seams introduced for integration tests (#[doc(hidden)] pub). Repurpose them as the official "config-explicit" API: kb-cli now builds the Config once via Config::load(cli.config.as_deref()) at the top of every subcommand and threads it into the _with_config variant. Module doc-comment updated to reflect three callers (CLI --config, integration tests, TUI session) instead of "test-only seam".
- kb-app: doctor() rewritten as doctor_with_config_path(Option<&Path>) that respects an explicit path. config_loaded probe now reports the actual path checked, returning a clear hard error if --config points at a non-existent or malformed file (defaults would silently mask user intent). data_dir_writable resolves storage.data_dir from the loaded config (with env overrides applied via Config::apply_env) so --config users see their custom paths reflected. Original doctor() signature kept as a None-passing wrapper.
- kb-cli: ingest/search/list/inspect/doctor each call the _with_config companion.
  Search printer switches to {:.4} score formatting (RRF hybrid range bounded by ~2/k_rrf ≈ 0.033 at k_rrf=60 default) and appends `> head1 / head2` when heading_path is non-empty so chunks from the same document are visually distinguishable.

Verified manually:
- `kb --config /tmp/kb-smoke/config.toml doctor` reports the custom config path + custom data_dir, not the XDG defaults.
- `kb --config /tmp/kb-smoke/config.toml search "..." --mode hybrid` returns hits with distinct 4-digit scores and heading paths ("rust/ownership.md > Rust 소유권 모델 / Borrow checker"; the first heading means "Rust ownership model").

Workspace 269 passed / 24 ignored / 0 failed; cargo clippy --workspace --all-targets -- -D warnings clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-01 12:46:37 +00:00
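
To see why two decimals were not enough, the sketch below computes a few fused scores and formats one result line. It assumes the common reciprocal rank fusion formula, the sum of 1 / (k_rrf + rank) with 1-based ranks, which is consistent with the ~2/k_rrf bound quoted in the commit but is not confirmed by it. The `rrf` and `format_hit` names and the exact line layout are hypothetical.

```rust
/// Fused score for one chunk, given its 1-based rank in each retriever that returned it.
/// Assumed formula: sum over retrievers of 1 / (k_rrf + rank).
fn rrf(ranks: &[usize], k_rrf: f64) -> f64 {
    ranks.iter().map(|&r| 1.0 / (k_rrf + r as f64)).sum()
}

/// One search result line: position, four-decimal score, path, and the heading
/// path joined with " / " when it is non-empty.
fn format_hit(position: usize, score: f64, path: &str, heading_path: &[&str]) -> String {
    let mut line = format!("{position}. {score:.4} {path}");
    if !heading_path.is_empty() {
        line.push_str(" > ");
        line.push_str(&heading_path.join(" / "));
    }
    line
}
```

At k_rrf = 60, a chunk ranked first by a single retriever scores 1/61 and one ranked second scores 1/62; both print as `0.02` under `{:.2}` but as `0.0164` and `0.0161` under `{:.4}`, which is the distinction the fix restores.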
altair823	d91b60325e	p0-1: address review (apply_env full schema map, drop dead Option in logging::init)

- kb-config::apply_env now covers every leaf key in `Config` via an explicit grep-friendly match block (one arm per leaf), keyed `KB_<SECTION>_<KEY>`. Booleans flow through a shared `parse_bool` helper. Numeric leaves silently keep their prior value on parse failure so a malformed env entry can't crash startup.
- New tests: env_unknown_key_is_ignored, env_overrides_chunking_target_tokens, env_overrides_models_llm_endpoint_and_temperature, env_overrides_indexing_watch_filesystem_bool.
- kb-app::logging::init now returns `Result<WorkerGuard>` instead of `Result<Option<WorkerGuard>>`: the inner `Option` was always `Some`, so the wrapper was dead. kb-cli/main.rs collapses the call from `.ok().flatten()` to `.ok()`, preserving fail-soft semantics on logging init.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 08:53:59 +00:00
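
The override scheme in the commit above (one match arm per leaf, a shared boolean parser, numeric values that survive a bad entry) might look roughly like this. Everything here is a sketch under assumptions: the flattened `Config` struct, the four variable names, the accepted boolean spellings, and keeping the prior value on a malformed boolean are all illustrative, and the function takes an iterator of pairs rather than reading the process environment so it can be exercised without side effects.

```rust
/// Flattened stand-in for the real nested `Config` (hypothetical fields).
#[derive(Debug, Clone, PartialEq)]
struct Config {
    chunking_target_tokens: usize,
    indexing_watch_filesystem: bool,
    models_llm_endpoint: String,
    models_llm_temperature: f32,
}

/// Shared boolean parser; the accepted spellings are an assumption.
fn parse_bool(raw: &str) -> Option<bool> {
    match raw.trim().to_ascii_lowercase().as_str() {
        "1" | "true" | "yes" | "on" => Some(true),
        "0" | "false" | "no" | "off" => Some(false),
        _ => None,
    }
}

/// Applies `KB_<SECTION>_<KEY>` overrides, one explicit arm per leaf.
fn apply_env<I>(cfg: &mut Config, vars: I)
where
    I: IntoIterator<Item = (String, String)>,
{
    for (key, value) in vars {
        match key.as_str() {
            "KB_CHUNKING_TARGET_TOKENS" => {
                // Malformed numbers keep the prior value instead of failing startup.
                if let Ok(n) = value.parse::<usize>() {
                    cfg.chunking_target_tokens = n;
                }
            }
            "KB_INDEXING_WATCH_FILESYSTEM" => {
                if let Some(b) = parse_bool(&value) {
                    cfg.indexing_watch_filesystem = b;
                }
            }
            "KB_MODELS_LLM_ENDPOINT" => cfg.models_llm_endpoint = value,
            "KB_MODELS_LLM_TEMPERATURE" => {
                if let Ok(t) = value.parse::<f32>() {
                    cfg.models_llm_temperature = t;
                }
            }
            // Unknown keys are ignored.
            _ => {}
        }
    }
}
```

In a real binary the call would be `apply_env(&mut cfg, std::env::vars())`. Spelling out each key as a string literal is what makes the block grep-friendly: searching for a variable name lands on the arm that consumes it.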
altair823	5af07c174d	p0-1: address quality review (wire convention, IngestItemKind re-export, clippy)

Three follow-ups from the code-quality review pass on P0-1:

- Re-export `IngestItemKind` from `kb-core` so downstream tasks constructing `IngestItem` don't need `kb_core::ingest::IngestItemKind`.
- Document the `--json` wire-schema convention by introducing `kb-cli/src/wire.rs` with `wire_*` helpers paralleling the existing inline `wire_ingest`. Each Ok-path `--json` branch now routes through these helpers so future P1-5/P3/P4/P5 implementations slot the `schema_version` envelope in automatically. `DoctorReport` keeps its struct-field `schema_version` (the documented exception), and the helper round-trips it idempotently. Records the convention in `kb-app/src/lib.rs`'s top docstring.
- Fix clippy `single_char_add_str` in `kb_core::normalize` (replace `out.push_str(".")` with `out.push('.')`).

Verified: `cargo check`, `cargo test` (5 new wire-helper tests), `cargo clippy -D warnings`, and `RUSTFLAGS=-D warnings cargo build` all clean. Smoke-tested `kb doctor --json` still emits `{"schema_version":"doctor.v1",...}`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 05:33:31 +00:00
altair823	ec8a4ddb1b	p0-1: kb-cli clap entry with §10 exit-code mapping

Adds the kb binary with clap v4 derive subcommands mapping 1:1 to kb-app facade functions:

  init | ingest | list docs | inspect (doc|chunk) | search | ask | doctor | eval run

Global flags: --config, --verbose, --debug, --json. On --json, output conforms to wire schema v1 (e.g. doctor.v1 emitted by kb-app::doctor).

Exit-code mapping per design §10:

  0  success
  1  RefusalSignal / NoHitSignal (kb ask refusal, kb search no-hit)
  2  any other anyhow::Error
  3  DoctorUnhealthy

Tracing initialized at startup with the file appender from kb-app.

Verified via:

  XDG_*=… cargo run -p kb-cli -- init → idempotent
  XDG_*=… cargo run -p kb-cli -- doctor --json → {"schema_version":"doctor.v1","ok":true,…} exit 0
  XDG_*=… cargo run -p kb-cli -- doctor (human form, ✓ marks)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 05:17:18 +00:00
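
The exit-code table in the commit above amounts to a type-based dispatch on the error that reaches `main`. The sketch below shows one way to express it with the standard library only. It is an assumption-laden illustration: the commit uses `anyhow::Error`, whereas this version substitutes `Box<dyn Error>` to stay dependency-free, and the three signal structs are empty stand-ins that borrow the names from the commit message without reproducing their real definitions.

```rust
use std::error::Error;
use std::fmt;

/// Declares a unit-struct error type with a fixed message (stand-ins only).
macro_rules! signal {
    ($name:ident, $msg:literal) => {
        #[derive(Debug)]
        struct $name;
        impl fmt::Display for $name {
            fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
                f.write_str($msg)
            }
        }
        impl Error for $name {}
    };
}

signal!(RefusalSignal, "answer refused");
signal!(NoHitSignal, "search returned no hits");
signal!(DoctorUnhealthy, "doctor found a failing check");

/// Maps a command result to the process exit code from the table above.
fn exit_code(result: &Result<(), Box<dyn Error>>) -> i32 {
    match result {
        Ok(()) => 0,
        // Expected "nothing to show" outcomes share code 1.
        Err(e) if e.is::<RefusalSignal>() || e.is::<NoHitSignal>() => 1,
        Err(e) if e.is::<DoctorUnhealthy>() => 3,
        // Any other error is a generic failure.
        Err(_) => 2,
    }
}
```

A `main` function would typically finish with `std::process::exit(exit_code(&result))`. Keeping refusals and no-hit results on a separate code from generic failures lets shell scripts tell "the tool worked but had nothing to say" apart from "the tool broke".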

5 Commits