---
title: "v0.20.0 sub-item 1 bugfix round 3: final-dogfood findings"
created: 2026-05-27
status: DRAFT
round: 1c
parent_spec: 2026-04-27-kebab-final-form-design.md
contract_sections:
  - "1.1 (ask streaming)"
  - "2.2 (error handling)"
  - "2.4 (JSON wire schema)"
  - "3.1 (config XDG)"
  - "4.1 (capabilities schema)"
source_report: .omc/reviews/2026-05-27-v0.20-final-dogfood-report.md
---

# v0.20.0 sub-item 1 bugfix round 3: final-dogfood findings

This spec covers the fix design for the **5 bugs** found in the post-bugfix2 final dogfood (2026-05-27). PR #189 is force-updated (base=main).

Spec scope: root cause + fix decision + acceptance criteria + parent spec traceability. Bug #12 was falsified and is out of scope. All 5 fixes range from trivial to a small refactor (existing 1350 tests + 5 or more new tests).

---

## §1 Problem statement

### §1.1 Bug #9: capabilities false negative (Critical)

`kebab schema --json` reports both `capabilities.streaming_ask` and `capabilities.single_file_ingest` as hardcoded `false`. The actual implementation, however, behaves as follows:

- `kebab ask --stream` → emits `answer_event.v1` ndjson events correctly (191 events verified).
- `kebab ingest-file` → emits `ingest_report.v1` correctly for both new and updated files.
- `kebab ingest-stdin --title` → works correctly.

**Impact**: agents such as an MCP host or the Claude Code skill make routing decisions from `capabilities: { streaming_ask: false, single_file_ingest: false }` and get a false negative. Users are led to believe that features which actually work are unavailable.

### §1.2 Bug #10: config fail-fast (UX)

```bash
kebab search "rust" --config /tmp/nonexistent.toml --json
# exit=0, {"hits":[],"schema_version":"search_response.v1"}
```

When an explicit path is missing, the CLI silently falls back to the default config (XDG path). This is a debugging nightmare: a typo or wrong path surfaces only as 0 hits.

### §1.3 Bug #11: OCR timeout 600s (Critical UX)

`config.pdf.ocr.request_timeout_secs = 600` (a default of 10 minutes per page). Dogfood evidence from metro-korea.pdf:

- On page 8 and page 13, a slow response from the remote Ollama ran into the full 600s timeout.
- Result: `ms: 600000, chars: 0, skipped: true` is emitted → the page body is not indexed, and 20 minutes are wasted.

**Production impact**: the user gets no ingest-completion signal, and some pages are unsearchable.

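
The cost figure above can be reproduced with a short calculation. The following is a minimal sketch, assuming a hypothetical `OcrPageRecord` struct that mirrors the `ms` / `chars` / `skipped` fields quoted in the evidence; the real per-page record in kebab may have a different shape.

```rust
/// Hypothetical per-page OCR record mirroring the `ms` / `chars` / `skipped`
/// fields quoted in the dogfood evidence. Not the actual kebab type.
#[derive(Debug, Clone, Copy)]
struct OcrPageRecord {
    page: u32,
    ms: u64,
    chars: u64,
    skipped: bool,
}

/// Returns the page numbers that were skipped and the total milliseconds
/// spent waiting on them before the timeout fired.
fn timeout_waste(records: &[OcrPageRecord]) -> (Vec<u32>, u64) {
    let skipped: Vec<&OcrPageRecord> = records.iter().filter(|r| r.skipped).collect();
    let pages: Vec<u32> = skipped.iter().map(|r| r.page).collect();
    let wasted_ms: u64 = skipped.iter().map(|r| r.ms).sum();
    (pages, wasted_ms)
}
```

Feeding it two skipped records for pages 8 and 13 with `ms: 600000` each gives 1,200,000 ms, which is the 20 minutes reported above. With the same two pages and a 60s timeout the stall would be 2 minutes.
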
### §1.4 Bug #13: schema.models single value (UX)

```json
{
  "chunker_version": "md-heading-v1",
  "parser_version": "md-frontmatter-v2",
  ...
}
```

The corpus, however, has multiple active versions:

- parsers: `md-frontmatter-v2`, `pdf-text-v1`, `code-rust-v1`, `code-python-v1`, `none-v1`.
- chunkers: `md-heading-v1`, `pdf-page-v1.1`, `code-rust-ast-v1`, `code-python-ast-v1`, `dockerfile-file-v1`, `k8s-manifest-resource-v1`, `manifest-file-v1`, `code-text-paragraph-v1`.

**Impact**: users reading `kebab schema` cannot identify every active version, and a version cascade audit risks missing some.

### §1.5 Bug #14: empty query silent (Minor UX)

```bash
kebab search "" --json
# exit=0, {"hits":[],"next_cursor":null,"schema_version":"search_response.v1"}
```

An empty (or whitespace-only) query silently returns 0 hits. This is a user mistake, so an explicit error is the consistent response.

---

## §2 Scope + non-scope

### §2.1 Included: 5 bug fix

| Bug | Category | Severity | Fix type |
|-----|----------|----------|----------|
| #9 | wire schema | critical | capability flag hardcoded boolean → actual feature check |
| #10 | config UX | medium | silent fallback → error.v1 with config_not_found |
| #11 | OCR config | critical | default 600s → 60s timeout |
| #13 | wire schema | medium | single field → additive array fields (backward compat) |
| #14 | input validation | minor | empty query silent → error.v1 with invalid_input |

### §2.2 Out of scope

- **Bug #12 (falsified)**: `inspect doc` shows a "?" placeholder in blocks[].text for the code parser. Root cause: the content is not in `.text`; the `.code` field is emitted correctly. The user workflow can read `.code`, so this is outside the spec's scope.
- Other axes from dogfood report §12 (ranking bias, multi-root caveat) → separate phase.

---

## §3 Decisions

### §3.1 Bug #9: capabilities correction

**Decision**: update the two fields in `schema.rs::capabilities_snapshot()` to `true`.

```rust
fn capabilities_snapshot() -> Capabilities {
    Capabilities {
        json_mode: true,
        ingest_progress: true,
        ingest_cancellation: true,
        rag_multi_turn: true,
        search_cache: true,
        incremental_ingest: true,
        streaming_ask: true,       // ← WAS FALSE, actual TRUE
        http_daemon: false,        // ← preserved (not-impl, separate sub-item)
        mcp_server: true,
        single_file_ingest: true,  // ← WAS FALSE, actual TRUE
        bulk_search: true,
    }
}
```

**Rationale**: the actual implementation supports production-grade streaming ask and single-file ingest. Agent routing is only accurate when the schema report matches reality.

### §3.2 Bug #10: config_not_found error

**Decision**: `kebab-config` defines its own error type `ConfigNotFound`, and `kebab-app::error_wire` gains a classify arm.

Pseudo-code:

```rust
// crates/kebab-config/src/lib.rs (or an appropriate error module)
use std::path::{Path, PathBuf};

#[derive(Debug, thiserror::Error)]
#[error("config file does not exist: {path}")]
pub struct ConfigNotFound {
    pub path: PathBuf,
}

// Inside Config::load:
pub fn load(opt_path: Option<&Path>) -> anyhow::Result<Self> {
    match opt_path {
        // An explicit path that does not exist is an error, never a silent fallback.
        Some(p) if !p.exists() => Err(anyhow::Error::new(ConfigNotFound { path: p.to_path_buf() })),
        Some(p) => Self::from_file(p),
        // Only the no-argument case falls back to the XDG default.
        None => Self::from_xdg_default_or_defaults(),
    }
}
```

Classify arm in `kebab-app/src/error_wire.rs`:

```rust
if let Some(e) = err.downcast_ref::<kebab_config::ConfigNotFound>() {
    return ErrorV1 {
        schema_version: ERROR_V1_ID.to_string(),
        code: "config_not_found".to_string(),
        message: format!("config file does not exist: {}", e.path.display()),
        details: json!({ "path": e.path }),
        hint: Some("verify --config argument; use --config to point to an existing toml file, or omit to use XDG default".to_string()),
    };
}
```

**Exit code**: 2 (config error, not 0 silent).

### §3.3 Bug #11: OCR timeout 60s

**Decision**: lower `default_pdf_ocr_request_timeout_secs()` from 600 to 60.

```rust
fn default_pdf_ocr_request_timeout_secs() -> u64 {
    60 // 1 min, production-friendly per dogfood evidence
}
```

**Doc-comment to add**:

```rust
/// Default OCR request timeout in seconds. Most pages complete in 6-32s.
/// The 60s default is an upper bound on valid throughput; exceeding it may
/// indicate Ollama unavailability or very dense/high-res pages.
/// Override via [pdf.ocr] request_timeout_secs = N in config.toml.
```

### §3.4 Bug #13: active_parsers + active_chunkers (additive)

**Decision**: additive minor change to the wire schema. Two arrays are added to the `Models` struct and the existing single fields are kept (backward compat). `kebab-store-sqlite` provides the fetch methods.

**Store API** (crates/kebab-store-sqlite/src/lib.rs):

```rust
impl SqliteStore {
    /// Distinct, sorted parser versions recorded on `documents`.
    pub fn fetch_distinct_parser_versions(&self) -> anyhow::Result<Vec<String>> {
        let conn = self.conn()?;
        let mut stmt = conn.prepare(
            "SELECT DISTINCT parser_version FROM documents WHERE parser_version IS NOT NULL ORDER BY parser_version"
        )?;
        let rows = stmt.query_map([], |row| row.get::<_, String>(0))?;
        let mut out = Vec::new();
        for r in rows {
            out.push(r?);
        }
        Ok(out)
    }

    /// Distinct, sorted chunker versions recorded on `chunks`.
    pub fn fetch_distinct_chunker_versions(&self) -> anyhow::Result<Vec<String>> {
        let conn = self.conn()?;
        let mut stmt = conn.prepare(
            "SELECT DISTINCT chunker_version FROM chunks WHERE chunker_version IS NOT NULL ORDER BY chunker_version"
        )?;
        let rows = stmt.query_map([], |row| row.get::<_, String>(0))?;
        let mut out = Vec::new();
        for r in rows {
            out.push(r?);
        }
        Ok(out)
    }
}
```

**Models struct** (crates/kebab-app/src/schema.rs):

```rust
pub struct Models {
    /// Deprecated since v0.20.1. Use active_parsers for multi-parser corpus.
    /// Reports default parser version (markdown path).
    pub parser_version: String,
    /// Deprecated since v0.20.1. Use active_chunkers for multi-chunker corpus.
    pub chunker_version: String,
    /// All parser versions active in corpus (v0.20.1+). May be empty if corpus is empty.
    pub active_parsers: Vec<String>,
    /// All chunker versions active in corpus (v0.20.1+). May be empty if corpus is empty.
    pub active_chunkers: Vec<String>,
    pub embedding_version: String,
    pub prompt_template_version: String,
    pub index_version: String,
    pub corpus_revision: u64,
}
```

**Computation** (crates/kebab-app/src/schema.rs::collect_models):

```rust
let store = open_store_for_stats(cfg)?;
let active_parsers = store.fetch_distinct_parser_versions().unwrap_or_default();
let active_chunkers = store.fetch_distinct_chunker_versions().unwrap_or_default();
Ok(Models {
    // The single fields keep reporting the markdown defaults (backward compat);
    // taking the first sorted entry would change their value on a mixed corpus.
    parser_version: kebab_parse_md::PARSER_VERSION.to_string(),
    chunker_version: kebab_chunk::md_heading_v1::VERSION_LABEL.to_string(),
    active_parsers,
    active_chunkers,
    ...
})
```

**Fallback**: the markdown fallback is kept. The existing hardcoded `parser_version` + `chunker_version` values are preserved (backward compat).

### §3.5 Bug #14: empty query validation

**Decision**: add an empty-query check that emits error.v1 to both the `search` and `ask` commands.

**Search command** (crates/kebab-cli/src/main.rs::search arm):

```rust
if let Some(q) = query.as_ref() {
    if q.trim().is_empty() {
        return Err(anyhow::Error::new(kebab_app::StructuredError(ErrorV1 {
            schema_version: ERROR_V1_ID.to_string(),
            code: "invalid_input".to_string(),
            message: "query is empty; provide a non-empty search term or use --bulk".into(),
            details: Value::Null,
            hint: Some("e.g. `kebab search 'rust async'` or `kebab search --bulk < queries.ndjson`".into()),
        })));
    }
}
```

**Ask command** (crates/kebab-cli/src/main.rs::ask arm):

```rust
if query.trim().is_empty() {
    return Err(anyhow::Error::new(kebab_app::StructuredError(ErrorV1 {
        schema_version: ERROR_V1_ID.to_string(),
        code: "invalid_input".to_string(),
        message: "query is empty; provide a non-empty prompt".into(),
        details: Value::Null,
        hint: Some("e.g. `kebab ask 'explain this code'`".into()),
    })));
}
```

Both commands now validate; no silent fallback.

---

## §4 Implementation specification

### §4.1 Files to modify

1.
   **Bug #9 capability fix**: `crates/kebab-app/src/schema.rs`
   - lines 137–151 (`capabilities_snapshot()`): flip `streaming_ask: false` → `true`, `single_file_ingest: false` → `true`.
   - add test: `capabilities_streaming_ask_matches_cli_surface()`.
   - add test: `capabilities_single_file_ingest_matches_cli_surface()`.

2. **Bug #10 config_not_found**: Two files
   - `crates/kebab-config/src/lib.rs`:
     - Define `ConfigNotFound` error struct (with `#[derive(Debug, thiserror::Error)]`).
     - Modify `Config::load(opt_path: Option<&Path>)`: path existence check, `return Err(anyhow::Error::new(ConfigNotFound { ... }))`.
     - add test: `config_load_explicit_nonexistent_path_returns_error()`.
   - `crates/kebab-app/src/error_wire.rs`:
     - Add classify arm after existing `ConfigInvalid` case.
     - Map `kebab_config::ConfigNotFound` → `ErrorV1 { code: "config_not_found", ... }`.

3. **Bug #13 schema.models**: Three components
   - `crates/kebab-store-sqlite/src/lib.rs`:
     - Implement `fetch_distinct_parser_versions()`: SQL SELECT DISTINCT on documents.parser_version + ORDER BY.
     - Implement `fetch_distinct_chunker_versions()`: SQL SELECT DISTINCT on chunks.chunker_version + ORDER BY.
   - `crates/kebab-app/src/schema.rs`:
     - Modify `Models` struct: add `active_parsers: Vec<String>`, `active_chunkers: Vec<String>` fields.
     - Modify computation logic (`collect_models` or equiv): call store methods, populate arrays, fallback to markdown defaults for single fields.
     - add test: `schema_models_active_arrays_empty_on_empty_corpus()`.
     - add test: `schema_models_active_arrays_populated_after_mixed_ingest()`.
   - `docs/wire-schema/v1/schema.schema.json`:
     - `Models` object: add `"active_parsers": { "type": "array", "items": { "type": "string" } }`.
     - add `"active_chunkers": { "type": "array", "items": { "type": "string" } }`.
     - Mark deprecated in comment: `parser_version` + `chunker_version` (additive, backward compat).

4.
   **Bug #14 empty query validation**: `crates/kebab-cli/src/main.rs`
   - search command arm: add `if query.trim().is_empty()` check → error.v1 code=invalid_input.
   - ask command arm: add identical `if query.trim().is_empty()` check → error.v1 code=invalid_input.

5. **Wire schema v1 doc update**: `docs/wire-schema/v1/`
   - Update schema doc to note `active_parsers` / `active_chunkers` optional (additive).

6. **Integration**: `integrations/claude-code/kebab/SKILL.md`
   - Update `schema.models` surface docs: reference new `active_*` arrays for multi-version corpora.

7. **Tests** (new or extended):
   - `crates/kebab-cli/tests/`: invalid --config path (absolute + relative) → error.v1 + exit≠0.
   - `crates/kebab-cli/tests/`: empty query (search + ask) → error.v1 code=invalid_input + exit≠0.
   - `crates/kebab-config/tests/`: config file not found → ConfigNotFound error.
   - `crates/kebab-app/tests/`: mixed corpus schema; active_parsers/chunkers include all ingested versions.

### §4.2 Regression checks

- Existing 1350 workspace tests: `cargo test --workspace --no-fail-fast -j 1` must pass green.
- All non-bug capabilities (json_mode, ingest_progress, ingest_cancellation, rag_multi_turn, search_cache, incremental_ingest, mcp_server, bulk_search) stay true.
- Default config path resolution (no --config) unchanged: silent fallback to XDG only if `--config` not passed.
- Relative path behavior (cwd-relative, `std::path::Path::exists()`) preserved.
- Empty corpus → empty `active_parsers` / `active_chunkers` array (not null, not error).
- Existing hardcoded `parser_version` + `chunker_version` fields continue to report markdown defaults (backward compat).
- Schema version bump not required (wire schema additive minor, backward compat).

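
The two input checks guarded by these regression items can be sketched with the standard library alone. This is a minimal illustration, not the kebab implementation: `InputError`, `ConfigSource`, `resolve_config`, and `validate_query` are hypothetical names, and the real code returns `anyhow` errors classified into `error.v1` as described in §3.2 and §3.5.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical stand-in for the two error codes named in this spec
/// (`config_not_found`, `invalid_input`).
#[derive(Debug, PartialEq)]
enum InputError {
    ConfigNotFound(PathBuf),
    InvalidInput(&'static str),
}

/// Hypothetical marker for where the config would be read from.
#[derive(Debug, PartialEq)]
enum ConfigSource {
    Explicit(PathBuf),
    XdgDefault,
}

/// An explicit path must exist; only the no-argument case falls back to XDG.
/// Relative paths are checked against the current working directory,
/// which is how `Path::exists()` resolves them.
fn resolve_config(opt_path: Option<&Path>) -> Result<ConfigSource, InputError> {
    match opt_path {
        Some(p) if !p.exists() => Err(InputError::ConfigNotFound(p.to_path_buf())),
        Some(p) => Ok(ConfigSource::Explicit(p.to_path_buf())),
        None => Ok(ConfigSource::XdgDefault),
    }
}

/// Rejects empty and whitespace-only queries; returns the trimmed query otherwise.
fn validate_query(query: &str) -> Result<&str, InputError> {
    let trimmed = query.trim();
    if trimmed.is_empty() {
        Err(InputError::InvalidInput("query is empty"))
    } else {
        Ok(trimmed)
    }
}
```

Under this sketch, `resolve_config(None)` still yields the XDG default, while a missing explicit path and the queries `""` and `" "` all produce an error instead of an empty result.
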
---

## §5 Acceptance criteria

| # | Criterion | Evidence |
|----|-----------|----------|
| AC-1 | `kebab schema --json` emits `streaming_ask: true` + `single_file_ingest: true` | `cargo test -p kebab-app capabilities_ -j 4` green |
| AC-2 | `kebab search "x" --config /nonexistent.toml --json` emits exit≠0 + error.v1 code=config_not_found | `cargo test -p kebab-config config_load_explicit_nonexistent_path_returns_error -j 4` green |
| AC-3 | `cargo test -p kebab-config pdf_ocr_request_timeout_default_is_60s -j 4` → green | unit test confirms default = 60s (no manual timing) |
| AC-4 | After mixed ingest (MD + PDF + code), `kebab schema --json` emits both `active_parsers` + `active_chunkers` arrays containing all versions | integration test pass |
| AC-5 | `kebab search "" --json` and `kebab search " " --json` both emit exit≠0 + error.v1 code=invalid_input | integration test pass |
| AC-6 | `kebab ask "" --json` emits exit≠0 + error.v1 code=invalid_input (ask symmetry) | integration test pass |
| AC-7 | `kebab search "rust" --config nonexistent-relative.toml --json` (relative path) emits exit≠0 + error.v1 code=config_not_found | integration test pass |
| AC-8 | All 1350+ workspace tests pass; no new failures | `cargo test --workspace --no-fail-fast -j 1` exit=0 |
| AC-9 | Wire schema backward compat: old clients reading `parser_version` + `chunker_version` still work; `active_*` arrays optional per schema | JSON schema `additionalProperties: false` review |
| AC-10 | `kebab ask --stream` still works; streaming events emitted (no regression) | manual `kebab ask --stream "explain this" 2>&1 \| head -3` |

---

## §6 Risks + resolutions

### Risks

- **R-1** (Bug #10): Relative path `./config.toml` must resolve from cwd, not from binary location. **Resolution**: Rust `std::path::Path::exists()` is cwd-relative; no workaround needed.
- **R-2** (Bug #13): Empty corpus → empty `active_parsers` / `active_chunkers` array.
  **Resolution**: Unit test `schema_models_active_arrays_empty_on_empty_corpus()` mandated (AC-4).
- **R-3** (resolved): `collect_models` uses no cache (every-call re-computation). `active_parsers/chunkers` reflect corpus state at invocation time. If future caching is added, `corpus_revision` increment signals invalidation; document at that time.
- **R-4** (Bug #14): `ask` command validation is covered by the same fix (§3.5 mandates both search + ask).
- **R-5** (Bug #11): 60s may still time out on very dense/high-res pages. **Mitigation**: User can override via `config.toml [pdf.ocr] request_timeout_secs = N`. Release notes explicitly call this out.

---

## §7 Parent spec deviation (HOTFIXES handoff)

**F-11 MEDIUM finding**: parent spec `2026-04-27-kebab-final-form-design.md` (frozen) specifies PDF OCR request_timeout_secs = 600s (§1000 + §1628 OQ-1, rationale: "5x headroom over the 105s seen in a CPU environment"). Bug #11 (dogfood evidence) contradicts this: 600s causes long stalls on timeout, and 60s is the production-friendly value.

**Deviation handling**:

1. Parent spec stays frozen (no edits).
2. **HOTFIXES entry (executor Step N)**: `tasks/HOTFIXES.md` receives dated entry:

   ```markdown
   2026-05-27 — PDF OCR request_timeout_secs default 600s → 60s (v0.20.0 bugfix3 dogfood evidence). Bug #11.
   ```

3. **Parent spec cross-link (executor Step N)**: parent spec `2026-04-27-kebab-final-form-design.md` receives inline comment at §1000 (default value code block) or §1628 (OQ-1 paragraph):

   ```markdown
   ```

**Parent spec invariant**: No changes to parent spec text; only cross-link comment + HOTFIXES.md entry. Frozen design contract preserved.

---

## §8 References

- [Dogfood report](../../../.omc/reviews/2026-05-27-v0.20-final-dogfood-report.md): 5 bugs discovered + decisions.
- [Parent spec (frozen contract)](2026-04-27-kebab-final-form-design.md): §1, §2, §4 (capabilities, error handling, JSON schema, config XDG).
- `crates/kebab-app/src/schema.rs:137–151` (capabilities_snapshot).
- `crates/kebab-config/src/lib.rs` (Config::load, default_pdf_ocr_request_timeout_secs).
- `crates/kebab-app/src/error_wire.rs` (classify ConfigNotFound).
- `crates/kebab-store-sqlite/src/lib.rs` (fetch_distinct_parser_versions, fetch_distinct_chunker_versions).
- `crates/kebab-cli/src/main.rs` (search + ask query validation).
- `docs/wire-schema/v1/schema.schema.json` (Models + Capabilities objects).
- `tasks/HOTFIXES.md` (2026-05-27 entry, Bug #11 deviation record).