Deliverables of the 3-round dogfood-driven fix cycle:
- bugfix1 (Bug #2/#3/#4): spec 964 lines + plan 848 lines
- bugfix2 (Bug #6/#7, #8 falsified): spec 308 lines + plan 388 lines
- bugfix3 (Bug #9/#10/#11/#13/#14, #12 falsified): spec 410 lines + plan 1043 lines
- docs/DOGFOOD.md: the full all-around dogfood checklist (§0 environment through §13 reference corpus)

Each round's spec/plan was frozen after critic + verifier round 2 closure ACCEPT. Based on dogfood-driven evidence.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---
title: "v0.20.0 sub-item 1 bugfix round 3: final-dogfood findings"
created: 2026-05-27
status: DRAFT
round: 1c
parent_spec: 2026-04-27-kebab-final-form-design.md
contract_sections:
  - "1.1 (ask streaming)"
  - "2.2 (error handling)"
  - "2.4 (JSON wire schema)"
  - "3.1 (config XDG)"
  - "4.1 (capabilities schema)"
source_report: .omc/reviews/2026-05-27-v0.20-final-dogfood-report.md
---

# v0.20.0 sub-item 1 bugfix round 3: final-dogfood findings

Fix design for the **5 bugs** found during the post-bugfix2 final dogfood (2026-05-27). PR #189 force-update (base=main). Spec scope: root cause + fix decision + acceptance criteria + parent spec traceability. Bug #12 was falsified (out of scope). All 5 fixes range from trivial changes to small refactors (existing 1350 tests + 5 or more added tests).

---

## §1 Problem statement

### §1.1 Bug #9: capabilities false negative (Critical)

`capabilities.streaming_ask` and `capabilities.single_file_ingest` in the `kebab schema --json` output are both hardcoded to `false`. The actual implementation says otherwise:
- `kebab ask --stream` → emits `answer_event.v1` ndjson events correctly (191 events verified).
- `kebab ingest-file <path>` → `ingest_report.v1` create/update works correctly.
- `kebab ingest-stdin --title <T>` → works correctly.

**Impact**: agents such as an MCP host or the Claude Code skill read `capabilities: { streaming_ask: false, single_file_ingest: false }` and hit a false negative when making routing decisions. Users are led to believe that working features are unavailable.

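
To make the false negative concrete, here is a minimal sketch of how a cautious agent might route an `ask` request purely from the advertised flags. The `Capabilities` subset, `AskRoute`, and `route_ask` are hypothetical names introduced for illustration; they are not taken from the kebab codebase or from any real MCP host.

```rust
/// Hypothetical client-side view of two flags from `kebab schema --json`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Capabilities {
    pub streaming_ask: bool,
    pub single_file_ingest: bool,
}

/// Hypothetical routing outcome for an `ask` request.
#[derive(Debug, PartialEq, Eq)]
pub enum AskRoute {
    /// Use `kebab ask --stream` and consume ndjson events.
    Streaming,
    /// Fall back to a blocking `kebab ask` call.
    Blocking,
}

/// Picks a route from the advertised flag only, without probing the CLI.
pub fn route_ask(caps: Capabilities) -> AskRoute {
    if caps.streaming_ask {
        AskRoute::Streaming
    } else {
        AskRoute::Blocking
    }
}
```

With the hardcoded `false`, such an agent would always take the blocking route even though streaming works; once the flag reports `true`, the same logic selects streaming.
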
### §1.2 Bug #10: config fail-fast (UX)

```bash
kebab search "rust" --config /tmp/nonexistent.toml --json
# exit=0, {"hits":[],"schema_version":"search_response.v1"}
```

When an explicit path is missing, the CLI silently falls back to the default config (XDG path). This is a debugging nightmare: a typo or a wrong path surfaces only as 0 hits.

### §1.3 Bug #11: OCR timeout 600s (Critical UX)

`config.pdf.ocr.request_timeout_secs = 600` (default of 10 minutes per page). Evidence from the metro-korea.pdf dogfood:
- On page 8 and page 13, slow responses from the remote Ollama ran into the full 600s timeout.
- Result: `ms: 600000, chars: 0, skipped: true` was emitted → body text not indexed + 20 minutes of wasted cost.

**Production impact**: the user gets no ingest-completion signal, and some pages are not searchable.

### §1.4 Bug #13: schema.models single value (UX)

`kebab schema` reports a single value per field:

```json
{
  "chunker_version": "md-heading-v1",
  "parser_version": "md-frontmatter-v2",
  ...
}
```

However, multiple versions are active in the corpus:
- parsers: `md-frontmatter-v2`, `pdf-text-v1`, `code-rust-v1`, `code-python-v1`, `none-v1`.
- chunkers: `md-heading-v1`, `pdf-page-v1.1`, `code-rust-ast-v1`, `code-python-ast-v1`, `dockerfile-file-v1`, `k8s-manifest-resource-v1`, `manifest-file-v1`, `code-text-paragraph-v1`.

**Impact**: users cannot fully identify the active versions from `kebab schema`, with a risk of omissions during a version cascade audit.

### §1.5 Bug #14: empty query silent (Minor UX)

```bash
kebab search "" --json
# exit=0, {"hits":[],"next_cursor":null,"schema_version":"search_response.v1"}
```

An empty (or whitespace-only) query silently returns 0 hits. This is a user mistake, so an explicit error is the consistent response.

---

## §2 Scope + non-scope

### §2.1 Included: 5 bug fix

| Bug | Category | Severity | Fix type |
|-----|----------|----------|----------|
| #9 | wire schema | critical | capability flag hardcoded boolean → actual feature check |
| #10 | config UX | medium | silent fallback → error.v1 with config_not_found |
| #11 | OCR config | critical | default 600s → 60s timeout |
| #13 | wire schema | medium | single field → additive array fields (backward compat) |
| #14 | input validation | minor | empty query silent → error.v1 with invalid_input |

### §2.2 Out of scope

- **Bug #12 (falsified)**: `inspect doc` showed `blocks[].text` as a "?" placeholder for the code parser. Root cause: the content does not live in `.text`; the `.code` field is emitted correctly. The user workflow can reach it through `.code` → outside the scope of this spec.
- Other axes from dogfood report §12 (ranking bias, multi-root caveat) → separate phase.

---

## §3 Decisions

### §3.1 Bug #9: capabilities correction

**Decision**: update the two fields in `schema.rs::capabilities_snapshot()` to `true`.

```rust
fn capabilities_snapshot() -> Capabilities {
    Capabilities {
        json_mode: true,
        ingest_progress: true,
        ingest_cancellation: true,
        rag_multi_turn: true,
        search_cache: true,
        incremental_ingest: true,
        streaming_ask: true,       // ← WAS FALSE, actual TRUE
        http_daemon: false,        // ← preserved (not-impl, separate sub-item)
        mcp_server: true,
        single_file_ingest: true,  // ← WAS FALSE, actual TRUE
        bulk_search: true,
    }
}
```

**Rationale**: the actual implementation supports production-grade streaming ask + single-file ingest. The schema report has to match reality for agent routing to be accurate.

### §3.2 Bug #10: config_not_found error

**Decision**: `kebab-config` defines its own error type `ConfigNotFound`, and `kebab-app::error_wire` adds a classify arm.

Pseudo-code:
```rust
// crates/kebab-config/src/lib.rs (or an appropriate error module)
use std::path::{Path, PathBuf};

#[derive(Debug, thiserror::Error)]
#[error("config file does not exist: {path}")]
pub struct ConfigNotFound {
    pub path: PathBuf,
}

// Inside Config::load:
impl Config {
    pub fn load(opt_path: Option<&Path>) -> anyhow::Result<Config> {
        match opt_path {
            // An explicit path that does not exist is an error, never a silent fallback.
            Some(p) if !p.exists() => Err(anyhow::Error::new(ConfigNotFound { path: p.to_path_buf() })),
            Some(p) => Self::from_file(p),
            // No --config given: XDG default, then built-in defaults.
            None => Self::from_xdg_default_or_defaults(),
        }
    }
}
```

Classify arm in `kebab-app/src/error_wire.rs`:
```rust
if let Some(e) = err.downcast_ref::<kebab_config::ConfigNotFound>() {
    return ErrorV1 {
        schema_version: ERROR_V1_ID.to_string(),
        code: "config_not_found".to_string(),
        message: format!("config file does not exist: {}", e.path.display()),
        details: json!({ "path": e.path }),
        hint: Some("verify --config argument; use --config to point to a writable toml file, or omit to use XDG default".to_string()),
    };
}
```

**Exit code**: 2 (config error, instead of a silent 0).

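
The three-way decision above can be exercised without the rest of the crate. The following is a minimal, std-only sketch under stated assumptions: `ConfigSource` and `resolve_source` are hypothetical names introduced for illustration, and `ConfigNotFound` is re-declared by hand here (with a manual `Display` impl) instead of deriving `thiserror::Error`.

```rust
use std::fmt;
use std::path::{Path, PathBuf};

/// Stand-in for the spec's `ConfigNotFound`, written without `thiserror`.
#[derive(Debug, PartialEq, Eq)]
pub struct ConfigNotFound {
    pub path: PathBuf,
}

impl fmt::Display for ConfigNotFound {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "config file does not exist: {}", self.path.display())
    }
}

impl std::error::Error for ConfigNotFound {}

/// Hypothetical result of deciding where configuration should come from.
#[derive(Debug, PartialEq, Eq)]
pub enum ConfigSource {
    /// `--config <path>` was passed and the file exists.
    Explicit(PathBuf),
    /// No `--config`: fall back to the XDG default or built-in defaults.
    XdgDefaultOrBuiltin,
}

/// Fails fast only when an explicit path is missing; the implicit
/// fallback (no `--config`) stays untouched.
pub fn resolve_source(opt_path: Option<&Path>) -> Result<ConfigSource, ConfigNotFound> {
    match opt_path {
        Some(p) if !p.exists() => Err(ConfigNotFound { path: p.to_path_buf() }),
        Some(p) => Ok(ConfigSource::Explicit(p.to_path_buf())),
        None => Ok(ConfigSource::XdgDefaultOrBuiltin),
    }
}
```

The point of the sketch is the asymmetry: a missing explicit path is always an error, while omitting the path keeps the existing silent fallback.
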
### §3.3 Bug #11: OCR timeout 60s

**Decision**: reduce `default_pdf_ocr_request_timeout_secs()` from 600 to 60.

```rust
fn default_pdf_ocr_request_timeout_secs() -> u64 {
    60  // 1 min, production-friendly per dogfood evidence
}
```

**Doc-comment addition**:
```rust
/// Default OCR request timeout in seconds. Most pages complete in 6-32s.
/// 60s is intended as an upper bound on valid throughput; exceeding it may
/// indicate Ollama unavailability or very dense/high-res pages.
/// Override via `[pdf.ocr] request_timeout_secs = N` in config.toml.
```

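
As a rough illustration of what the new default changes, the sketch below models the override rule and the worst-case wait for pages that hit the timeout. `effective_ocr_timeout` and `worst_case_wait` are hypothetical helpers written for this example; only the default function mirrors the decision above.

```rust
use std::time::Duration;

/// Mirrors the default proposed in this section.
fn default_pdf_ocr_request_timeout_secs() -> u64 {
    60
}

/// Hypothetical helper: an explicit `[pdf.ocr] request_timeout_secs`
/// value wins over the built-in default.
pub fn effective_ocr_timeout(configured_secs: Option<u64>) -> Duration {
    Duration::from_secs(configured_secs.unwrap_or_else(default_pdf_ocr_request_timeout_secs))
}

/// Upper bound on the time spent on pages that run into the timeout.
pub fn worst_case_wait(timed_out_pages: u32, timeout: Duration) -> Duration {
    timeout * timed_out_pages
}
```

For the two pages that timed out in §1.3, the old default gives 2 × 600s = 1200s (the 20 minutes reported), while the new default bounds the same failure at 2 × 60s = 120s.
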
### §3.4 Bug #13: active_parsers + active_chunkers (additive)

**Decision**: additive minor change to the wire schema. Two arrays are added to the `Models` struct and the existing single fields are kept (backward compat). `kebab-store-sqlite` provides the fetch methods.

**Store API** (crates/kebab-store-sqlite/src/lib.rs):
```rust
impl SqliteStore {
    /// SELECT DISTINCT parser_version FROM documents WHERE parser_version IS NOT NULL ORDER BY parser_version
    pub fn fetch_distinct_parser_versions(&self) -> anyhow::Result<Vec<String>> {
        let conn = self.conn()?;
        let mut stmt = conn.prepare(
            "SELECT DISTINCT parser_version FROM documents
             WHERE parser_version IS NOT NULL
             ORDER BY parser_version"
        )?;
        let rows = stmt.query_map([], |row| row.get::<_, String>(0))?;
        let mut out = Vec::new();
        for r in rows { out.push(r?); }
        Ok(out)
    }

    pub fn fetch_distinct_chunker_versions(&self) -> anyhow::Result<Vec<String>> {
        let conn = self.conn()?;
        let mut stmt = conn.prepare(
            "SELECT DISTINCT chunker_version FROM chunks
             WHERE chunker_version IS NOT NULL
             ORDER BY chunker_version"
        )?;
        let rows = stmt.query_map([], |row| row.get::<_, String>(0))?;
        let mut out = Vec::new();
        for r in rows { out.push(r?); }
        Ok(out)
    }
}
```

**Models struct** (crates/kebab-app/src/schema.rs):
```rust
pub struct Models {
    /// Deprecated since v0.20.1. Use active_parsers for multi-parser corpus.
    /// Reports default parser version (markdown path).
    pub parser_version: String,

    /// Deprecated since v0.20.1. Use active_chunkers for multi-chunker corpus.
    pub chunker_version: String,

    /// All parser versions active in corpus (v0.20.1+). May be empty if corpus is empty.
    pub active_parsers: Vec<String>,

    /// All chunker versions active in corpus (v0.20.1+). May be empty if corpus is empty.
    pub active_chunkers: Vec<String>,

    pub embedding_version: String,
    pub prompt_template_version: String,
    pub index_version: String,
    pub corpus_revision: u64,
}
```

**Computation** (crates/kebab-app/src/schema.rs::collect_models):
```rust
let store = open_store_for_stats(cfg)?;
let active_parsers = store.fetch_distinct_parser_versions().unwrap_or_default();
let active_chunkers = store.fetch_distinct_chunker_versions().unwrap_or_default();

Ok(Models {
    parser_version: active_parsers.first().cloned().unwrap_or_else(|| kebab_parse_md::PARSER_VERSION.to_string()),
    chunker_version: active_chunkers.first().cloned().unwrap_or_else(|| kebab_chunk::md_heading_v1::VERSION_LABEL.to_string()),
    active_parsers,
    active_chunkers,
    ...
})
```

**Fallback**: the markdown fallback is kept. The existing hardcoded `parser_version` + `chunker_version` values are preserved (backward compat).

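
The interaction between the sorted arrays and the `first()`-based single fields can be checked in isolation. The sketch below is std-only and makes two assumptions: `distinct_sorted` emulates the `SELECT DISTINCT ... IS NOT NULL ... ORDER BY` query in memory using byte-wise ordering (which is what SQLite's default BINARY collation does for text), and `MD_PARSER_DEFAULT` is a stand-in constant taken from the example label in §1.4. Both names, and `legacy_parser_version`, are hypothetical.

```rust
use std::collections::BTreeSet;

/// Assumed markdown default label, taken from the example in §1.4.
const MD_PARSER_DEFAULT: &str = "md-frontmatter-v2";

/// Emulates `SELECT DISTINCT col ... WHERE col IS NOT NULL ORDER BY col`.
pub fn distinct_sorted(rows: &[Option<&str>]) -> Vec<String> {
    rows.iter()
        .flatten() // drop NULLs
        .map(|s| s.to_string())
        .collect::<BTreeSet<String>>() // dedupe and sort byte-wise
        .into_iter()
        .collect()
}

/// The single-field value as derived in the computation snippet:
/// first active label, else the markdown default.
pub fn legacy_parser_version(active: &[String]) -> String {
    active
        .first()
        .cloned()
        .unwrap_or_else(|| MD_PARSER_DEFAULT.to_string())
}
```

One thing the sketch makes visible: because the array is sorted, `first()` yields the lexicographically smallest label, which equals the markdown default only when no smaller label is present in the corpus. If the deprecated fields are meant to keep reporting the markdown defaults, they may need to stay hardcoded rather than derive from `first()`.
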
### §3.5 Bug #14: empty query validation

**Decision**: add an empty-query check + error.v1 emission to both the `search` and `ask` commands.

**Search command** (crates/kebab-cli/src/main.rs::search arm):
```rust
if let Some(q) = query.as_ref() {
    if q.trim().is_empty() {
        return Err(anyhow::Error::new(kebab_app::StructuredError(ErrorV1 {
            schema_version: ERROR_V1_ID.to_string(),
            code: "invalid_input".to_string(),
            message: "query is empty; provide a non-empty search term or use --bulk".into(),
            details: Value::Null,
            hint: Some("e.g. `kebab search 'rust async'` or `kebab search --bulk < queries.ndjson`".into()),
        })));
    }
}
```

**Ask command** (crates/kebab-cli/src/main.rs::ask arm):
```rust
if query.trim().is_empty() {
    return Err(anyhow::Error::new(kebab_app::StructuredError(ErrorV1 {
        schema_version: ERROR_V1_ID.to_string(),
        code: "invalid_input".to_string(),
        message: "query is empty; provide a non-empty prompt".into(),
        details: Value::Null,
        hint: Some("e.g. `kebab ask 'explain this code'`".into()),
    })));
}
```

Both commands now validate; no silent fallback.

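
The check shared by both arms reduces to a small pure function, which is also the easiest piece to unit-test. A minimal sketch, assuming a dependency-free stand-in for the error payload: `InvalidInput` and `validate_query` are hypothetical names, and returning the trimmed query is a choice made for this example rather than something the spec requires.

```rust
/// Hypothetical, dependency-free stand-in for the `invalid_input` payload.
#[derive(Debug, PartialEq, Eq)]
pub struct InvalidInput {
    pub code: &'static str,
    pub message: &'static str,
}

/// Returns the trimmed query, or an `invalid_input` error when the
/// input is empty or whitespace-only.
pub fn validate_query(raw: &str) -> Result<&str, InvalidInput> {
    let trimmed = raw.trim();
    if trimmed.is_empty() {
        return Err(InvalidInput {
            code: "invalid_input",
            message: "query is empty",
        });
    }
    Ok(trimmed)
}
```

Because `str::trim` strips all Unicode whitespace, tabs and newlines are rejected along with plain spaces.
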
---

## §4 Implementation specification

### §4.1 Files to modify

1. **Bug #9 capability fix**: `crates/kebab-app/src/schema.rs`
   - line 137–151: `capabilities_snapshot()`: flip `streaming_ask: false` → `true`, `single_file_ingest: false` → `true`.
   - add test: `capabilities_streaming_ask_matches_cli_surface()`.
   - add test: `capabilities_single_file_ingest_matches_cli_surface()`.

2. **Bug #10 config_not_found**: Two files
   - `crates/kebab-config/src/lib.rs`:
     - Define `ConfigNotFound` error struct (with `#[derive(Debug, thiserror::Error)]`).
     - Modify `Config::load(opt_path: Option<&Path>)`: path existence check, `return Err(anyhow::Error::new(ConfigNotFound { ... }))`.
     - add test: `config_load_explicit_nonexistent_path_returns_error()`.
   - `crates/kebab-app/src/error_wire.rs`:
     - Add classify arm after existing `ConfigInvalid` case.
     - Map `kebab_config::ConfigNotFound` → `ErrorV1 { code: "config_not_found", ... }`.

3. **Bug #13 schema.models**: Three components
   - `crates/kebab-store-sqlite/src/lib.rs`:
     - Implement `fetch_distinct_parser_versions()`: SQL SELECT DISTINCT on documents.parser_version + ORDER BY.
     - Implement `fetch_distinct_chunker_versions()`: SQL SELECT DISTINCT on chunks.chunker_version + ORDER BY.
   - `crates/kebab-app/src/schema.rs`:
     - Modify `Models` struct: add `active_parsers: Vec<String>`, `active_chunkers: Vec<String>` fields.
     - Modify computation logic (`collect_models` or equiv): call store methods, populate arrays, fallback to markdown defaults for single fields.
     - add test: `schema_models_active_arrays_empty_on_empty_corpus()`.
     - add test: `schema_models_active_arrays_populated_after_mixed_ingest()`.
   - `docs/wire-schema/v1/schema.schema.json`:
     - `Models` object: add `"active_parsers": { "type": "array", "items": { "type": "string" } }`.
     - add `"active_chunkers": { "type": "array", "items": { "type": "string" } }`.
     - Mark deprecated in comment: `parser_version` + `chunker_version` (additive, backward compat).

4. **Bug #14 empty query validation**: `crates/kebab-cli/src/main.rs`
   - search command arm: add `if query.trim().is_empty()` check → error.v1 code=invalid_input.
   - ask command arm: add identical `if query.trim().is_empty()` check → error.v1 code=invalid_input.

5. **Wire schema v1 doc update**: `docs/wire-schema/v1/`
   - Update schema doc to note `active_parsers` / `active_chunkers` optional (additive).

6. **Integration**: `integrations/claude-code/kebab/SKILL.md`
   - Update `schema.models` surface docs: reference new `active_*` arrays for multi-version corpora.

7. **Tests** (new or extended):
   - `crates/kebab-cli/tests/`: invalid --config path (absolute + relative) → error.v1 + exit≠0.
   - `crates/kebab-cli/tests/`: empty query (search + ask) → error.v1 code=invalid_input + exit≠0.
   - `crates/kebab-config/tests/`: config file not found → ConfigNotFound error.
   - `crates/kebab-app/tests/`: mixed corpus schema: active_parsers/chunkers include all ingested versions.

### §4.2 Regression checks

- Existing 1350 workspace tests: `cargo test --workspace --no-fail-fast -j 1` must pass green.
- All non-bug capabilities (json_mode, ingest_progress, ingest_cancellation, rag_multi_turn, search_cache, incremental_ingest, mcp_server, bulk_search) stay true.
- Default config path resolution (no --config) unchanged: silent fallback to XDG only if `--config` is not passed.
- Relative path behavior (cwd-relative, Rust std path::Path::exists()) preserved.
- Empty corpus → empty `active_parsers` / `active_chunkers` array (not null, not error).
- Existing hardcoded `parser_version` + `chunker_version` fields continue to report markdown defaults (backward compat).
- Schema version bump not required (wire schema additive minor, backward compat).

---

## §5 Acceptance criteria

| # | Criterion | Evidence |
|----|-----------|----------|
| AC-1 | `kebab schema --json` emits `streaming_ask: true` + `single_file_ingest: true` | `cargo test -p kebab-app capabilities_* -j 4` green |
| AC-2 | `kebab search "x" --config /nonexistent.toml --json` emits exit≠0 + error.v1 code=config_not_found | `cargo test -p kebab-config config_load_explicit_nonexistent_path_returns_error -j 4` green |
| AC-3 | Default `pdf.ocr.request_timeout_secs` is 60s | `cargo test -p kebab-config pdf_ocr_request_timeout_default_is_60s -j 4` green (unit test confirms the default, no manual timing) |
| AC-4 | After mixed ingest (MD + PDF + code), `kebab schema --json` emits both `active_parsers` + `active_chunkers` arrays containing all versions | integration test pass |
| AC-5 | `kebab search "" --json` and `kebab search " " --json` both emit exit≠0 + error.v1 code=invalid_input | integration test pass |
| AC-6 | `kebab ask "" --json` emits exit≠0 + error.v1 code=invalid_input (ask symmetry) | integration test pass |
| AC-7 | `kebab search "rust" --config nonexistent-relative.toml --json` (relative path) emits exit≠0 + error.v1 code=config_not_found | integration test pass |
| AC-8 | All 1350+ workspace tests pass; no new failures | `cargo test --workspace --no-fail-fast -j 1` exit=0 |
| AC-9 | Wire schema backward compat: old clients reading `parser_version` + `chunker_version` still work; `active_*` arrays optional per schema | JSON schema `additionalProperties: false` review |
| AC-10 | `kebab ask --stream` still works; streaming events emitted (no regression) | manual `kebab ask --stream "explain this" 2>&1 \| head -3` |

---

## §6 Risks + resolutions

### Risks

- **R-1** (Bug #10): Relative path `./config.toml` must resolve from cwd, not from binary location. **Resolution**: Rust `std::path::Path::exists()` is cwd-relative; no workaround needed.
- **R-2** (Bug #13): Empty corpus → empty `active_parsers` / `active_chunkers` array. **Resolution**: Unit test `schema_models_active_arrays_empty_on_empty_corpus()` mandated (AC-4).
- **R-3** (resolved): `collect_models` uses no cache (every-call re-computation). `active_parsers/chunkers` reflect corpus state at invocation time. If future caching is added, a `corpus_revision` increment signals invalidation; document it at that time.
- **R-4** (Bug #14): `ask` command validation is covered by the same fix (§3.5 mandates both search + ask).
- **R-5** (Bug #11): 60s may still timeout on very dense/high-res pages. **Mitigation**: User can override via `config.toml [pdf.ocr] request_timeout_secs = N`. Release notes explicitly call this out.

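
R-1 relies on relative paths being resolved against the process working directory. A small std-only check of that behavior is sketched below; `exists_both_ways` is a hypothetical helper name, and the comparison assumes the working directory does not change between the two calls.

```rust
use std::env;
use std::io;
use std::path::Path;

/// Returns `(relative check, explicit cwd-joined check)` for one path.
/// For a relative input, both values are expected to agree.
pub fn exists_both_ways(relative: &Path) -> io::Result<(bool, bool)> {
    // Resolved implicitly against the current working directory.
    let via_relative = relative.exists();
    // Resolved explicitly by joining onto the current working directory.
    let via_cwd = env::current_dir()?.join(relative).exists();
    Ok((via_relative, via_cwd))
}
```

Neither check involves the location of the binary, which is the property R-1 depends on.
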
---

## §7 Parent spec deviation (HOTFIXES handoff)

**F-11 MEDIUM finding**: parent spec `2026-04-27-kebab-final-form-design.md` (frozen) specifies PDF OCR request_timeout_secs = 600s (§1000 + §1628 OQ-1, rationale: "5x headroom over the 105s measured in a CPU environment"). Bug #11 (dogfood evidence) contradicts this: with 600s, a stalled page blocks for the full 10 minutes, while 60s is production-optimal.

**Deviation handling**:
1. Parent spec stays frozen (no edits).
2. **HOTFIXES entry (executor Step N)**: `tasks/HOTFIXES.md` receives dated entry:
   ```markdown
   2026-05-27 — PDF OCR request_timeout_secs default 600s → 60s (v0.20.0 bugfix3 dogfood evidence). Bug #11.
   ```
3. **Parent spec cross-link (executor Step N)**: parent spec `2026-04-27-kebab-final-form-design.md` receives inline comment at §1000 (default value code block) or §1628 (OQ-1 paragraph):
   ```markdown
   <!-- HOTFIX 2026-05-27: default 60s (Bug #11). See tasks/HOTFIXES.md 2026-05-27 entry. -->
   ```

**Parent spec invariant**: No changes to parent spec text; only cross-link comment + HOTFIXES.md entry. Frozen design contract preserved.

---

## §8 References

- [Dogfood report](../../../.omc/reviews/2026-05-27-v0.20-final-dogfood-report.md): 5 bugs discovered + decisions.
- [Parent spec (frozen contract)](2026-04-27-kebab-final-form-design.md): §1, §2, §4 (capabilities, error handling, JSON schema, config XDG).
- `crates/kebab-app/src/schema.rs:137–151` (capabilities_snapshot).
- `crates/kebab-config/src/lib.rs` (Config::load, default_pdf_ocr_request_timeout_secs).
- `crates/kebab-app/src/error_wire.rs` (classify ConfigNotFound).
- `crates/kebab-store-sqlite/src/lib.rs` (fetch_distinct_parser_versions, fetch_distinct_chunker_versions).
- `crates/kebab-cli/src/main.rs` (search + ask query validation).
- `docs/wire-schema/v1/schema.schema.json` (Models + Capabilities objects).
- `tasks/HOTFIXES.md` (2026-05-27 entry, Bug #11 deviation record).