kebab/docs/superpowers/specs/2026-05-27-v0.20-sub1-bugfix3-spec.md
altair823 46e99470eb docs(superpowers): v0.20 sub-item 1 bugfix1/2/3 specs + plans + DOGFOOD.md
Artifacts of a 3-round dogfood-driven fix cycle:

- bugfix1 (Bug #2/#3/#4): spec 964 lines + plan 848 lines
- bugfix2 (Bug #6/#7, #8 falsified): spec 308 lines + plan 388 lines
- bugfix3 (Bug #9/#10/#11/#13/#14, #12 falsified): spec 410 lines + plan 1043 lines
- docs/DOGFOOD.md: the full, comprehensive dogfood checklist (§0 environment through §13 reference corpus)

Each round's spec/plan is frozen after critic + verifier round 2 closure ACCEPT. Based on dogfood-driven evidence.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-28 01:21:34 +00:00

---
title: "v0.20.0 sub-item 1 bugfix round 3 — final-dogfood findings"
created: 2026-05-27
status: DRAFT
round: 1c
parent_spec: 2026-04-27-kebab-final-form-design.md
contract_sections:
- "1.1 (ask streaming)"
- "2.2 (error handling)"
- "2.4 (JSON wire schema)"
- "3.1 (config XDG)"
- "4.1 (capabilities schema)"
source_report: .omc/reviews/2026-05-27-v0.20-final-dogfood-report.md
---
# v0.20.0 sub-item 1 bugfix round 3 — final-dogfood findings
Fix design for the **5 bugs** found in the post-bugfix2 final dogfood (2026-05-27). PR #189 force-update (base=main). Spec scope: root cause + fix decision + acceptance criteria + parent spec traceability. Bug #12 is falsified (out of scope). All 5 fixes range from trivial to a small refactor (existing 1350 tests + 5+ additional tests).
---
## §1 Problem statement
### §1.1 Bug #9: capabilities false negative (Critical)
In `kebab schema --json`, `capabilities.streaming_ask` and `capabilities.single_file_ingest` are both hardcoded to `false`. The actual implementation, however:
- `kebab ask --stream` → emits `answer_event.v1` ndjson events correctly (191 events verified).
- `kebab ingest-file <path>` → emits `ingest_report.v1` (new/updated) correctly.
- `kebab ingest-stdin --title <T>` → works correctly.
**Impact**: agents such as an MCP host or a Claude Code skill see `capabilities: { streaming_ask: false, single_file_ingest: false }` and get a false negative when making routing decisions. Users are misled into thinking that features which actually work are unavailable.
### §1.2 Bug #10: config fail-fast (UX)
```bash
kebab search "rust" --config /tmp/nonexistent.toml --json
# exit=0, {"hits":[],"schema_version":"search_response.v1"}
```
When the explicit path is missing, kebab silently falls back to the default config (XDG path). This is a debugging nightmare: a typo or wrong path surfaces only as 0 hits.
### §1.3 Bug #11: OCR timeout 600s (Critical UX)
`config.pdf.ocr.request_timeout_secs = 600` (default of 10 minutes per page). Evidence from the metro-korea.pdf dogfood:
- On page 8 and page 13, slow responses from the remote Ollama ran into the full 600s timeout.
- Result: `ms: 600000, chars: 0, skipped: true` emitted → body text not indexed + 20 minutes of wasted cost.
**Production impact**: the user receives no ingest-completion signal, and some pages are not searchable.
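
The 20-minute figure above is plain arithmetic: every timed-out page blocks for the full request timeout. A minimal sketch of that calculation, where `worst_case_wait` is a hypothetical helper name used only for illustration (it is not part of kebab):

```rust
use std::time::Duration;

/// Hypothetical helper: wall-clock time lost when `timed_out_pages` pages
/// each block for the full per-request timeout.
fn worst_case_wait(timed_out_pages: u32, timeout_secs: u64) -> Duration {
    Duration::from_secs(timeout_secs) * timed_out_pages
}
```

For the two pages observed in the dogfood, `worst_case_wait(2, 600)` is 1200 seconds (20 minutes), whereas a 60-second timeout would bound the same failure mode at 120 seconds.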
### §1.4 Bug #13: schema.models single value (UX)
```json
{
"chunker_version": "md-heading-v1",
"parser_version": "md-frontmatter-v2",
...
}
```
However, multiple versions are active in the corpus:
- parsers: `md-frontmatter-v2`, `pdf-text-v1`, `code-rust-v1`, `code-python-v1`, `none-v1`.
- chunkers: `md-heading-v1`, `pdf-page-v1.1`, `code-rust-ast-v1`, `code-python-ast-v1`, `dockerfile-file-v1`, `k8s-manifest-resource-v1`, `manifest-file-v1`, `code-text-paragraph-v1`.
**Impact**: users looking at `kebab schema` cannot fully identify the active versions, which risks omissions during a version cascade audit.
### §1.5 Bug #14: empty query silent (Minor UX)
```bash
kebab search "" --json
# exit=0, {"hits":[],"next_cursor":null,"schema_version":"search_response.v1"}
```
An empty (or whitespace-only) query silently returns 0 hits. This is a user mistake, so an explicit error is the consistent behavior.
---
## §2 Scope + non-scope
### §2.1 Included: 5 bug fix
| Bug | Category | Severity | Fix type |
|-----|----------|----------|----------|
| #9 | wire schema | critical | capability flag hardcoded boolean → actual feature check |
| #10 | config UX | medium | silent fallback → error.v1 with config_not_found |
| #11 | OCR config | critical | default 600s → 60s timeout |
| #13 | wire schema | medium | single field → additive array fields (backward compat) |
| #14 | input validation | minor | empty query silent → error.v1 with invalid_input |
### §2.2 Out of scope
- **Bug #12 (falsified)**: `inspect doc` blocks[].text shows a "?" placeholder under the code parser. Root cause: the content is not in `.text`; the `.code` field is emitted correctly. The user workflow can access it via `.code` → outside the spec scope.
- Other axes from dogfood report §12 (ranking bias, multi-root caveat) → separate phase.
---
## §3 Decisions
### §3.1 Bug #9: capabilities correction
**Decision**: update the two fields in `schema.rs::capabilities_snapshot()` to true.
```rust
fn capabilities_snapshot() -> Capabilities {
    Capabilities {
        json_mode: true,
        ingest_progress: true,
        ingest_cancellation: true,
        rag_multi_turn: true,
        search_cache: true,
        incremental_ingest: true,
        streaming_ask: true, // ← WAS FALSE, actual TRUE
        http_daemon: false, // ← preserved (not-impl, separate sub-item)
        mcp_server: true,
        single_file_ingest: true, // ← WAS FALSE, actual TRUE
        bulk_search: true,
    }
}
```
**Rationale**: the actual implementation supports production-grade streaming ask + single-file ingest. The schema report must match reality for agent routing to be accurate.
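
One way to keep the report and reality from drifting apart again is a check that compares the flags against the CLI surface. The sketch below is only an illustration under stated assumptions: `Capabilities` is trimmed to the two affected flags, and `CLI_SURFACE`, `cli_exposes`, and `capabilities_match_cli_surface` are hypothetical names, not existing kebab items.

```rust
/// Trimmed-down stand-in for the real `Capabilities` struct (only the two
/// flags touched by Bug #9).
struct Capabilities {
    streaming_ask: bool,
    single_file_ingest: bool,
}

/// ASSUMPTION: a hand-maintained list of CLI entry points; a real test
/// might instead inspect the argument parser.
const CLI_SURFACE: &[&str] = &["ask --stream", "ingest-file", "ingest-stdin"];

fn cli_exposes(entry: &str) -> bool {
    CLI_SURFACE.contains(&entry)
}

/// True when the reported flags agree with what the CLI actually offers.
fn capabilities_match_cli_surface(caps: &Capabilities) -> bool {
    caps.streaming_ask == cli_exposes("ask --stream")
        && caps.single_file_ingest == cli_exposes("ingest-file")
}
```

With the pre-fix values (`false`, `false`) this check fails, which is exactly the false negative described in §1.1.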
### §3.2 Bug #10: config_not_found error
**Decision**: `kebab-config` defines its own error type `ConfigNotFound`, and `kebab-app::error_wire` adds a classify arm.
Pseudo-code:
```rust
// crates/kebab-config/src/lib.rs (or an appropriate error module)
use std::path::{Path, PathBuf};

#[derive(Debug, thiserror::Error)]
// PathBuf does not implement Display, so format it via `.display()`.
#[error("config file does not exist: {}", .path.display())]
pub struct ConfigNotFound {
    pub path: PathBuf,
}

// Inside Config::load:
impl Config {
    pub fn load(opt_path: Option<&Path>) -> anyhow::Result<Config> {
        match opt_path {
            // An explicit path that does not exist is an error, never a fallback.
            Some(p) if !p.exists() => Err(anyhow::Error::new(ConfigNotFound { path: p.to_path_buf() })),
            Some(p) => Self::from_file(p),
            None => Self::from_xdg_default_or_defaults(),
        }
    }
}
```
Classify arm in `kebab-app/src/error_wire.rs`:
```rust
if let Some(e) = err.downcast_ref::<kebab_config::ConfigNotFound>() {
    return ErrorV1 {
        schema_version: ERROR_V1_ID.to_string(),
        code: "config_not_found".to_string(),
        message: format!("config file does not exist: {}", e.path.display()),
        // Stringify via display(): serializing a non-UTF-8 PathBuf inside json! would panic.
        details: json!({ "path": e.path.display().to_string() }),
        hint: Some("verify --config argument; use --config to point to a writable toml file, or omit to use XDG default".to_string()),
    };
}
```
**Exit code**: 2 (config error, not 0 silent).
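
The fail-fast check and the exit-code mapping can be sketched with the standard library alone. This is a minimal illustration, not the kebab implementation: it swaps `anyhow`/`thiserror` for `Box<dyn Error>` and a manual `Display` impl, and `check_explicit_config` and `exit_code_for` are hypothetical helper names. The exit code `1` for all other errors is an assumption; only `2` for config errors comes from the text above.

```rust
use std::error::Error;
use std::fmt;
use std::path::{Path, PathBuf};

#[derive(Debug)]
struct ConfigNotFound {
    path: PathBuf,
}

impl fmt::Display for ConfigNotFound {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "config file does not exist: {}", self.path.display())
    }
}

impl Error for ConfigNotFound {}

/// Fail fast only when the caller passed an explicit path that is missing;
/// `None` (no --config) keeps the default-resolution behavior.
fn check_explicit_config(opt_path: Option<&Path>) -> Result<(), Box<dyn Error>> {
    match opt_path {
        Some(p) if !p.exists() => Err(Box::new(ConfigNotFound { path: p.to_path_buf() })),
        _ => Ok(()),
    }
}

/// Hypothetical mapping: 2 for a missing config, 1 (assumed) for anything else.
fn exit_code_for(err: &(dyn Error + 'static)) -> i32 {
    if err.downcast_ref::<ConfigNotFound>().is_some() {
        2
    } else {
        1
    }
}
```

A caller would run `check_explicit_config` before any search work starts and pass the resulting error to `exit_code_for`, so a typo in `--config` can no longer surface as an empty result set with exit 0.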
### §3.3 Bug #11: OCR timeout 60s
**Decision**: reduce `default_pdf_ocr_request_timeout_secs()` from 600 to 60.
```rust
fn default_pdf_ocr_request_timeout_secs() -> u64 {
    60 // 1 min, production-friendly per dogfood evidence
}
```
**Doc-comment addition**:
```rust
/// Default OCR request timeout in seconds. Most pages complete in 6-32s.
/// Set to upper-bound valid throughput; exceeding 60s may indicate
/// Ollama unavailability or very dense/high-res pages.
/// Override via [pdf.ocr] request_timeout_secs = N in config.toml.
```
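
How the default and the `[pdf.ocr]` override combine can be sketched as follows. This is an illustration under assumptions: `PdfOcrConfig` is a hypothetical, already-parsed view of the config section (with `None` meaning the key is absent), and `effective_ocr_timeout` is not an existing kebab function.

```rust
use std::time::Duration;

fn default_pdf_ocr_request_timeout_secs() -> u64 {
    60
}

/// Hypothetical parsed view of `[pdf.ocr]`; `None` means the key was absent.
struct PdfOcrConfig {
    request_timeout_secs: Option<u64>,
}

/// The user's override wins; otherwise the 60s default applies.
fn effective_ocr_timeout(cfg: &PdfOcrConfig) -> Duration {
    Duration::from_secs(
        cfg.request_timeout_secs
            .unwrap_or_else(default_pdf_ocr_request_timeout_secs),
    )
}
```

A user with very dense pages who sets `request_timeout_secs = 600` keeps the old behavior, while everyone else gets the shorter bound.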
### §3.4 Bug #13: active_parsers + active_chunkers (additive)
**Decision**: additive minor change to the wire schema: add two arrays to the `Models` struct and keep the existing single fields (backward compat). `kebab-store-sqlite` provides the fetch methods.
**Store API** (crates/kebab-store-sqlite/src/lib.rs):
```rust
impl SqliteStore {
    /// SELECT DISTINCT parser_version FROM documents WHERE parser_version IS NOT NULL ORDER BY parser_version
    pub fn fetch_distinct_parser_versions(&self) -> anyhow::Result<Vec<String>> {
        let conn = self.conn()?;
        let mut stmt = conn.prepare(
            "SELECT DISTINCT parser_version FROM documents
             WHERE parser_version IS NOT NULL
             ORDER BY parser_version"
        )?;
        let rows = stmt.query_map([], |row| row.get::<_, String>(0))?;
        let mut out = Vec::new();
        for r in rows { out.push(r?); }
        Ok(out)
    }

    pub fn fetch_distinct_chunker_versions(&self) -> anyhow::Result<Vec<String>> {
        let conn = self.conn()?;
        let mut stmt = conn.prepare(
            "SELECT DISTINCT chunker_version FROM chunks
             WHERE chunker_version IS NOT NULL
             ORDER BY chunker_version"
        )?;
        let rows = stmt.query_map([], |row| row.get::<_, String>(0))?;
        let mut out = Vec::new();
        for r in rows { out.push(r?); }
        Ok(out)
    }
}
```
**Models struct** (crates/kebab-app/src/schema.rs):
```rust
pub struct Models {
    /// Deprecated since v0.20.1. Use active_parsers for multi-parser corpus.
    /// Reports default parser version (markdown path).
    pub parser_version: String,
    /// Deprecated since v0.20.1. Use active_chunkers for multi-chunker corpus.
    pub chunker_version: String,
    /// All parser versions active in corpus (v0.20.1+). May be empty if corpus is empty.
    pub active_parsers: Vec<String>,
    /// All chunker versions active in corpus (v0.20.1+). May be empty if corpus is empty.
    pub active_chunkers: Vec<String>,
    pub embedding_version: String,
    pub prompt_template_version: String,
    pub index_version: String,
    pub corpus_revision: u64,
}
```
**Computation** (crates/kebab-app/src/schema.rs::collect_models):
```rust
let store = open_store_for_stats(cfg)?;
let active_parsers = store.fetch_distinct_parser_versions().unwrap_or_default();
let active_chunkers = store.fetch_distinct_chunker_versions().unwrap_or_default();
Ok(Models {
    // Deprecated single fields keep reporting the markdown defaults (backward compat).
    // They are not taken from the sorted arrays, whose first entry may be a non-markdown version.
    parser_version: kebab_parse_md::PARSER_VERSION.to_string(),
    chunker_version: kebab_chunk::md_heading_v1::VERSION_LABEL.to_string(),
    active_parsers,
    active_chunkers,
    ...
})
```
**Fallback**: the markdown fallback is retained. The existing hardcoded `parser_version` + `chunker_version` values are preserved (backward compat).
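
The semantics of the two store queries (drop NULLs, de-duplicate, sort) can be illustrated without a database. The sketch below is an in-memory stand-in under assumptions: `distinct_sorted` is a hypothetical helper, and the input slice plays the role of the `parser_version` column.

```rust
use std::collections::BTreeSet;

/// In-memory equivalent of
/// `SELECT DISTINCT v FROM t WHERE v IS NOT NULL ORDER BY v`.
fn distinct_sorted(versions: &[Option<&str>]) -> Vec<String> {
    versions
        .iter()
        .flatten() // drop NULLs
        .map(|v| v.to_string())
        .collect::<BTreeSet<_>>() // de-duplicate and sort
        .into_iter()
        .collect()
}
```

For a column containing `pdf-text-v1`, NULL, `md-frontmatter-v2`, `pdf-text-v1`, and `code-rust-v1`, the result is `["code-rust-v1", "md-frontmatter-v2", "pdf-text-v1"]`. An empty corpus yields an empty vector rather than an error. Note that in a mixed corpus the first sorted entry is not a markdown version, so position in the array carries no "default" meaning.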
### §3.5 Bug #14: empty query validation
**Decision**: add an empty-query check + error.v1 emit to both the `search` and `ask` commands.
**Search command** (crates/kebab-cli/src/main.rs::search arm):
```rust
if let Some(q) = query.as_ref() {
    if q.trim().is_empty() {
        return Err(anyhow::Error::new(kebab_app::StructuredError(ErrorV1 {
            schema_version: ERROR_V1_ID.to_string(),
            code: "invalid_input".to_string(),
            message: "query is empty; provide a non-empty search term or use --bulk".into(),
            details: Value::Null,
            hint: Some("e.g. `kebab search 'rust async'` or `kebab search --bulk < queries.ndjson`".into()),
        })));
    }
}
```
**Ask command** (crates/kebab-cli/src/main.rs::ask arm):
```rust
if query.trim().is_empty() {
    return Err(anyhow::Error::new(kebab_app::StructuredError(ErrorV1 {
        schema_version: ERROR_V1_ID.to_string(),
        code: "invalid_input".to_string(),
        message: "query is empty; provide a non-empty prompt".into(),
        details: Value::Null,
        hint: Some("e.g. `kebab ask 'explain this code'`".into()),
    })));
}
```
Both commands now validate; no silent fallback.
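
Since both arms perform the same check, it could be factored into one helper. The following is only a sketch under assumptions: `InvalidInput` is a simplified stand-in for the `ErrorV1` payload, and `validate_query` is a hypothetical function that the spec does not mandate.

```rust
/// Simplified stand-in for the error.v1 payload (code + message only).
#[derive(Debug, PartialEq)]
struct InvalidInput {
    code: &'static str,
    message: String,
}

/// Rejects empty and whitespace-only queries; `noun` only varies the
/// message ("search term" for search, "prompt" for ask).
fn validate_query(query: &str, noun: &str) -> Result<(), InvalidInput> {
    if query.trim().is_empty() {
        return Err(InvalidInput {
            code: "invalid_input",
            message: format!("query is empty; provide a non-empty {}", noun),
        });
    }
    Ok(())
}
```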
---
## §4 Implementation specification
### §4.1 Files to modify
1. **Bug #9 capability fix**: `crates/kebab-app/src/schema.rs`
   - lines 137-151, `capabilities_snapshot()`: flip `streaming_ask: false` → `true` and `single_file_ingest: false` → `true`.
   - add test: `capabilities_streaming_ask_matches_cli_surface()`.
   - add test: `capabilities_single_file_ingest_matches_cli_surface()`.
2. **Bug #10 config_not_found**: Two files
   - `crates/kebab-config/src/lib.rs`:
     - Define `ConfigNotFound` error struct (with `#[derive(Debug, thiserror::Error)]`).
     - Modify `Config::load(opt_path: Option<&Path>)`: check path existence, then `return Err(anyhow::Error::new(ConfigNotFound { ... }))`.
     - add test: `config_load_explicit_nonexistent_path_returns_error()`.
   - `crates/kebab-app/src/error_wire.rs`:
     - Add classify arm after existing `ConfigInvalid` case.
     - Map `kebab_config::ConfigNotFound` → `ErrorV1 { code: "config_not_found", ... }`.
3. **Bug #13 schema.models**: Three components
   - `crates/kebab-store-sqlite/src/lib.rs`:
     - Implement `fetch_distinct_parser_versions()`: SQL SELECT DISTINCT on documents.parser_version + ORDER BY.
     - Implement `fetch_distinct_chunker_versions()`: SQL SELECT DISTINCT on chunks.chunker_version + ORDER BY.
   - `crates/kebab-app/src/schema.rs`:
     - Modify `Models` struct: add `active_parsers: Vec<String>`, `active_chunkers: Vec<String>` fields.
     - Modify computation logic (`collect_models` or equiv): call store methods, populate arrays, fallback to markdown defaults for single fields.
     - add test: `schema_models_active_arrays_empty_on_empty_corpus()`.
     - add test: `schema_models_active_arrays_populated_after_mixed_ingest()`.
   - `docs/wire-schema/v1/schema.schema.json`:
     - `Models` object: add `"active_parsers": { "type": "array", "items": { "type": "string" } }`.
     - add `"active_chunkers": { "type": "array", "items": { "type": "string" } }`.
     - Mark deprecated in comment: `parser_version` + `chunker_version` (additive, backward compat).
4. **Bug #14 empty query validation**: `crates/kebab-cli/src/main.rs`
   - search command arm: add `if query.trim().is_empty()` check → error.v1 code=invalid_input.
   - ask command arm: add identical `if query.trim().is_empty()` check → error.v1 code=invalid_input.
5. **Wire schema v1 doc update**: `docs/wire-schema/v1/`
   - Update schema doc to note `active_parsers` / `active_chunkers` optional (additive).
6. **Integration**: `integrations/claude-code/kebab/SKILL.md`
   - Update `schema.models` surface docs: reference new `active_*` arrays for multi-version corpora.
7. **Tests** (new or extended):
   - `crates/kebab-cli/tests/`: invalid --config path (absolute + relative) → error.v1 + exit≠0.
   - `crates/kebab-cli/tests/`: empty query (search + ask) → error.v1 code=invalid_input + exit≠0.
   - `crates/kebab-config/tests/`: config file not found → ConfigNotFound error.
   - `crates/kebab-app/tests/`: mixed corpus schema, active_parsers/chunkers include all ingested versions.
### §4.2 Regression checks
- Existing 1350 workspace tests: `cargo test --workspace --no-fail-fast -j 1` must pass green.
- All non-bug capabilities (json_mode, ingest_progress, ingest_cancellation, rag_multi_turn, search_cache, incremental_ingest, mcp_server, bulk_search) stay true.
- Default config path resolution (no --config) unchanged — silent fallback to XDG only if `--config` not passed.
- Relative path behavior (cwd-relative, Rust std path::Path::exists()) preserved.
- Empty corpus → empty `active_parsers` / `active_chunkers` array (not null, not error).
- Existing hardcoded `parser_version` + `chunker_version` fields continue to report markdown defaults (backward compat).
- Schema version bump not required (wire schema additive minor, backward compat).
---
## §5 Acceptance criteria
| # | Criterion | Evidence |
|----|-----------|----------|
| AC-1 | `kebab schema --json` emits `streaming_ask: true` + `single_file_ingest: true` | `cargo test -p kebab-app capabilities_ -j 4` green (the test filter is a substring match, not a glob) |
| AC-2 | `kebab search "x" --config /nonexistent.toml --json` exits≠0 and emits error.v1 code=config_not_found | `cargo test -p kebab-config config_load_explicit_nonexistent_path_returns_error -j 4` green |
| AC-3 | Default `request_timeout_secs` = 60s, confirmed by unit test (no manual timing) | `cargo test -p kebab-config pdf_ocr_request_timeout_default_is_60s -j 4` green |
| AC-4 | After mixed ingest (MD + PDF + code), `kebab schema --json` emits both `active_parsers` + `active_chunkers` arrays containing all versions | integration test pass |
| AC-5 | `kebab search "" --json` and `kebab search " " --json` both exit≠0 and emit error.v1 code=invalid_input | integration test pass |
| AC-6 | `kebab ask "" --json` exits≠0 and emits error.v1 code=invalid_input (ask symmetry) | integration test pass |
| AC-7 | `kebab search "rust" --config nonexistent-relative.toml --json` (relative path) exits≠0 and emits error.v1 code=config_not_found | integration test pass |
| AC-8 | All 1350+ workspace tests pass; no new failures | `cargo test --workspace --no-fail-fast -j 1` exit=0 |
| AC-9 | Wire schema backward compat: old clients reading `parser_version` + `chunker_version` still work; `active_*` arrays optional per schema | JSON schema `additionalProperties: false` review |
| AC-10 | `kebab ask --stream` still works; streaming events emitted (no regression) | manual `kebab ask --stream "explain this" 2>&1 \| head -3` |
---
## §6 Risks + resolutions
### Risks
- **R-1** (Bug #10): Relative path `./config.toml` must resolve from cwd, not from binary location. **Resolution**: Rust `std::path::Path::exists()` is cwd-relative; no workaround needed.
- **R-2** (Bug #13): Empty corpus → empty `active_parsers` / `active_chunkers` array. **Resolution**: Unit test `schema_models_active_arrays_empty_on_empty_corpus()` mandated (AC-4).
- **R-3** (resolved): `collect_models` uses no cache (every-call re-computation). `active_parsers/chunkers` reflect corpus state at invocation time. If future caching is added, `corpus_revision` increment signals invalidation — document at that time.
- **R-4** (Bug #14): `ask` command validation — covered by same fix (§3.5 mandates both search + ask).
- **R-5** (Bug #11): 60s may still timeout on very dense/high-res pages. **Mitigation**: User can override via `config.toml [pdf.ocr] request_timeout_secs = N`. Release notes explicitly call this out.
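
For R-1, the cwd-relative behavior of `Path::exists()` can be made explicit. The sketch below is illustrative only; `resolve_against_cwd` is a hypothetical helper that spells out what the standard library already does implicitly for relative paths.

```rust
use std::env;
use std::io;
use std::path::{Path, PathBuf};

/// Makes the implicit rule explicit: a relative path is interpreted against
/// the process's current working directory, not the binary's location.
fn resolve_against_cwd(p: &Path) -> io::Result<PathBuf> {
    if p.is_absolute() {
        Ok(p.to_path_buf())
    } else {
        Ok(env::current_dir()?.join(p))
    }
}
```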
---
## §7 Parent spec deviation (HOTFIXES handoff)
**F-11 MEDIUM finding**: parent spec `2026-04-27-kebab-final-form-design.md` (frozen) specifies PDF OCR request_timeout_secs = 600s (§1000 + §1628 OQ-1, rationale: "5x headroom over the 105s measured in a CPU environment"). Bug #11 (dogfood evidence) contradicts this: 600s causes timeouts, while 60s is production-optimal.
**Deviation handling**:
1. Parent spec stays frozen (no edits).
2. **HOTFIXES entry (executor Step N)**: `tasks/HOTFIXES.md` receives dated entry:
```markdown
2026-05-27 — PDF OCR request_timeout_secs default 600s → 60s (v0.20.0 bugfix3 dogfood evidence). Bug #11.
```
3. **Parent spec cross-link (executor Step N)**: parent spec `2026-04-27-kebab-final-form-design.md` receives inline comment at §1000 (default value code block) or §1628 (OQ-1 paragraph):
```markdown
<!-- HOTFIX 2026-05-27: default 60s (Bug #11). See tasks/HOTFIXES.md 2026-05-27 entry. -->
```
**Parent spec invariant**: No changes to parent spec text; only cross-link comment + HOTFIXES.md entry. Frozen design contract preserved.
---
## §8 References
- [Dogfood report](../../../.omc/reviews/2026-05-27-v0.20-final-dogfood-report.md) — 5 bugs discovered + decisions.
- [Parent spec (frozen contract)](2026-04-27-kebab-final-form-design.md) — §1, §2, §4 (capabilities, error handling, JSON schema, config XDG).
- `crates/kebab-app/src/schema.rs:137-151` (capabilities_snapshot).
- `crates/kebab-config/src/lib.rs` (Config::load, default_pdf_ocr_request_timeout_secs).
- `crates/kebab-app/src/error_wire.rs` (classify ConfigNotFound).
- `crates/kebab-store-sqlite/src/lib.rs` (fetch_distinct_parser_versions, fetch_distinct_chunker_versions).
- `crates/kebab-cli/src/main.rs` (search + ask query validation).
- `docs/wire-schema/v1/schema.schema.json` (Models + Capabilities objects).
- `tasks/HOTFIXES.md` (2026-05-27 entry, Bug #11 deviation record).