kebab/docs/superpowers/specs/2026-05-27-v0.20-sub1-bugfix3-spec.md
altair823 46e99470eb docs(superpowers): v0.20 sub-item 1 bugfix1/2/3 specs + plans + DOGFOOD.md
Artifacts of the 3-round dogfood-driven fix cycle:

- bugfix1 (Bug #2/#3/#4): spec 964 lines + plan 848 lines
- bugfix2 (Bug #6/#7, #8 falsified): spec 308 lines + plan 388 lines
- bugfix3 (Bug #9/#10/#11/#13/#14, #12 falsified): spec 410 lines + plan 1043 lines
- docs/DOGFOOD.md: the complete end-to-end dogfood checklist (§0 environment through §13 reference corpus)

The spec and plan for each round were frozen after the critic + verifier round 2 closure returned ACCEPT. Based on dogfood-driven evidence.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-28 01:21:34 +00:00


title: v0.20.0 sub-item 1 bugfix round 3 — final-dogfood findings
created: 2026-05-27
status: DRAFT
round: 1c
parent_spec: 2026-04-27-kebab-final-form-design.md
contract_sections:
  - 1.1 (ask streaming)
  - 2.2 (error handling)
  - 2.4 (JSON wire schema)
  - 3.1 (config XDG)
  - 4.1 (capabilities schema)
source_report: .omc/reviews/2026-05-27-v0.20-final-dogfood-report.md

v0.20.0 sub-item 1 bugfix round 3 — final-dogfood findings

Fix design for the 5 bugs found in the post-bugfix2 final dogfood (2026-05-27). PR #189 force-update (base=main). Spec scope: root cause + fix decision + acceptance criteria + parent spec traceability. Bug #12 was falsified (out of scope). All 5 fixes range from trivial to a small refactor (existing 1350 tests + 5 or more new tests).


§1 Problem statement

§1.1 Bug #9: capabilities false negative (Critical)

In the kebab schema --json output, capabilities.streaming_ask and capabilities.single_file_ingest are both hardcoded to false. The actual implementation, however:

  • kebab ask --stream → answer_event.v1 ndjson events are emitted correctly (191 events verified).
  • kebab ingest-file <path> → ingest_report.v1 is emitted correctly for new and updated documents.
  • kebab ingest-stdin --title <T> → works correctly.

Impact: agents such as an MCP host or a Claude Code skill that read capabilities: { streaming_ask: false, single_file_ingest: false } to make routing decisions get a false negative. Users conclude that features which actually work are unavailable.
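
To make the false negative concrete, the following is a minimal sketch of how an agent might route on these flags. The `Capabilities` mirror struct, `AskRoute`, and `route_ask` are hypothetical names introduced for illustration; they are not taken from the kebab codebase or from any MCP host.

```rust
/// Hypothetical mirror of the two capability flags discussed above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Capabilities {
    pub streaming_ask: bool,
    pub single_file_ingest: bool,
}

/// Hypothetical routing decision an agent could derive from the flags.
#[derive(Debug, PartialEq, Eq)]
pub enum AskRoute {
    /// Call `kebab ask --stream` and consume ndjson events as they arrive.
    Streaming,
    /// Fall back to a single blocking `kebab ask` call.
    Blocking,
}

/// Pick the ask route purely from the advertised capability flag.
/// With the hardcoded `false`, this never returns `Streaming`, even though
/// the CLI supports it.
pub fn route_ask(caps: Capabilities) -> AskRoute {
    if caps.streaming_ask {
        AskRoute::Streaming
    } else {
        AskRoute::Blocking
    }
}
```

An agent written this way never tries the streaming path while the flag reports false, which is why the flag has to match the CLI surface.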

§1.2 Bug #10: config fail-fast (UX)

kebab search "rust" --config /tmp/nonexistent.toml --json
# exit=0, {"hits":[],"schema_version":"search_response.v1"}

When an explicit path is missing, kebab silently falls back to the default config (XDG path). This is a debugging nightmare: a typo or wrong path surfaces only as 0 hits.

§1.3 Bug #11: OCR timeout 600s (Critical UX)

config.pdf.ocr.request_timeout_secs = 600 (default of 10 minutes per page). Dogfood evidence from metro-korea.pdf:

  • On page 8 and page 13, slow responses from the remote Ollama ran into the full 600s timeout.
  • Result: ms: 600000, chars: 0, skipped: true was emitted, so the page body was not indexed and 20 minutes were wasted.

Production impact: the user gets no ingest-completion signal, and some pages are not searchable.

§1.4 Bug #13: schema.models single value (UX)

{
  "chunker_version": "md-heading-v1",
  "parser_version": "md-frontmatter-v2",
  ...
}

However, multiple versions are active in the corpus:

  • parsers: md-frontmatter-v2, pdf-text-v1, code-rust-v1, code-python-v1, none-v1.
  • chunkers: md-heading-v1, pdf-page-v1.1, code-rust-ast-v1, code-python-ast-v1, dockerfile-file-v1, k8s-manifest-resource-v1, manifest-file-v1, code-text-paragraph-v1.

Impact: a user reading kebab schema cannot fully identify the active versions, which risks omissions during a version cascade audit.

§1.5 Bug #14: empty query silent (Minor UX)

kebab search "" --json
# exit=0, {"hits":[],"next_cursor":null,"schema_version":"search_response.v1"}

An empty (or whitespace-only) query silently returns 0 hits. This is a user mistake, so an explicit error is the consistent behavior.


§2 Scope + non-scope

§2.1 Included: 5 bug fix

| Bug | Category | Severity | Fix type |
|---|---|---|---|
| #9 | wire schema | critical | capability flag hardcoded boolean → actual feature check |
| #10 | config UX | medium | silent fallback → error.v1 with config_not_found |
| #11 | OCR config | critical | default 600s → 60s timeout |
| #13 | wire schema | medium | single field → additive array fields (backward compat) |
| #14 | input validation | minor | empty query silent → error.v1 with invalid_input |

§2.2 Out of scope

  • Bug #12 (falsified): inspect doc blocks[].text shows a "?" placeholder under the code parser. Root cause: the content is not in .text; the .code field is emitted correctly. The user workflow can reach it via .code, so this is outside the spec scope.
  • Other axes from dogfood report §12 (ranking bias, multi-root caveat) → separate phase.

§3 Decisions

§3.1 Bug #9: capabilities correction

Decision: update the two fields in schema.rs::capabilities_snapshot() to true.

fn capabilities_snapshot() -> Capabilities {
    Capabilities {
        json_mode: true,
        ingest_progress: true,
        ingest_cancellation: true,
        rag_multi_turn: true,
        search_cache: true,
        incremental_ingest: true,
        streaming_ask: true,        // ← WAS FALSE, actual TRUE
        http_daemon: false,         // ← preserved (not-impl, separate sub-item)
        mcp_server: true,
        single_file_ingest: true,   // ← WAS FALSE, actual TRUE
        bulk_search: true,
    }
}

Rationale: the actual implementation supports production-grade streaming ask and single-file ingest. The schema report must match reality for agent routing to be accurate.

§3.2 Bug #10: config_not_found error

Decision: kebab-config defines its own error type ConfigNotFound, and kebab-app::error_wire adds a classify arm for it.

Pseudo-code:

// crates/kebab-config/src/lib.rs (or an appropriate error module)
use std::path::{Path, PathBuf};

#[derive(Debug, thiserror::Error)]
#[error("config file does not exist: {path}")]
pub struct ConfigNotFound {
    pub path: PathBuf,
}

// Inside Config::load:
pub fn load(opt_path: Option<&Path>) -> anyhow::Result<Config> {
    match opt_path {
        // Explicit --config path that is missing: fail fast instead of falling back.
        Some(p) if !p.exists() => Err(anyhow::Error::new(ConfigNotFound { path: p.to_path_buf() })),
        Some(p) => Self::from_file(p),
        // No --config given: the XDG default / built-in defaults still apply.
        None => Self::from_xdg_default_or_defaults(),
    }
}

Classify arm in kebab-app/src/error_wire.rs:

if let Some(e) = err.downcast_ref::<kebab_config::ConfigNotFound>() {
    return ErrorV1 {
        schema_version: ERROR_V1_ID.to_string(),
        code: "config_not_found".to_string(),
        message: format!("config file does not exist: {}", e.path.display()),
        details: json!({ "path": e.path }),
        hint: Some("verify --config argument; use --config to point to a writable toml file, or omit to use XDG default".to_string()),
    };
}

Exit code: 2 (config error, not 0 silent).
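
The fail-fast rule and its exit code can be sketched without the kebab crates. This is a std-only illustration under stated assumptions: `ConfigSource`, `resolve_config_source`, and `exit_code` are hypothetical names, and the real code uses thiserror and anyhow as shown above.

```rust
use std::fmt;
use std::path::{Path, PathBuf};

/// Std-only stand-in for the ConfigNotFound error type.
#[derive(Debug)]
pub struct ConfigNotFound {
    pub path: PathBuf,
}

impl fmt::Display for ConfigNotFound {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "config file does not exist: {}", self.path.display())
    }
}

impl std::error::Error for ConfigNotFound {}

/// Hypothetical result of config resolution: which source would be loaded.
#[derive(Debug, PartialEq, Eq)]
pub enum ConfigSource {
    Explicit(PathBuf),
    XdgDefault,
}

/// An explicit path must exist; only the absence of --config falls back.
pub fn resolve_config_source(opt_path: Option<&Path>) -> Result<ConfigSource, ConfigNotFound> {
    match opt_path {
        Some(p) if !p.exists() => Err(ConfigNotFound { path: p.to_path_buf() }),
        Some(p) => Ok(ConfigSource::Explicit(p.to_path_buf())),
        None => Ok(ConfigSource::XdgDefault),
    }
}

/// Map the outcome to the process exit code named in this section.
pub fn exit_code(result: &Result<ConfigSource, ConfigNotFound>) -> i32 {
    match result {
        Ok(_) => 0,
        Err(_) => 2,
    }
}
```

Keeping the existence check in the resolution step means every subcommand that accepts --config gets the same error, rather than each command noticing the problem later as an empty result.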

§3.3 Bug #11: OCR timeout 60s

Decision: reduce the value returned by default_pdf_ocr_request_timeout_secs() from 600 to 60.

fn default_pdf_ocr_request_timeout_secs() -> u64 {
    60  // 1 min, production-friendly per dogfood evidence
}

Add a doc-comment:

/// Default OCR request timeout in seconds. Most pages complete in 6-32s.
/// Set to upper-bound valid throughput; exceeding 60s may indicate
/// Ollama unavailability or very dense/high-res pages.
/// Override via [pdf.ocr] request_timeout_secs = N in config.toml.
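
As a rough worked example of the cost difference, the sketch below resolves the effective timeout and computes the worst-case time lost when pages stall. `effective_ocr_timeout`, `worst_case_stall`, and the `configured_secs` parameter are hypothetical names; `configured_secs` stands in for the parsed [pdf.ocr] request_timeout_secs value.

```rust
use std::time::Duration;

/// New default from the decision above (previously 600).
const DEFAULT_OCR_TIMEOUT_SECS: u64 = 60;

/// An explicit config value wins; otherwise the default applies.
pub fn effective_ocr_timeout(configured_secs: Option<u64>) -> Duration {
    Duration::from_secs(configured_secs.unwrap_or(DEFAULT_OCR_TIMEOUT_SECS))
}

/// Worst-case time lost when `stalled_pages` pages each run into the full timeout.
pub fn worst_case_stall(stalled_pages: u32, timeout: Duration) -> Duration {
    timeout * stalled_pages
}
```

With the two stalled pages from the dogfood run, the old 600s value costs 2 × 600s = 1200s (20 minutes), while the 60s default caps the same failure at 120s. A user who needs more time for dense pages still gets it through the override.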

§3.4 Bug #13: active_parsers + active_chunkers (additive)

Decision: additive minor change to the wire schema. Add two arrays to the Models struct and keep the existing single fields (backward compat). kebab-store-sqlite provides the fetch methods.

Store API (crates/kebab-store-sqlite/src/lib.rs):

impl SqliteStore {
    /// SELECT DISTINCT parser_version FROM documents WHERE parser_version IS NOT NULL ORDER BY parser_version
    pub fn fetch_distinct_parser_versions(&self) -> anyhow::Result<Vec<String>> {
        let conn = self.conn()?;
        let mut stmt = conn.prepare(
            "SELECT DISTINCT parser_version FROM documents
              WHERE parser_version IS NOT NULL
              ORDER BY parser_version"
        )?;
        let rows = stmt.query_map([], |row| row.get::<_, String>(0))?;
        let mut out = Vec::new();
        for r in rows { out.push(r?); }
        Ok(out)
    }
    
    pub fn fetch_distinct_chunker_versions(&self) -> anyhow::Result<Vec<String>> {
        let conn = self.conn()?;
        let mut stmt = conn.prepare(
            "SELECT DISTINCT chunker_version FROM chunks
              WHERE chunker_version IS NOT NULL
              ORDER BY chunker_version"
        )?;
        let rows = stmt.query_map([], |row| row.get::<_, String>(0))?;
        let mut out = Vec::new();
        for r in rows { out.push(r?); }
        Ok(out)
    }
}

Models struct (crates/kebab-app/src/schema.rs):

pub struct Models {
    /// Deprecated since v0.20.1. Use active_parsers for multi-parser corpus.
    /// Reports default parser version (markdown path).
    pub parser_version: String,
    
    /// Deprecated since v0.20.1. Use active_chunkers for multi-chunker corpus.
    pub chunker_version: String,
    
    /// All parser versions active in corpus (v0.20.1+). May be empty if corpus is empty.
    pub active_parsers: Vec<String>,
    
    /// All chunker versions active in corpus (v0.20.1+). May be empty if corpus is empty.
    pub active_chunkers: Vec<String>,
    
    pub embedding_version: String,
    pub prompt_template_version: String,
    pub index_version: String,
    pub corpus_revision: u64,
}

Computation (crates/kebab-app/src/schema.rs::collect_models):

let store = open_store_for_stats(cfg)?;
let active_parsers = store.fetch_distinct_parser_versions().unwrap_or_default();
let active_chunkers = store.fetch_distinct_chunker_versions().unwrap_or_default();

Ok(Models {
    // Single fields keep reporting the markdown defaults (backward compat, see §4.2).
    // Taking active_*.first() would change them on a mixed corpus, because the arrays are sorted.
    parser_version: kebab_parse_md::PARSER_VERSION.to_string(),
    chunker_version: kebab_chunk::md_heading_v1::VERSION_LABEL.to_string(),
    active_parsers,
    active_chunkers,
    ...
})

Fallback: the markdown fallback is kept. The existing hardcoded parser_version + chunker_version values are preserved (backward compat).
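
The intended shape of the result can be sketched in memory, without SQLite. The names `distinct_sorted`, `ModelsSketch`, and `build_models` are hypothetical, and the two constants assume that the markdown defaults equal the values shown in §1.4; the real code reads them from kebab_parse_md and kebab_chunk.

```rust
use std::collections::BTreeSet;

/// Assumed markdown defaults (the values currently reported, per §1.4).
const MD_PARSER_VERSION: &str = "md-frontmatter-v2";
const MD_CHUNKER_VERSION: &str = "md-heading-v1";

/// In-memory equivalent of
/// `SELECT DISTINCT col FROM t WHERE col IS NOT NULL ORDER BY col`.
pub fn distinct_sorted(rows: &[Option<&str>]) -> Vec<String> {
    rows.iter()
        .flatten() // drop NULL rows
        .copied()
        .collect::<BTreeSet<&str>>() // dedupe and sort bytewise
        .into_iter()
        .map(str::to_string)
        .collect()
}

/// Reduced version of the Models struct: only the four fields in question.
#[derive(Debug, PartialEq, Eq)]
pub struct ModelsSketch {
    pub parser_version: String,
    pub chunker_version: String,
    pub active_parsers: Vec<String>,
    pub active_chunkers: Vec<String>,
}

/// Single fields stay on the markdown defaults; the arrays reflect the corpus.
pub fn build_models(parser_rows: &[Option<&str>], chunker_rows: &[Option<&str>]) -> ModelsSketch {
    ModelsSketch {
        parser_version: MD_PARSER_VERSION.to_string(),
        chunker_version: MD_CHUNKER_VERSION.to_string(),
        active_parsers: distinct_sorted(parser_rows),
        active_chunkers: distinct_sorted(chunker_rows),
    }
}
```

An empty corpus yields empty arrays rather than null, and old clients that read only parser_version and chunker_version see the same values as before.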

§3.5 Bug #14: empty query validation

Decision: both the search and ask commands check for an empty query and emit error.v1.

Search command (crates/kebab-cli/src/main.rs::search arm):

if let Some(q) = query.as_ref() {
    if q.trim().is_empty() {
        return Err(anyhow::Error::new(kebab_app::StructuredError(ErrorV1 {
            schema_version: ERROR_V1_ID.to_string(),
            code: "invalid_input".to_string(),
            message: "query is empty; provide a non-empty search term or use --bulk".into(),
            details: Value::Null,
            hint: Some("e.g. `kebab search 'rust async'` or `kebab search --bulk < queries.ndjson`".into()),
        })));
    }
}

Ask command (crates/kebab-cli/src/main.rs::ask arm):

if query.trim().is_empty() {
    return Err(anyhow::Error::new(kebab_app::StructuredError(ErrorV1 {
        schema_version: ERROR_V1_ID.to_string(),
        code: "invalid_input".to_string(),
        message: "query is empty; provide a non-empty prompt".into(),
        details: Value::Null,
        hint: Some("e.g. `kebab ask 'explain this code'`".into()),
    })));
}

Both commands now validate; no silent fallback.
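
Since the two arms share the same rule, one option is a single helper that both call. The sketch below is std-only and hypothetical: `InvalidInput` and `validate_query` are illustrative names, and the real arms build an ErrorV1 as shown above.

```rust
/// Hypothetical stand-in for the error.v1 payload fields used here.
#[derive(Debug, PartialEq, Eq)]
pub struct InvalidInput {
    pub code: &'static str,
    pub message: String,
}

/// Reject empty and whitespace-only queries; pass everything else through unchanged.
pub fn validate_query(query: &str) -> Result<&str, InvalidInput> {
    if query.trim().is_empty() {
        Err(InvalidInput {
            code: "invalid_input",
            message: "query is empty".to_string(),
        })
    } else {
        // The original string is returned as-is; trimming is only used for the check.
        Ok(query)
    }
}
```

A shared helper keeps search and ask symmetric, so a later change to the rule (for example a length limit) cannot drift between the two commands.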


§4 Implementation specification

§4.1 Files to modify

  1. Bug #9 capability fix: crates/kebab-app/src/schema.rs

    • lines 137-151: capabilities_snapshot() — flip streaming_ask: false → true, single_file_ingest: false → true.
    • add test: capabilities_streaming_ask_matches_cli_surface().
    • add test: capabilities_single_file_ingest_matches_cli_surface().
  2. Bug #10 config_not_found: Two files

    • crates/kebab-config/src/lib.rs:
      • Define ConfigNotFound error struct (with #[derive(Debug, thiserror::Error)]).
      • Modify Config::load(opt_path: Option<&Path>) — path existence check, return Err(anyhow::Error::new(ConfigNotFound { ... })).
      • add test: config_load_explicit_nonexistent_path_returns_error().
    • crates/kebab-app/src/error_wire.rs:
      • Add classify arm after existing ConfigInvalid case.
      • Map kebab_config::ConfigNotFound → ErrorV1 { code: "config_not_found", ... }.
  3. Bug #13 schema.models: Three components

    • crates/kebab-store-sqlite/src/lib.rs:
      • Implement fetch_distinct_parser_versions() — SQL SELECT DISTINCT on documents.parser_version + ORDER BY.
      • Implement fetch_distinct_chunker_versions() — SQL SELECT DISTINCT on chunks.chunker_version + ORDER BY.
    • crates/kebab-app/src/schema.rs:
      • Modify Models struct — add active_parsers: Vec<String>, active_chunkers: Vec<String> fields.
      • Modify computation logic (collect_models or equiv) — call store methods, populate arrays, fallback to markdown defaults for single fields.
      • add test: schema_models_active_arrays_empty_on_empty_corpus().
      • add test: schema_models_active_arrays_populated_after_mixed_ingest().
    • docs/wire-schema/v1/schema.schema.json:
      • Models object — add "active_parsers": { "type": "array", "items": { "type": "string" } }.
      • add "active_chunkers": { "type": "array", "items": { "type": "string" } }.
      • Mark deprecated in comment: parser_version + chunker_version (additive, backward compat).
  4. Bug #14 empty query validation: crates/kebab-cli/src/main.rs

    • search command arm: add if query.trim().is_empty() check → error.v1 code=invalid_input.
    • ask command arm: add identical if query.trim().is_empty() check → error.v1 code=invalid_input.
  5. Wire schema v1 doc update: docs/wire-schema/v1/

    • Update schema doc to note active_parsers / active_chunkers optional (additive).
  6. Integration: integrations/claude-code/kebab/SKILL.md

    • Update schema.models surface docs — reference new active_* arrays for multi-version corpora.
  7. Tests (new or extended):

    • crates/kebab-cli/tests/: invalid --config path (absolute + relative) → error.v1 + exit≠0.
    • crates/kebab-cli/tests/: empty query (search + ask) → error.v1 code=invalid_input + exit≠0.
    • crates/kebab-config/tests/: config file not found → ConfigNotFound error.
    • crates/kebab-app/tests/: mixed corpus schema — active_parsers/chunkers include all ingested versions.

§4.2 Regression checks

  • Existing 1350 workspace tests: cargo test --workspace --no-fail-fast -j 1 must pass green.
  • All non-bug capabilities (json_mode, ingest_progress, ingest_cancellation, rag_multi_turn, search_cache, incremental_ingest, mcp_server, bulk_search) stay true.
  • Default config path resolution (no --config) unchanged — silent fallback to XDG only if --config not passed.
  • Relative path behavior (cwd-relative, Rust std path::Path::exists()) preserved.
  • Empty corpus → empty active_parsers / active_chunkers array (not null, not error).
  • Existing hardcoded parser_version + chunker_version fields continue to report markdown defaults (backward compat).
  • Schema version bump not required (wire schema additive minor, backward compat).

§5 Acceptance criteria

| # | Criterion | Evidence |
|---|---|---|
| AC-1 | kebab schema --json emits streaming_ask: true + single_file_ingest: true | cargo test -p kebab-app capabilities_ -j 4 green (the filter is a substring match, not a glob) |
| AC-2 | kebab search "x" --config /nonexistent.toml --json emits exit≠0 + error.v1 code=config_not_found | cargo test -p kebab-config config_load_explicit_nonexistent_path_returns_error -j 4 green |
| AC-3 | cargo test -p kebab-config pdf_ocr_request_timeout_default_is_60s -j 4 → green | unit test confirms default = 60s (no manual timing) |
| AC-4 | After mixed ingest (MD + PDF + code), kebab schema --json emits both active_parsers + active_chunkers arrays containing all versions | integration test pass |
| AC-5 | kebab search "" --json and kebab search " " --json both emit exit≠0 + error.v1 code=invalid_input | integration test pass |
| AC-6 | kebab ask "" --json emits exit≠0 + error.v1 code=invalid_input (ask symmetry) | integration test pass |
| AC-7 | kebab search "rust" --config nonexistent-relative.toml --json (relative path) emits exit≠0 + error.v1 code=config_not_found | integration test pass |
| AC-8 | All 1350+ workspace tests pass; no new failures | cargo test --workspace --no-fail-fast -j 1 exit=0 |
| AC-9 | Wire schema backward compat: old clients reading parser_version + chunker_version still work; active_* arrays optional per schema | JSON schema additionalProperties: false review |
| AC-10 | kebab ask --stream still works; streaming events emitted (no regression) | manual `kebab ask --stream "explain this" 2>&1 …` |

§6 Risks + resolutions

Risks

  • R-1 (Bug #10): Relative path ./config.toml must resolve from cwd, not from binary location. Resolution: Rust std::path::Path::exists() is cwd-relative; no workaround needed.
  • R-2 (Bug #13): Empty corpus → empty active_parsers / active_chunkers array. Resolution: Unit test schema_models_active_arrays_empty_on_empty_corpus() mandated (AC-4).
  • R-3 (resolved): collect_models uses no cache (every-call re-computation). active_parsers/chunkers reflect corpus state at invocation time. If future caching is added, corpus_revision increment signals invalidation — document at that time.
  • R-4 (Bug #14): ask command validation — covered by same fix (§3.5 mandates both search + ask).
  • R-5 (Bug #11): 60s may still timeout on very dense/high-res pages. Mitigation: User can override via config.toml [pdf.ocr] request_timeout_secs = N. Release notes explicitly call this out.
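
For R-1, the cwd-relative behavior can be made explicit with a small helper, which is also handy when an error message should show the absolute path that was checked. `absolute_from_cwd` is a hypothetical name; the spec itself relies on Path::exists() directly.

```rust
use std::env;
use std::io;
use std::path::{Path, PathBuf};

/// Make a possibly-relative path absolute by joining it onto the current
/// working directory, the same base the OS uses when `Path::exists()` is
/// called on a relative path.
pub fn absolute_from_cwd(p: &Path) -> io::Result<PathBuf> {
    if p.is_absolute() {
        Ok(p.to_path_buf())
    } else {
        Ok(env::current_dir()?.join(p))
    }
}
```

The binary's install location never enters the computation, which matches the resolution stated in R-1.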


§7 Parent spec deviation (HOTFIXES handoff)

F-11 MEDIUM finding: parent spec 2026-04-27-kebab-final-form-design.md (frozen) specifies PDF OCR request_timeout_secs = 600s (§1000 + §1628 OQ-1, rationale: "5x headroom over the 105s seen in a CPU environment"). Bug #11 (dogfood evidence) contradicts this: 600s turns a stalled page into a 10-minute wait, while 60s is production-optimal.

Deviation handling:

  1. Parent spec stays frozen (no edits).
  2. HOTFIXES entry (executor Step N): tasks/HOTFIXES.md receives dated entry:
    2026-05-27 — PDF OCR request_timeout_secs default 600s → 60s (v0.20.0 bugfix3 dogfood evidence). Bug #11.
    
  3. Parent spec cross-link (executor Step N): parent spec 2026-04-27-kebab-final-form-design.md receives inline comment at §1000 (default value code block) or §1628 (OQ-1 paragraph):
    <!-- HOTFIX 2026-05-27: default 60s (Bug #11). See tasks/HOTFIXES.md 2026-05-27 entry. -->
    

Parent spec invariant: No changes to parent spec text; only cross-link comment + HOTFIXES.md entry. Frozen design contract preserved.


§8 References

  • Dogfood report — 5 bugs discovered + decisions.
  • Parent spec (frozen contract) — §1, §2, §4 (capabilities, error handling, JSON schema, config XDG).
  • crates/kebab-app/src/schema.rs:137-151 (capabilities_snapshot).
  • crates/kebab-config/src/lib.rs (Config::load, default_pdf_ocr_request_timeout_secs).
  • crates/kebab-app/src/error_wire.rs (classify ConfigNotFound).
  • crates/kebab-store-sqlite/src/lib.rs (fetch_distinct_parser_versions, fetch_distinct_chunker_versions).
  • crates/kebab-cli/src/main.rs (search + ask query validation).
  • docs/wire-schema/v1/schema.schema.json (Models + Capabilities objects).
  • tasks/HOTFIXES.md (2026-05-27 entry, Bug #11 deviation record).