Deliverables of the 3-round dogfood-driven fix cycle:

- bugfix1 (Bug #2/#3/#4): spec 964 lines + plan 848 lines
- bugfix2 (Bug #6/#7, #8 falsified): spec 308 lines + plan 388 lines
- bugfix3 (Bug #9/#10/#11/#13/#14, #12 falsified): spec 410 lines + plan 1043 lines
- docs/DOGFOOD.md: the full comprehensive dogfood checklist (§0 environment through §13 reference corpus)

Each round's spec/plan was frozen after the critic + verifier round 2 closure ACCEPT. Based on dogfood-driven evidence.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---
title: "v0.20.0 sub-item 1 bugfix round 2 — plan"
created: 2026-05-27
status: "DRAFT round 0"
spec_path: docs/superpowers/specs/2026-05-27-v0.20-sub1-bugfix2-spec.md
spec_status: ACCEPT (round 1c, 308 line)
critic_round_1: .omc/reviews/2026-05-27-v0.20-bugfix2-spec-critic-r1-result.md
critic_round_2: .omc/reviews/2026-05-27-v0.20-bugfix2-spec-critic-r2-result.md
step_count: 4 (3 commit + 1 sanity-only)
commit_count: 3
branch: feat/pdf-scanned-ocr
head_at_draft: e674ff4
---

# v0.20.0 sub-item 1 bugfix round 2 — plan

## §0 Overview

Implementation map for the ACCEPTed spec (round 1c, all 9/9 critic findings incorporated). Fix scope = 2 bugs:

- **Bug #6** (Critical): the `?Identity-H Unimplemented?` mojibake marker bypasses the OCR fallback. Fix: marker strip + dominance heuristic in `crates/kebab-parse-pdf/src/text_quality.rs::compute_valid_char_ratio()`.
- **Bug #7** (Minor doc): `code` is missing from the `--media` value list in `kebab search --help`. Fix: update the clap doc-comment and SKILL.md in sync, plus a CLI help regression test.

Five files are touched in total, including one new test file and the HOTFIXES entry. branch = `feat/pdf-scanned-ocr` (HEAD = `e674ff4`; round 2's 3 commits are appended on top of round 1's 4 commits). env = `CARGO_TARGET_DIR=/build/out/cargo-target/target`. fresh release binary = `/build/out/cargo-target/target/release/kebab`.

**Non-scope (spec §2.2 + this plan's §5 OQ-4)**: outside the surface covered by the ACCEPTed spec, `crates/kebab-mcp/src/tools/search.rs:44`, `crates/kebab-core/src/search.rs:32+52`, `crates/kebab-app/src/ingest_progress.rs:69`, and `crates/kebab-cli/tests/wire_schema_breakdowns.rs:35` carry the same stale value list (`markdown, pdf, image, audio, other`, with `code` missing). They lie outside the spec's frozen grep boundary (`integrations/` + `crates/kebab-cli/src` + `docs/wire-schema/v1`), so they are not commit targets in this round; a follow-up issue is recommended.

---

## §1 Step table

| Step | Title | Scope summary | Commit subject | Files touched |
|------|-------|----------------|----------------|----------------|
| 1 | Bug #6 implementation | `MOJIBAKE_MARKERS` const + `compute_valid_char_ratio()` rewrite + 2 new unit tests + HOTFIXES entry | `fix(parse-pdf): strip Identity-H Unimplemented marker + dominance heuristic in compute_valid_char_ratio (Bug #6)` | `crates/kebab-parse-pdf/src/text_quality.rs`, `tasks/HOTFIXES.md` |
| 2 | Bug #7 doc-comment + SKILL.md | add `code` to the `--media` value list in the clap doc-comment + sync SKILL.md line 57 | `docs(cli): list 'code' in --media help string + SKILL.md (Bug #7)` | `crates/kebab-cli/src/main.rs`, `integrations/claude-code/kebab/SKILL.md` |
| 3 | Bug #7 CLI help assertion | `search_help_lists_code_in_media_values` test in the new test file `crates/kebab-cli/tests/cli_help_smoke.rs` | `test(cli): assert 'code' in search --help output (Bug #7 regression pin)` | `crates/kebab-cli/tests/cli_help_smoke.rs` (new) |
| 4 | Final sanity (no commit) | workspace test + workspace clippy + optional dogfood retest | none | none |

---

## §2 Per-step detail

### Step 1 — Bug #6 implementation

#### §2.1 Files affected

- `crates/kebab-parse-pdf/src/text_quality.rs` (currently 103 lines: lines 1-37 body, lines 39-103 tests).
- `tasks/HOTFIXES.md` (append a dated entry).

#### §2.2 Action

**§2.2.1**: **rewrite** `text_quality.rs` lines 1-18 (file header comment + `compute_valid_char_ratio` body) per the diff in spec §4.1. Additions:

- New const `MOJIBAKE_MARKERS: &[&str] = &["?Identity-H Unimplemented?"]` (placed at lines 8-12, including a comment that traces the marker to the lopdf 0.32.0 source).
- `compute_valid_char_ratio()` body in 4 stages: marker strip → trim-empty zero → dominance cap-0.3 → existing ratio calculation.
- `is_valid_text_char()` (lines 20-37): **no change** (signature + range list preserved).

**§2.2.2**: **append** 2 new tests to the `text_quality.rs::tests` module (lines 39-103):

```rust
#[test]
fn identity_h_marker_dominance_caps_ratio_below_threshold() {
    let s = format!("Page 1 of 5 {}", "?Identity-H Unimplemented?".repeat(20));
    let r = compute_valid_char_ratio(&s);
    assert!(r <= 0.3, "marker-dominant mixed page → ratio ≤ 0.3 (OCR fallback); got {r}");
}

#[test]
fn identity_h_marker_minority_with_long_valid_text_keeps_high_ratio() {
    let header = "x".repeat(200);
    let s = format!("{header} ?Identity-H Unimplemented?");
    let r = compute_valid_char_ratio(&s);
    assert!(r > 0.9, "marker-minority page keeps high ratio; got {r}");
}
```

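The four stages listed in §2.2.1, which the two tests above exercise, can be sketched roughly as follows. This is a minimal illustration and not the spec §4.1 diff: the ranges in `is_valid_text_char` are a simplified assumption (the real range list is not reproduced in this plan), and the dominance rule is read here as "more characters were stripped than remain".

```rust
/// Markers emitted in place of undecodable text (lopdf 0.32.0 per the plan).
const MOJIBAKE_MARKERS: &[&str] = &["?Identity-H Unimplemented?"];

/// Simplified stand-in for the real range list (assumption for this sketch).
fn is_valid_text_char(c: char) -> bool {
    matches!(
        c as u32,
        0x0020..=0x007E      // ASCII printable
            | 0x1100..=0x11FF // Hangul Jamo
            | 0x4E00..=0x9FFF // CJK Unified Ideographs
            | 0xAC00..=0xD7A3 // Hangul syllables
    )
}

pub fn compute_valid_char_ratio(text: &str) -> f64 {
    let total_before = text.chars().count();

    // Stage 1: strip every known marker.
    let mut stripped = text.to_owned();
    for marker in MOJIBAKE_MARKERS {
        stripped = stripped.replace(marker, "");
    }

    // Stage 2: nothing but markers and whitespace means no usable text.
    if stripped.trim().is_empty() {
        return 0.0;
    }

    // Stage 4 input: the ordinary ratio over what is left.
    let remaining = stripped.chars().count();
    let valid = stripped.chars().filter(|c| is_valid_text_char(*c)).count();
    let ratio = valid as f64 / remaining as f64;

    // Stage 3: if markers dominated the page, cap the ratio at 0.3.
    let removed = total_before - remaining;
    if removed > remaining {
        ratio.min(0.3)
    } else {
        ratio
    }
}
```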
**Important: spec §4.2 wording correction (critic r2 NEW-1)**: the phrase "Replace existing Bug #6 test set with two new tests" in spec §4.2 is stale wording. The current `text_quality.rs::tests` module has 8 tests, **0 of them related to the Identity-H marker**, so the net change is **+2 / -0**. The instruction in brief §2.1 to remove the existing test `identity_h_marker_mixed_with_some_real_text_low_ratio` is stale for the same reason: that test does not exist. The executor must **preserve** all 8 existing tests (`empty_string_zero`, `pure_ascii_one`, `pure_hangul_syllables_one`, `pure_pua_zero`, `mixed_half`, `cjk_ideograph_valid`, `hangul_jamo_valid`, `f4_fixture_ratio_under_threshold`).

**§2.2.3**: insert the following entry in `tasks/HOTFIXES.md`, above the latest dated section:

```markdown
## 2026-05-27 — Identity-H mojibake marker bypassed OCR fallback (Bug #6)

- **Symptom**: ingest of `metro-korea.pdf` (Identity-H CID font without ToUnicode CMap) finished with `pdf_ocr_pages=0`. The entire text was the `?Identity-H Unimplemented?` marker repeated 1154 times (emitted by lopdf 0.32.0). text-detect ratio = 1.0 → bypassed the OCR fallback threshold of 0.5.
- **Root cause**: `is_valid_text_char()`, as used by `crates/kebab-parse-pdf/src/text_quality.rs::compute_valid_char_ratio()`, treats the ASCII printable range (0x0020..=0x007E) as unconditionally valid. The marker (26 ASCII chars) is counted as valid.
- **Fix**: introduce the `MOJIBAKE_MARKERS` const, strip the markers, return 0.0 when the stripped text is empty after trimming, and apply a dominance heuristic (cap the ratio at 0.3 when the stripped portion exceeds the remainder). spec ACCEPT: `docs/superpowers/specs/2026-05-27-v0.20-sub1-bugfix2-spec.md` §4.1. No impact on parser_version or the wire schema.
- **User action**: if a mojibake-heavy PDF of the `metro-korea.pdf` class was already indexed with a v0.20.0 pre-bugfix2 binary, run `kebab ingest --force-reingest <workspace>` to invalidate the cached skip (the release notes carry equivalent guidance).
```

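To see why the marker slipped past the fallback, the pre-fix behaviour described under Root cause can be reproduced in a few lines. `naive_ratio`, `needs_ocr`, and `marker_only_page` are hypothetical names for this sketch, only the ASCII printable range is modelled, and the `<` comparison against 0.5 is an assumption about how the threshold is applied. Every character of the marker is ASCII printable, so the ratio for a marker-only page comes out at 1.0 and the OCR fallback is never triggered.

```rust
/// Hypothetical stand-in for the pre-fix check: every ASCII printable
/// character (0x0020..=0x007E) counts as valid text.
fn naive_ratio(text: &str) -> f64 {
    let total = text.chars().count();
    if total == 0 {
        return 0.0;
    }
    let valid = text
        .chars()
        .filter(|c| matches!(*c as u32, 0x0020..=0x007E))
        .count();
    valid as f64 / total as f64
}

/// Assumed threshold rule: pages scoring below 0.5 fall back to OCR.
fn needs_ocr(ratio: f64) -> bool {
    ratio < 0.5
}

/// Text shaped like the reported extraction: the marker repeated 1154 times.
fn marker_only_page() -> String {
    "?Identity-H Unimplemented?".repeat(1154)
}
```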
#### §2.3 Acceptance

Actionable verify commands (per step):

```bash
# A) text_quality: 2 new tests + 8 existing = 10, all green
CARGO_TARGET_DIR=/build/out/cargo-target/target cargo test -p kebab-parse-pdf text_quality -j 4 2>&1 | tail -10

# B) parse-pdf crate clean compile
CARGO_TARGET_DIR=/build/out/cargo-target/target cargo build -p kebab-parse-pdf -j 4 2>&1 | tail -3

# C) parse-pdf clippy clean (-D warnings)
CARGO_TARGET_DIR=/build/out/cargo-target/target cargo clippy -p kebab-parse-pdf --all-targets -j 4 -- -D warnings 2>&1 | tail -5
```

Expected: tail of A = `test result: ok. 10 passed; 0 failed`, B = `Finished`, C = 0 warnings.

#### §2.4 Commit

```bash
git add crates/kebab-parse-pdf/src/text_quality.rs tasks/HOTFIXES.md
git commit -m "$(cat <<'EOF'
fix(parse-pdf): strip Identity-H Unimplemented marker + dominance heuristic in compute_valid_char_ratio (Bug #6)

Why: ingest of metro-korea.pdf (Identity-H CID font without ToUnicode
CMap) wrongly finishes with pdf_ocr_pages=0. The 26 ASCII chars of the
`?Identity-H Unimplemented?` marker emitted by lopdf 0.32.0 pass the
0x0020..=0x007E range in is_valid_text_char() → ratio=1.0 → bypasses
the OCR fallback threshold of 0.5.

Change: MOJIBAKE_MARKERS const + 4-stage compute_valid_char_ratio()
(strip → trim-empty zero → dominance cap-0.3 → existing ratio). The
marker list is extensible. is_valid_text_char() itself is unchanged.

Tests: +2 unit (dominance + minority) on top of the existing 8. No
parser_version / wire schema change.

Refs: docs/superpowers/specs/2026-05-27-v0.20-sub1-bugfix2-spec.md
§4.1 / §4.2 / §6 R-1.
EOF
)"
```

---

### Step 2 — Bug #7 doc-comment + SKILL.md

#### §2.5 Files affected

- `crates/kebab-cli/src/main.rs` lines 158-160 (measured: the 3-line clap doc-comment on SearchArgs `media`).
- `integrations/claude-code/kebab/SKILL.md` line 57.

#### §2.6 Action

**§2.6.1**: edit the doc-comment at `crates/kebab-cli/src/main.rs` lines 158-160:

```diff
     /// p9-fb-36: filter by `assets.media_type` kind. Comma-separated.
-    /// Aliases: `md` → `markdown`. Other accepted: `markdown`, `pdf`,
-    /// `image`, `audio`, `other`. Unknown values match nothing.
+    /// Aliases: `md` → `markdown`. Other accepted: `markdown`, `pdf`,
+    /// `image`, `audio`, `code`, `other`. Unknown values match nothing.
```

(critic r2 NEW-2 correction: spec §4.3 shows this as 1 line, but the actual clap doc-comment spans 3 lines. Keep the file's multi-line layout as is and insert `code` between `audio` and `other` on line 160.)

**§2.6.2**: edit `integrations/claude-code/kebab/SKILL.md` line 57:

```diff
-`media` (string array — IN-list of `"markdown"` | `"pdf"` | `"image"` | `"audio"` | `"other"`; alias `"md"` → `"markdown"`)
+`media` (string array — IN-list of `"markdown"` | `"pdf"` | `"image"` | `"audio"` | `"code"` | `"other"`; alias `"md"` → `"markdown"`)
```

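The documented behaviour of `--media` (comma-separated, `md` aliased to `markdown`, unknown values matching nothing) can be illustrated with a small normalizer. `ACCEPTED_MEDIA`, `normalize_media_filter`, and `is_documented_kind` are hypothetical names for this sketch and are not the kebab implementation; keeping unknown values in the list is just one way to model "match nothing" for an IN-list filter.

```rust
/// Values the updated help text lists for `--media` (hypothetical constant).
const ACCEPTED_MEDIA: &[&str] = &["markdown", "pdf", "image", "audio", "code", "other"];

/// Split a comma-separated `--media` argument and apply the `md` alias.
/// Unknown values are kept as-is: in an IN-list filter they match nothing.
fn normalize_media_filter(arg: &str) -> Vec<String> {
    arg.split(',')
        .map(str::trim)
        .filter(|v| !v.is_empty())
        .map(|v| {
            if v == "md" {
                "markdown".to_string()
            } else {
                v.to_string()
            }
        })
        .collect()
}

/// True when a normalized value is one of the documented kinds.
fn is_documented_kind(value: &str) -> bool {
    ACCEPTED_MEDIA.contains(&value)
}
```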
#### §2.7 Acceptance

```bash
# A) cli crate clean compile (doc-comment edit; no compile impact expected)
CARGO_TARGET_DIR=/build/out/cargo-target/target cargo build -p kebab-cli -j 4 2>&1 | tail -3

# B) grep SKILL.md for the "code" substring
grep -nF '"code"' integrations/claude-code/kebab/SKILL.md

# C) search --help from the fresh binary exposes `code`
CARGO_TARGET_DIR=/build/out/cargo-target/target cargo build --release -p kebab-cli -j 4 2>&1 | tail -3
/build/out/cargo-target/target/release/kebab search --help 2>&1 | grep -F 'code'
```

Expected: A = `Finished`, B = 1 hit on line 57, C = 1+ lines containing `code`.

#### §2.8 Commit

```bash
git add crates/kebab-cli/src/main.rs integrations/claude-code/kebab/SKILL.md
git commit -m "$(cat <<'EOF'
docs(cli): list 'code' in --media help string + SKILL.md (Bug #7)

Why: kebab search --media code has been functionally supported since
v0.18.0 (handled first-class via a path outside MEDIA_KINDS;
schema.v1.media_breakdown.code exists). However, the value list in the
SearchArgs clap doc-comment and in SKILL.md line 57 is stale: `code`
is missing. A user who reads only --help could wrongly conclude that
code is unsupported.

Change: sync 2 surfaces: the multi-line clap doc-comment at main.rs
lines 158-160 + integrations/claude-code/kebab/SKILL.md line 57.
No Rust binary surface / wire schema change.

Out of scope (follow-up): crates/kebab-mcp/src/tools/search.rs:44,
crates/kebab-core/src/search.rs:32+52,
crates/kebab-app/src/ingest_progress.rs:69, and
crates/kebab-cli/tests/wire_schema_breakdowns.rs:35 carry the same
stale list. They are outside the grep boundary of the ACCEPTed spec
(round 1c), so they are not included in this round.

Refs: docs/superpowers/specs/2026-05-27-v0.20-sub1-bugfix2-spec.md
§4.3 / §4.3a.
EOF
)"
```

---

### Step 3 — Bug #7 CLI help assertion test

#### §2.9 Files affected

- `crates/kebab-cli/tests/cli_help_smoke.rs` (new; not present in the existing file list).

#### §2.10 Action

Create a new file. Follow the existing test conventions (`cli_*` prefix, `Command::new(env!("CARGO_BIN_EXE_kebab"))` pattern; see `cli_readonly_quiet.rs`, `cli_schema.rs`):

```rust
// crates/kebab-cli/tests/cli_help_smoke.rs
//
// Regression pin: the `--media` value list in `kebab search --help`
// exposes `code`. Bug #7 (v0.20.0 bugfix round 2 spec §4.4).

#[test]
fn search_help_lists_code_in_media_values() {
    let out = std::process::Command::new(env!("CARGO_BIN_EXE_kebab"))
        .args(["search", "--help"])
        .output()
        .expect("kebab search --help");
    let stdout = String::from_utf8_lossy(&out.stdout);
    assert!(
        stdout.contains("`code`"),
        "search --help must list 'code' as accepted --media value; stdout = {stdout}"
    );
}
```

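A bare substring match on `code` would also succeed on unrelated help text such as "exit code", which is why the test above looks for the backticked form. A slightly stricter variant could narrow the search to the `--media` entry first. `media_help_section` and `media_help_lists` are hypothetical helpers, and the assumption that an option's help runs from the line carrying its flag up to the next line that starts with a dash reflects typical clap output rather than a guaranteed format.

```rust
/// Return the part of the help text that belongs to the `--media` option:
/// from the flag line mentioning `--media` up to the next line that starts
/// (after indentation) with `-`, i.e. the next option.
fn media_help_section(help: &str) -> String {
    let mut section = Vec::new();
    let mut inside = false;
    for line in help.lines() {
        let trimmed = line.trim_start();
        if !inside {
            if trimmed.starts_with('-') && trimmed.contains("--media") {
                inside = true;
                section.push(line);
            }
        } else if trimmed.starts_with('-') {
            break;
        } else {
            section.push(line);
        }
    }
    section.join("\n")
}

/// True when the `--media` help entry lists `value` in backticks.
fn media_help_lists(help: &str, value: &str) -> bool {
    media_help_section(help).contains(&format!("`{value}`"))
}
```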
#### §2.11 Acceptance

```bash
# A) build + run the new test target
CARGO_TARGET_DIR=/build/out/cargo-target/target cargo test -p kebab-cli --test cli_help_smoke -j 4 2>&1 | tail -10

# B) cli crate tests target clean compile (all)
CARGO_TARGET_DIR=/build/out/cargo-target/target cargo build -p kebab-cli --tests -j 4 2>&1 | tail -3

# C) cli clippy clean (-D warnings), including the new test file
CARGO_TARGET_DIR=/build/out/cargo-target/target cargo clippy -p kebab-cli --all-targets -j 4 -- -D warnings 2>&1 | tail -5
```

Expected: A = `test result: ok. 1 passed; 0 failed`, B = `Finished`, C = 0 warnings.

#### §2.12 Commit

```bash
git add crates/kebab-cli/tests/cli_help_smoke.rs
git commit -m "$(cat <<'EOF'
test(cli): assert 'code' in search --help output (Bug #7 regression pin)

Why: the Step 2 doc-comment edit risks disappearing silently if someone
later reorders the value list or splits it into an alias section.
Because clap renders --help from the doc-comment's free-form text, a
grep-only smoke test is the only means of detection.

Change: new test file (follows the kebab-cli `cli_*` prefix
convention). Runs the fresh binary via CARGO_BIN_EXE_kebab and asserts
the `code` substring in stdout. Maps 1:1 to the spec §4.4 acceptance
row.

Refs: docs/superpowers/specs/2026-05-27-v0.20-sub1-bugfix2-spec.md §4.4
/ §5 (acceptance row 4).
EOF
)"
```

---

### Step 4 — Final sanity (no commit)

#### §2.13 Scope

After the 3 commits are appended, verify the whole workspace and optionally rerun the dogfood check. **No commit is produced** (0 code changes; verification only).

#### §2.14 Acceptance

```bash
# A) full workspace test: existing 1316 + this round's +2 unit + +1 cli = 1319 expected
CARGO_TARGET_DIR=/build/out/cargo-target/target cargo test --workspace --no-fail-fast -j 1 2>&1 | tee /tmp/v0.20-bugfix2-test.log | tail -15
echo "exit = ${PIPESTATUS[0]:-$?}"

# B) workspace clippy clean (-D warnings)
CARGO_TARGET_DIR=/build/out/cargo-target/target cargo clippy --workspace --all-targets -j 4 -- -D warnings 2>&1 | tail -8

# C) (optional) dogfood retest: metro-korea.pdf
#    The fresh binary build was already completed in the Step 2 acceptance.
#    After --force-reingest, observe pdf_ocr_pages change from 0 to 21+.
#    OCR latency costs ≈ 10 min, so the plan drafter marks this optional for the executor.
#    Can be skipped if the measured corpus is user-private (KEBAB_WORKSPACE or ~/Documents/test/).
```

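`cargo test --workspace` prints one `test result:` line per test binary, so the total N for command A is a sum over the saved log rather than a single line in the `tail` output. A minimal sketch of that tally, assuming the usual `test result: ok. N passed; M failed; ...` line shape; `sum_test_results` is a hypothetical helper, not part of the repository:

```rust
/// Sum the `passed` and `failed` counts over every `test result:` line
/// in a `cargo test` log. Lines that do not match are ignored.
fn sum_test_results(log: &str) -> (u64, u64) {
    let mut passed = 0;
    let mut failed = 0;
    for line in log.lines() {
        let Some(rest) = line.trim_start().strip_prefix("test result:") else {
            continue;
        };
        for part in rest.split(';') {
            // Each relevant segment ends with "<count> <label>",
            // e.g. "ok. 10 passed" or "0 failed".
            let mut words = part.split_whitespace().rev();
            let (Some(label), Some(count)) = (words.next(), words.next()) else {
                continue;
            };
            let Ok(n) = count.parse::<u64>() else {
                continue;
            };
            match label {
                "passed" => passed += n,
                "failed" => failed += n,
                _ => {}
            }
        }
    }
    (passed, failed)
}
```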
Expected: A = `test result: ok. <N> passed; 0 failed` (N summed over all test binaries ≥ 1319), B = 0 warnings, C = user's choice (evaluated in verifier round 0).

#### §2.15 Commit

None (sanity-only). Once the sanity checks are green, the executor proceeds to the PR push stage.

---

## §3 Verifier checklist (cumulative)

Maps 1:1 to the 7 acceptance criteria rows of spec §5. Actionable commands for verifier round 0:

| # | Spec §5 criterion | Verifier command | Step coverage | Pass condition |
|---|-------------------|------------------|---------------|----------------|
| 1 | `identity_h_marker_dominance_caps_ratio_below_threshold` green | `cargo test -p kebab-parse-pdf identity_h_marker_dominance_caps_ratio_below_threshold -j 4 2>&1 \| tail -3` | Step 1 | `1 passed; 0 failed` |
| 2 | `identity_h_marker_minority_with_long_valid_text_keeps_high_ratio` green | `cargo test -p kebab-parse-pdf identity_h_marker_minority_with_long_valid_text_keeps_high_ratio -j 4 2>&1 \| tail -3` | Step 1 | `1 passed; 0 failed` |
| 3 | 8 existing text_quality tests green (0 regressions) | `cargo test -p kebab-parse-pdf text_quality -j 4 2>&1 \| tail -5` | Step 1 | `10 passed; 0 failed` (8 existing + 2 new) |
| 4 | `search_help_lists_code_in_media_values` green | `cargo test -p kebab-cli --test cli_help_smoke -j 4 2>&1 \| tail -3` | Step 3 | `1 passed; 0 failed` |
| 5 | `"code"` substring present in SKILL.md | `grep -nF '"code"' integrations/claude-code/kebab/SKILL.md` | Step 2 | 1 hit on line 57 |
| 6 | full workspace test green | `cargo test --workspace --no-fail-fast -j 1 2>&1 \| tail -10` | Step 4 | `0 failed`, N ≥ 1319 |
| 7 | workspace clippy clean (-D warnings) | `cargo clippy --workspace --all-targets -j 4 -- -D warnings 2>&1 \| tail -5` | Step 4 | 0 warnings |
| 8 (optional) | dogfood retest: `pdf_ocr_pages` for metro-korea.pdf goes 0 → 21+ | manual: run `kebab ingest --force-reingest <ws>`, then inspect `items[].pdf_ocr_pages` in ingest_report.v1 | Step 4 | `pdf_ocr_pages > 0` for metro-korea.pdf row |

The executor passes the PR push gate when rows 1-7 are all green. Row 8 is optional for verifier round 0 (subject to user corpus availability and the 10 min cost).

---

## §4 Risks resolution (plan-level view of spec §6)

| ID | Spec §6 status | Plan-level action |
|----|----------------|--------------------|
| R-1 | resolved per critic r1 (lopdf 0.32.0 = marker 1 entry) | The source comment from this plan's §2.2.1 serves as the re-verify trigger on a lopdf upgrade. |
| R-2 | resolved (covered by `trim().is_empty()`) | Stage 2 of the 4 stages in §2.2.1 (Step 1 implementation) = trim-empty zero. |
| R-3 | resolved (0 wire schema changes) | parser_version `"pdf-text-v1"` / chunker_version `"pdf-page-v1.1"` preserved. No version cascade impact (CLAUDE.md §Versioning cascade). |
| R-4 | resolved per critic r1 (grep boundary = `integrations/` + `crates/kebab-cli/src` + `docs/wire-schema/v1`) | Step 2 covers both in-scope surfaces of the spec. **Additional finding (out of scope)** → §5 OQ-4. |
| R-5 | resolved (no impact, via alias normalize at `bulk.rs:161`) | This plan makes 0 behavior changes. |

Additional risk identified by the plan drafter:

- **R-6 (NEW)**: the optional dogfood retest in Step 4 depends on `KEBAB_WORKSPACE` or a user-private corpus. It cannot be verified in CI; verifier round 0 is advised to mark row 8 as skipped explicitly when evidence is absent.

---

## §5 Open questions for executor

The ACCEPTed spec is clear, so OQ-1/2/3 are all resolved. Additional items identified by the plan drafter:

- **OQ-4 (NEW)**: per the R-4 grep results in spec §2.2, surfaces outside the frozen boundary carry the same stale list:
  - `crates/kebab-mcp/src/tools/search.rs:44`: the MCP tool's `--media` doc.
  - `crates/kebab-core/src/search.rs:32`: `MEDIA_KINDS` const = `&["markdown", "pdf", "image", "audio", "other"]`. Caution: this const may be functional. `code` has been handled first-class via a separate path since v0.18.0 (existence of `schema.v1.media_breakdown.code` confirmed per spec §1.2). Modifying the const itself carries behavior change risk → split into a separate spec.
  - `crates/kebab-core/src/search.rs:52`: `MediaFilter::media` doc-comment.
  - `crates/kebab-app/src/ingest_progress.rs:69`: progress label doc-comment.
  - `crates/kebab-cli/tests/wire_schema_breakdowns.rs:35`: test fixture array (functional; changing it affects the meaning of the test).

  **executor action**: not included in this round. It is recommended to add one line, "follow-up: open issue for stale --media value list in 5 additional surfaces", to the PR description or the Step 2 commit body.

- **OQ-5 (NEW)**: UX consequence from spec §6: the `--force-reingest` recommendation for pre-bugfix2 v0.20.0 users has to go into the release notes, which is work for a separate phase (at PR review/merge time). The HOTFIXES entry in Step 1 §2.2.3 of this plan is part of the user-facing surface; the release notes can quote its User action item.

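
For the follow-up issue in OQ-4, a line-level check along these lines could flag remaining stale lists. It is only a heuristic sketch: `is_stale_media_list` and `stale_lines` are hypothetical names, and treating any line that mentions both `audio` and `other` without `code` as stale is an assumption that fits the quoted list (`markdown, pdf, image, audio, other`) but would miss lists whose values are split so that `audio` and `other` land on different lines.

```rust
/// Heuristic: a line looks like a stale `--media` value list when it names
/// both `audio` and `other` as whole words but not `code`.
fn is_stale_media_list(line: &str) -> bool {
    let has = |word: &str| {
        line.split(|c: char| !c.is_ascii_alphanumeric())
            .any(|token| token == word)
    };
    has("audio") && has("other") && !has("code")
}

/// Return the 1-based line numbers in `source` that look stale.
fn stale_lines(source: &str) -> Vec<usize> {
    source
        .lines()
        .enumerate()
        .filter(|(_, line)| is_stale_media_list(line))
        .map(|(i, _)| i + 1)
        .collect()
}
```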
---

## §6 References

- **Spec ACCEPT (parent contract)**: `docs/superpowers/specs/2026-05-27-v0.20-sub1-bugfix2-spec.md` (308 lines, round 1c).
- **Critic round 1**: `.omc/reviews/2026-05-27-v0.20-bugfix2-spec-critic-r1-result.md` (H-1 + M-1/M-2/M-3 + L-1/L-2 + NIT-1/NIT-2 + invariant audit; all 9 findings incorporated into the spec).
- **Critic round 2**: `.omc/reviews/2026-05-27-v0.20-bugfix2-spec-critic-r2-result.md` (NEW-1 = §4.2 stale arithmetic, NEW-2 = §4.3 scope description drift; corrections incorporated in this plan's §2.2.2 + §2.6.1).
- **Plan drafter brief**: `.omc/reviews/2026-05-27-v0.20-bugfix2-plan-drafter-brief.md`.
- **Parent design**: `docs/superpowers/specs/2026-05-27-pdf-scanned-ocr-spec.md` §1.3 (text-detect threshold metric), §9 (version cascade).
- **Round 1 history**: branch `feat/pdf-scanned-ocr` HEAD = `e674ff4`, 4 commits stacked (Bug #2 source-fs, Bug #3 chunk_id collision, Bug #3 test, Bug #4 pikepdf F4 fixture).
- **Code locations (measured line numbers)**:
  - `crates/kebab-parse-pdf/src/text_quality.rs:1-103` (entire file).
  - `crates/kebab-cli/src/main.rs:158-160` (SearchArgs `media` clap doc-comment, 3-line multi-line attribute).
  - `integrations/claude-code/kebab/SKILL.md:57` (search input filter description).
  - `crates/kebab-cli/tests/cli_help_smoke.rs` (new, Step 3).
- **External source**: `lopdf-0.32.0/src/document.rs:523` (`Document::decode_text`, the sole emitter of `?Identity-H Unimplemented?`).

---

## §7 Constraints (spec §9 + brief §9)

1. **0 branch changes**: the plan itself is documentation only. This file = the plan deliverable.
2. **spec ACCEPT frozen**: no edits to the round 1c body. The wording corrections in this plan's §2.2 / §2.6 (`Replace existing` → `+2 / -0 additive`) are recorded as local notes in the plan; the spec body is unchanged.
3. **0 regressions**: workspace test N ≥ 1319.
4. **0 wire schema / version cascade changes**: `parser_version="pdf-text-v1"` and `chunker_version="pdf-page-v1.1"` preserved.
5. **subagent skip**: the executor runs in-session on a single thread (worker protocol per task assignment).
6. **lightweight scope**: line target for this plan = 200-400 (less than 1/3 of the round 1 plan's 849 lines).

**Status**: DRAFT round 0, awaiting verifier review.