kebab/docs/superpowers/plans/2026-05-27-v0.20-sub1-bugfix-plan.md
altair823 46e99470eb docs(superpowers): v0.20 sub-item 1 bugfix1/2/3 specs + plans + DOGFOOD.md
Deliverables of the 3-round dogfood-driven fix cycle:

- bugfix1 (Bug #2/#3/#4): spec 964 lines + plan 848 lines
- bugfix2 (Bug #6/#7, #8 falsified): spec 308 lines + plan 388 lines
- bugfix3 (Bug #9/#10/#11/#13/#14, #12 falsified): spec 410 lines + plan 1043 lines
- docs/DOGFOOD.md: the full comprehensive dogfood checklist (§0 environment through §13 reference corpus)

Each round's spec/plan was frozen after the critic + verifier round 2 closure ACCEPT. Based on dogfood-driven evidence.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-28 01:21:34 +00:00

---
title: v0.20.0 sub-item 1 bugfix — implementation plan
created: 2026-05-27
status: ACCEPT (round 2 closure — Phase B complete)
target_version: 0.20.0 (PR #189 force-update)
spec: docs/superpowers/specs/2026-05-27-v0.20-sub1-bugfix-spec.md
contract_sections: ["§9 (chunker_version cascade)"]
parent_plan: docs/superpowers/plans/2026-05-27-pdf-scanned-ocr-plan.md
review_history:
- 2026-05-27 plan round 0 (opus, drafter) — 5 step group A-E, 18 sub-action
- 2026-05-27 plan round 1 critic (opus, thorough) — NEEDS_DISCUSSION, HIGH 1 + MEDIUM 2 + LOW 3 + NIT 1 (7 finding)
- 2026-05-27 plan round 1 verifier (opus, thorough) — NEEDS_DISCUSSION, HIGH 4 + MEDIUM 3 + LOW 4 + NIT 3 (14 finding)
- 2026-05-27 plan round 1c rewrite (opus, drafter) — all 21 findings applied (critic 7 + verifier 14). detail = §8 round 1c rewrite changelog
- 2026-05-27 plan round 2 closure critic (opus) — ACCEPT, 21/21 applied + 4 NIT cosmetic
---
# v0.20.0 sub-item 1 bugfix plan
> The step decomposition of the ACCEPTed spec (`docs/superpowers/specs/2026-05-27-v0.20-sub1-bugfix-spec.md`, 965 lines, round 2 closure). It covers the force-update path for 3 bugs (#2 walker code limit / #3 chunk_id collision, Critical / #4 F4 fixture Pages tree): fix commits are stacked on the PR #189 base branch `feat/pdf-scanned-ocr`. **5 steps (Groups A-E), 18 sub-actions, 4 commits + 1 verify-only step (= 5 steps total, 4 commit boundaries).** The 16-row acceptance table in spec §6 maps 1:1 onto the §4 verifier checklist of this plan.
## §0 Pre-flight + branch state
- **Branch**: `feat/pdf-scanned-ocr` (PR #189 base, HEAD = `b4d9e60` "chore(release): bump version 0.19.0 → 0.20.0"). Per the user memory `feedback_pr_workflow`, this is a force-update path: add fix commits on the same branch, then force-push PR #189.
- **Working dir**: `/home/altair823/kebab`.
- **Required env** (`~/.claude/CLAUDE.md`, "Disk Layout: protecting the root disk is the top priority"):
- `export CARGO_TARGET_DIR=/build/out/cargo-target/target`: isolates builds on the dedicated 4 TB XFS disk and prevents a `target/` directory from being created at the repo root.
- `export RELEASE_BIN="${CARGO_TARGET_DIR:-target}/release/kebab"`: release binary alias (used by every acceptance command in the Step 5 dogfood run).
- `export TMPDIR=/build/cache/tmp`: keeps large temporary files off the root disk.
- **Serialized cargo builds** (memory `feedback_serial_build_only`):
- per-crate: `-j 4` by default (e.g. `cargo test -p kebab-chunk -j 4`).
- workspace: `-j 1` mandatory (`cargo test --workspace -j 1`, `cargo clippy --workspace -j 1`; linking the 18 integration-test binaries concurrently causes OOM).
- **`target/` clean policy** (memory `feedback_cargo_clean_policy`): because `/build` is a separate 4 TB XFS disk, routine cleaning is prohibited. Clean only when `df -h /build` reports `Avail < 500G` OR `du -sh $CARGO_TARGET_DIR` exceeds 200G. Run this conditional check once, immediately before the first cargo invocation in Step 5 E1; if neither threshold is hit, skip the clean and record one line in the commit body: "skipped cargo clean — /build avail X TB".
- **Assumed dogfood KB layout** (Step 5 E3 prerequisite, critic round 1 H-1 closure): the canonical config path is `/build/cache/tmp/v0.20-dogfood/config.toml` (in-place, inside the KB dir). The external backup file `/build/cache/tmp/v0.20-dogfood-config.toml` **does not exist**; every acceptance command in this plan assumes the in-place config. The KB clean in Step 5 E3 **must not use a destructive `rm -rf`**; use a selective clean that preserves the config (see the E3 detail). The canonical dogfood config path is defined only here in §0, and the Step 5 E3 commands reference this path.
- **Impact on HOTFIXES.md / README / HANDOFF / ARCHITECTURE**: Step 2 B4 adds a HOTFIXES.md entry (Bug #3 second-iteration patch). No other user-visible surface changes, so README, HANDOFF, and ARCHITECTURE need no updates (no CLI flag, wire schema, TUI key, or config additions; the chunker_version bump is an internal cascade and appears only in the release notes).
- **No wire schema changes**: no fields are added to `ingest_progress.v1` or `ingest_report.v1`. No V00X migration. The `chunks` table DDL is unchanged.
- **No frozen design contract changes**: the design §9 cascade rule itself is unchanged (only chunker_version is bumped, as a direct application of the rule).
- **No workspace version bump**: v0.20.0 is already cut (commit `b4d9e60`). This plan is a cumulative bugfix within the same v0.20.0 (PR #189 force-update). Step 5 E5 only force-pushes the PR; the release tag is not re-cut.
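The `target/` clean policy above reduces to a two-threshold predicate. The following is a minimal sketch of that rule, assuming the two sizes have already been measured (for example from `df` and `du`) and converted to GiB; the function names `should_clean` and `skip_note` are hypothetical and not part of the plan.

```rust
/// Thresholds taken from the §0 clean policy.
const MIN_AVAIL_GIB: u64 = 500;
const MAX_TARGET_GIB: u64 = 200;

/// Returns true only when one of the two thresholds is crossed:
/// available space below 500G, or the cargo target dir above 200G.
pub fn should_clean(avail_gib: u64, target_dir_gib: u64) -> bool {
    avail_gib < MIN_AVAIL_GIB || target_dir_gib > MAX_TARGET_GIB
}

/// Builds the one-line commit-body note for the skip case, or `None`
/// when a clean is actually required. (Hypothetical helper.)
pub fn skip_note(avail_gib: u64, target_dir_gib: u64) -> Option<String> {
    if should_clean(avail_gib, target_dir_gib) {
        None
    } else {
        let avail_tb = avail_gib as f64 / 1024.0;
        Some(format!("skipped cargo clean — /build avail {avail_tb:.1} TB"))
    }
}
```

Keeping the decision in one pure function makes the "check once before the first cargo invocation" rule easy to audit.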
## §1 Plan overview + spec linkage
The fix designs in spec §3 (Bug #2), §4 (Bug #3), and §5 (Bug #4) are decomposed into atomic steps. Core sequencing:
1. **Bug #2 walker code limit fix** (Step 1): `is_code_file` helper + walker conditional + unit test. Applies the diffs from spec §3.4 and §3.5 as-is. 1 commit.
2. **Bug #3 chunk_id fix + chunker_version bump** (Step 2): extend the `chunk_page` return tuple to a 4-tuple + change the caller's `per_chunk_hash` suffix to `segment_start` + bump `VERSION_LABEL` `"pdf-page-v1"` → `"pdf-page-v1.1"` + update the module doc + HOTFIXES.md entry + unit regression test. Applies the diffs from spec §4.4, §4.4.1, and §4.5 as-is. 1 commit.
3. **Bug #3 integration test** (Step 3): a multi-scanned-PDF chunk_id collision-free integration test under `crates/kebab-app/tests/`. The executor's first sub-action is to settle the MockOcrEngine pre-condition from spec §4.5.1 (Option A share or Option B inline). 1 commit.
4. **Bug #4 F4 fixture re-generation** (Step 4): pikepdf-based rewrite of `tests/fixtures/_synth/mojibake.py` + regenerate the F4 fixture binary + 3 new invariant tests in parse-pdf. Applies the diffs from spec §5.4, §5.5, and §5.6 as-is. 1 commit.
5. **Workspace verify + commit + PR force-push** (Step 5): cargo workspace test with `-j 1` + clippy `-D warnings` + dogfood re-run (isolated KB at `/build/cache/tmp/v0.20-dogfood`, qwen2.5vl:3b on the Ollama endpoint `192.168.0.47:11434`) + PR #189 force-push. The 16-row consolidated acceptance table in spec §6 is the verifier checklist for this step.
Ordering invariants:
- **Step 1 || Step 2 || Step 4 are mutually independent**: the three bug fixes touch file paths in different crates (`kebab-source-fs` / `kebab-chunk` / `tests/fixtures` + `kebab-parse-pdf`), so they could proceed in parallel. For consistency they are nevertheless run sequentially: Step 1 → Step 2 → Step 4.
- **Step 2 < Step 3**: the integration test depends on the fixed chunk_id computation path in `kebab-chunk`. A GREEN Step 2 is the prerequisite.
- **Step 4 < Step 5 dogfood**: the binary produced by the F4 fixture regeneration is one of the 9 dogfood PDFs (mojibake), so it is a prerequisite for verifying the `block_count: 1` invariant in the Step 5 E3 dogfood run.
- **Steps 1-4 all < Step 5 workspace test**: the workspace-wide test is meaningful only against the final state of production code + tests.
Commits are made one per logical group (atomic), following the 5-commit table in the §7 sequencing summary. Per the user memory `feedback_pr_workflow` (gitea-pr + review loop), the force-update is followed by the review loop of the `gitea-pr-review` skill.
---
## §2 Step group structure (Group A-E)
| Step | Group | Category | sub-action |
|---:|---|---|---|
| 1 | A | Bug #2 walker code limit fix | A1 `is_code_file` helper + A2 walker conditional + A3 unit test |
| 2 | B | Bug #3 chunk_id collision fix + chunker_version bump | B1 `chunk_page` 4-tuple + B2 caller `per_chunk_hash` + B3 `VERSION_LABEL` bump + B4 module doc + HOTFIXES.md + B5 unit regression test |
| 3 | C | Bug #3 multi-scanned PDF integration test | C1 MockOcrEngine share decision + C2 integration test (conditional) |
| 4 | D | Bug #4 F4 fixture re-generation | D1 mojibake.py pikepdf rewrite + D2 fixture regenerate + commit + D3 parse-pdf 3 invariant test |
| 5 | E | Workspace verify + commit + PR force-push | E1 cargo workspace test -j 1 + E2 clippy -D warnings + E3 dogfood re-run + E4 commit + E5 PR #189 force-push |
---
## §3 Per-step detail
### Step 1 (Group A): Bug #2 walker code limit fix
Option A from spec §3 (code path only): gate the `is_oversized` call behind an `is_code_file(path)` conditional. PDF/image/markdown sizes are validated by the parser stage itself (lopdf `load_mem` handles 256 KB+ normally; image OCR self-caps via `max_pixels`).
#### Sub-action A1 — add the `is_code_file` helper
- **Files affected**: `crates/kebab-source-fs/src/code_meta.rs` (immediately after the `is_oversized` function at line 129, or immediately after the definition of `code_lang_for_path`).
- **Action** (spec §3.4 diff as-is):
```rust
/// Returns true when `path`'s filename/extension is recognised as a code
/// file (per `code_lang_for_path`). Used by the walker to apply
/// `[ingest.code].max_file_bytes` / `max_file_lines` only to code files,
/// not to PDF/image/markdown (which have their own size controls in
/// their respective parsers).
pub(crate) fn is_code_file(path: &Path) -> bool {
code_lang_for_path(path).is_some()
}
```
- **Acceptance**:
- `grep -c "pub(crate) fn is_code_file" crates/kebab-source-fs/src/code_meta.rs` = **1**.
- `cargo build -p kebab-source-fs -j 4` green.
#### Sub-action A2 — walker conditional size check
- **Files affected**: `crates/kebab-source-fs/src/connector.rs:168-190` (currently verified line range).
- **Action** (spec §3.4 diff as-is; `is_code_file` short-circuits ahead of the `is_oversized` call):
```diff
- // Size-cap check (byte or line limit).
- if crate::code_meta::is_oversized(
- &abs_path,
- self.max_file_bytes,
- self.max_file_lines,
- )
- .unwrap_or(false)
+ // v0.20.0 sub-item 1 bugfix (#2): size-cap applies ONLY to
+ // code files. PDF/image/markdown bypass — their parsers
+ // have their own size controls. spec §3.3.
+ if crate::code_meta::is_code_file(&abs_path)
+ && crate::code_meta::is_oversized(
+ &abs_path,
+ self.max_file_bytes,
+ self.max_file_lines,
+ )
+ .unwrap_or(false)
{
fs_skips.skipped_size_exceeded =
fs_skips.skipped_size_exceeded.saturating_add(1);
...
tracing::debug!(
path = %rel_path.display(),
max_bytes = self.max_file_bytes,
max_lines = self.max_file_lines,
- "skip: file exceeds size cap"
+ "skip: code file exceeds size cap"
);
continue;
}
```
- **Acceptance**:
- `grep -nE "is_code_file\(&abs_path\)\s*$" crates/kebab-source-fs/src/connector.rs` ≥ **1**.
- `grep -c "skip: code file exceeds size cap" crates/kebab-source-fs/src/connector.rs` ≥ **1**.
- `cargo build -p kebab-source-fs -j 4` green.
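The effect of the A2 gate can be illustrated in isolation. This is a minimal sketch, not the crate's code: `code_lang_for_path` here is a hypothetical stand-in with a three-extension table, `is_oversized` is simplified to a byte comparison on a known length (the real one inspects the file, also checks line count, and returns a `Result`), and `skip_for_size` is an illustrative name.

```rust
use std::path::Path;

/// Hypothetical stand-in for the crate's `code_lang_for_path`;
/// the real extension table is not shown in this plan.
fn code_lang_for_path(path: &Path) -> Option<&'static str> {
    match path.extension()?.to_str()? {
        "rs" => Some("rust"),
        "py" => Some("python"),
        "go" => Some("go"),
        _ => None,
    }
}

fn is_code_file(path: &Path) -> bool {
    code_lang_for_path(path).is_some()
}

/// Simplified size check on an already-known byte length.
fn is_oversized(len_bytes: u64, max_file_bytes: u64) -> bool {
    len_bytes > max_file_bytes
}

/// Walker-side gate: the size cap is consulted only for code files,
/// so `&&` short-circuits for PDF/image/markdown paths.
pub fn skip_for_size(path: &Path, len_bytes: u64, max_file_bytes: u64) -> bool {
    is_code_file(path) && is_oversized(len_bytes, max_file_bytes)
}
```

With a 262_144-byte cap, a 300 KB `paper.pdf` or `notes.md` passes the walker while a 300 KB `big.rs` is skipped, which is the shape the A3 unit test asserts.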
#### Sub-action A3 — add the Bug #2 unit test
- **Files affected**: the existing `#[cfg(test)] mod tests` in `crates/kebab-source-fs/src/connector.rs` (spec §3.5 states "add to the existing test module"; not a new file).
- **Action** (the `size_cap_skips_only_code_files` test body from spec §3.5, as-is):
- Synthesize a tempdir with 3 files: a 300 KB PDF, a 300 KB markdown file, and a 300 KB `big.rs`.
- Call `scan_with_skips(&SourceScope::default())` on an `FsSourceConnector` (`max_file_bytes = 262_144`, `max_file_lines = 5_000`).
- assertions:
- `paths.contains("paper.pdf")` (PDF walker pass).
- `paths.contains("notes.md")` (Markdown walker pass).
- `!paths.contains("big.rs")` (code file walker skip).
- `skips.skip_examples.size_exceeded` contains 1 entry for `big.rs` and 0 entries for `paper.pdf`.
- cfg helper: reuse the `cfg_with_size_cap(root, max_bytes, max_lines)` pattern from the existing test module (add the helper if needed).
- **Acceptance**:
- `cargo test -p kebab-source-fs size_cap_skips_only_code_files -j 4` green.
- No regression in the existing `ingest_report_counts_oversized_files_by_bytes` (fixture `huge.rs`) and `ingest_report_size_cap_by_line_count` (fixture `longfile.rs`): both fixtures are named `.rs`, so they still pass through the new conditional (invariant preserved).
- `cargo test -p kebab-source-fs -j 4` fully green.
#### Commit (all of Step 1)
```
fix(source-fs): apply size limit only to code files; PDF/image/markdown bypass walker cap (Bug #2)
- crates/kebab-source-fs/src/code_meta.rs: add pub(crate) fn is_code_file
- crates/kebab-source-fs/src/connector.rs: walker conditional `is_code_file && is_oversized`
- crates/kebab-source-fs/src/connector.rs mod tests: size_cap_skips_only_code_files unit test
- spec: docs/superpowers/specs/2026-05-27-v0.20-sub1-bugfix-spec.md §3
```
---
### Step 2 (Group B): Bug #3 chunk_id collision fix + chunker_version bump
Option A from spec §4.3 (use the segment boundary `start` as the `per_chunk_hash` suffix). Extend the `chunk_page` return tuple from the 3-tuple `(actual_start, chunk_end, slice)` to the 4-tuple `(segment_start, actual_start, chunk_end, slice)` and change the caller's `per_chunk_hash` suffix to `segment_start`. Bump `VERSION_LABEL` `"pdf-page-v1"` → `"pdf-page-v1.1"` (spec §4.4.1 round 1c M-1 decision: explicit cascade audit trail).
#### Sub-action B1 — `chunk_page` 4-tuple expansion
- **Files affected**: `crates/kebab-chunk/src/pdf_page_v1.rs:200-204` (doc comment) + `:205-289` (signature at line 205 → closing `}` at line 289). Line ranges corrected from the actual probes in critic round 1 + verifier round 1 (L-1).
- **Action** (spec §4.4 diff as-is):
- Update the doc comment: `(char_start, char_end, text_slice)` → `(segment_start, actual_start, chunk_end, text_slice)`:
```rust
/// Split a single page's text into ordered chunks, each represented as
/// `(segment_start, actual_start, chunk_end, text_slice)`.
///
/// - `segment_start` = pre-overlap segment boundary. Strictly increasing
/// across the returned vec. Use this for chunk_id uniqueness suffixes.
/// - `actual_start` = post-overlap start char index. May collapse to a
/// previous chunk's `actual_start` under aggressive overlap policy.
/// Use this for `SourceSpan::Page::char_start`.
/// - `chunk_end` = chunk's end char index (exclusive).
fn chunk_page(text: &str, target_bytes: usize, overlap_bytes: usize)
-> Vec<(usize, usize, usize, String)>
```
- early return: `vec![(0, n, text.to_string())]` → `vec![(0, 0, n, text.to_string())]`.
- push in the loop body: `chunks.push((actual_start, chunk_end, slice))` → `chunks.push((start, actual_start, chunk_end, slice))`. (`start = bounds[seg_idx]` already exists as a local variable at line 245.)
- In the overlap walk, `let prev_min = prev.0` reads the first field of the old tuple; in the post-fix tuple shape that value lives at `prev.1` (actual_start), so change it to preserve the spec §4.4 invariant:
```diff
- let actual_start = if let Some(prev) = chunks.last() {
- let prev_min = prev.0;
+ let actual_start = if let Some(prev) = chunks.last() {
+ // prev tuple shape = (segment_start, actual_start, chunk_end, slice).
+ // overlap walk floor = previous chunk's actual_start (prev.1).
+ let prev_min = prev.1;
...
```
- **Acceptance**:
- `grep -nE "fn chunk_page.*-> Vec<\(usize, usize, usize, String\)>" crates/kebab-chunk/src/pdf_page_v1.rs` = **1**.
- `grep -c "let prev_min = prev.1" crates/kebab-chunk/src/pdf_page_v1.rs` ≥ **1**.
- `cargo build -p kebab-chunk -j 4` green (the red state is resolved once the caller change in sub-action B2 is applied together).
#### Sub-action B2 — caller `per_chunk_hash` suffix → `segment_start`
- **Files affected**: `crates/kebab-chunk/src/pdf_page_v1.rs:149-186` (currently verified: the `for (...) in chunk_page(...)` loop in the `chunk` method starts at line 149 and ends at line 186; corrected per verifier round 1 L-2).
- **Action** (spec §4.4 diff as-is):
```diff
- for (char_start, char_end, slice) in
- chunk_page(&p.text, target_bytes, overlap_bytes)
+ for (segment_start, char_start, char_end, slice) in
+ chunk_page(&p.text, target_bytes, overlap_bytes)
{
...
let span = SourceSpan::Page {
page: page_num,
char_start: Some(char_start_u32),
char_end: Some(char_end_u32),
};
let block_ids: Vec<BlockId> = vec![p.common.block_id.clone()];
- // Per-chunk policy_hash variant prevents chunk_id
- // collision when a page produces multiple chunks. See
- // module docs for rationale.
- let per_chunk_hash = format!("{base_policy_hash}#c{char_start}");
+ // v0.20.0 sub-item 1 bugfix (#3): per-chunk policy_hash
+ // variant uses `segment_start` (pre-overlap boundary,
+ // strictly increasing) instead of `char_start` (post-
+ // overlap, may collapse to prev_min). See module docs +
+ // spec §4.1 root cause + HOTFIXES.md 2026-05-27.
+ let per_chunk_hash = format!("{base_policy_hash}#c{segment_start}");
let chunk_id =
id_for_chunk(&doc.doc_id, &chunker_version, &block_ids, &per_chunk_hash);
...
}
```
- `SourceSpan::Page.char_start` still keeps the post-overlap `char_start` (= `actual_start`), preserving citation-locality semantics.
- **Acceptance** (verifier round 1 M-2: B2 and B4 land in the same logical commit, so the greps are evaluated at Step 2 commit time, i.e. post-B4):
- `grep -c "#c{segment_start}" crates/kebab-chunk/src/pdf_page_v1.rs` ≥ **1** (with B2 alone = 1 call site; after the B4 module doc change = 2; the B4 acceptance verifies ≥ 2).
- `grep -c "#c{char_start}" crates/kebab-chunk/src/pdf_page_v1.rs` = **0** (both the call site and the module doc are switched to segment_start; this is the consolidated same-commit invariant of B2+B4).
- If verified sub-action by sub-action, grepping `#c{char_start}` after B2 alone still returns ≥ 1 because the literal remains in the module doc at line 56; it settles at 0 once the Step 2 commit boundary is reached.
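Why the suffix switch removes the collision can be seen with a toy model of the overlap walk. The sketch below is a simplified assumption, not the real `chunk_page`: it takes pre-computed segment boundaries in chars and floors the back-walk at the previous chunk's `actual_start`, which is the behaviour this plan describes. `chunk_starts` and `unique_suffixes` are hypothetical names.

```rust
use std::collections::HashSet;

/// Simplified model: for each segment boundary, compute
/// `(segment_start, actual_start)`. The overlap walk moves back by
/// `overlap` chars but never past the previous chunk's actual_start.
pub fn chunk_starts(bounds: &[usize], overlap: usize) -> Vec<(usize, usize)> {
    let mut out: Vec<(usize, usize)> = Vec::new();
    for &segment_start in bounds {
        let actual_start = match out.last() {
            Some(&(_, prev_actual)) => segment_start.saturating_sub(overlap).max(prev_actual),
            None => segment_start,
        };
        out.push((segment_start, actual_start));
    }
    out
}

/// Counts distinct `#c{..}` suffixes built from either field.
pub fn unique_suffixes(starts: &[(usize, usize)], use_segment_start: bool) -> usize {
    starts
        .iter()
        .map(|&(seg, actual)| format!("#c{}", if use_segment_start { seg } else { actual }))
        .collect::<HashSet<_>>()
        .len()
}
```

For boundaries `[0, 12]` with an 80-char overlap, both chunks get `actual_start = 0`, so the old suffix yields a single `#c0`, while `segment_start` keeps `#c0` and `#c12` distinct.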
#### Sub-action B3 — bump `VERSION_LABEL` `"pdf-page-v1"` → `"pdf-page-v1.1"` + update 2 hardcoded literal sites
- **Files affected** (per verifier round 1 H-1, the actual probe `grep -rn '"pdf-page-v1"' crates/ --include='*.rs'` enumerated 2 sites):
- `crates/kebab-chunk/src/pdf_page_v1.rs:67` (currently verified: `const VERSION_LABEL: &str = "pdf-page-v1";`).
- `crates/kebab-app/tests/pdf_pipeline.rs:168` (currently verified: `assert_eq!(pdf_item.chunker_version.as_ref().map(|c| c.0.as_str()), Some("pdf-page-v1"))` is a hard assertion and fails after the v1.1 bump).
- `crates/kebab-app/tests/pdf_pipeline.rs:368` (currently verified: error message string literal `"pdf-page-v1 emits 0 chunks for the empty page; total = 2"`; not a hard assertion, but updated so it does not go stale).
- **Action** (spec §4.4.1 decision):
- **(a) primary const bump** (`crates/kebab-chunk/src/pdf_page_v1.rs:67`):
```diff
-const VERSION_LABEL: &str = "pdf-page-v1";
+const VERSION_LABEL: &str = "pdf-page-v1.1";
```
The assertion in the existing test `chunker_version_is_pdf_page_v1` (pdf_page_v1.rs:374) references the `VERSION_LABEL` const, so it updates automatically; no test code change is needed.
- **(b) update the test assertion literal** (`crates/kebab-app/tests/pdf_pipeline.rs:168`, required):
```diff
- Some("pdf-page-v1")
+ Some("pdf-page-v1.1")
```
- **(c) update the test error message literal** (`crates/kebab-app/tests/pdf_pipeline.rs:368`, recommended):
```diff
- "pdf-page-v1 emits 0 chunks for the empty page; total = 2"
+ "pdf-page-v1.1 emits 0 chunks for the empty page; total = 2"
```
- **Acceptance**:
- `grep -nE 'const VERSION_LABEL: &str = "pdf-page-v[0-9.]+";' crates/kebab-chunk/src/pdf_page_v1.rs` reports `"pdf-page-v1.1"`.
- `cargo test -p kebab-chunk chunker_version_is_pdf_page_v1 -j 4` green (passes automatically because it references VERSION_LABEL).
- `grep -rn '"pdf-page-v1"' crates/ --include='*.rs' | grep -v 'pdf-page-v1\.1'` returns **0** lines. The `grep -v` filter guards against false positives: any line where `pdf-page-v1` appears as part of `pdf-page-v1.1` is excluded by the ".1" suffix. Zero lines after the filter means no stale literal remains.
- `cargo test -p kebab-app pdf_pipeline -j 4` green (after the line 168 assertion is updated).
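The two-stage grep in the acceptance list can be mirrored as a per-line predicate, which makes its coverage easier to reason about. This is a sketch only; `is_stale_label_line` is a hypothetical name, and it uses plain substring checks in place of grep's regex matching.

```rust
/// Mirrors `grep '"pdf-page-v1"' | grep -v 'pdf-page-v1\.1'` for one line:
/// the line must contain the exact quoted old label and must not
/// mention the new `pdf-page-v1.1` label anywhere.
pub fn is_stale_label_line(line: &str) -> bool {
    line.contains("\"pdf-page-v1\"") && !line.contains("pdf-page-v1.1")
}
```

One property worth noting under this model: a literal such as `"pdf-page-v1 emits 0 chunks ..."` does not contain the closing quote right after `v1`, so the quoted pattern does not flag it. The message literal in sub-action (c) therefore has to be checked by reading line 368 directly rather than by this grep.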
#### Sub-action B4 — update module doc + HOTFIXES.md entry
- **Files affected**:
- `crates/kebab-chunk/src/pdf_page_v1.rs:47-60` (currently verified: the `## chunk_id collision deviation` paragraph of the module doc).
- `tasks/HOTFIXES.md` (append a new dated entry at the existing entry position: the file's latest entry is `2026-05-26`, so insert the `2026-05-27 — v0.20.0 sub-item 1` entry above it, following the file's chronological pattern).
- **Action**:
- **(a) module doc**: the updated text from spec §4.4, as-is:
```diff
-//! Workaround that doesn't change the §4.2 recipe: feed a per-chunk
-//! variant `format!("{base_policy_hash}#c{char_start}")` into the
-//! recipe's `policy_hash` slot (so distinct chunks distinguish via
-//! different policy_hash inputs), while storing the unmodified
-//! `base_policy_hash` in `Chunk.policy_hash` so the field still answers
-//! "what policy was active". Logged in `tasks/HOTFIXES.md`.
+//! Workaround that doesn't change the §4.2 recipe: feed a per-chunk
+//! variant `format!("{base_policy_hash}#c{segment_start}")` into the
+//! recipe's `policy_hash` slot. `segment_start` is the pre-overlap
+//! segment boundary, strictly increasing across the returned chunks
+//! even when the overlap walk collapses `actual_start` to a previous
+//! chunk's `prev_min`. Unmodified `base_policy_hash` is stored in
+//! `Chunk.policy_hash` so the field still answers "what policy was
+//! active". v1.1 second-iteration patch — logged in
+//! `tasks/HOTFIXES.md` (2026-05-27).
```
- **(b) HOTFIXES.md entry** (entry body from spec §4.4, as-is):
```markdown
## 2026-05-27 — v0.20.0 sub-item 1: chunk_id `#c{char_start}` workaround collapses under aggressive overlap (Bug #3 second-iteration patch)
**Symptom**: ingesting F2 (1580 chars OCR, scanned_page2.pdf) fails with
`DocumentStore::put_chunks (pdf): sqlite error: UNIQUE constraint
failed: chunks.chunk_id: ... Error code 1555: A PRIMARY KEY constraint
failed`. Reproducible on every `--force-reingest` in the `kebab v0.20.0`
(commit `b4d9e60`) dogfood run (qwen2.5vl:3b on the
`192.168.0.47:11434` Ollama endpoint, isolated KB at
`/build/cache/tmp/v0.20-dogfood`).
**Root cause**: in
`per_chunk_hash = format!("{base_policy_hash}#c{char_start}")` at
`crates/kebab-chunk/src/pdf_page_v1.rs:170`, `char_start` is the
post-overlap `actual_start`. The overlap walk at lines 266-281
back-walks only as far as the `prev_min` floor, so with aggressive
overlap and a page whose first segment is small (F2's Korean OCR text:
a sentence end within the first ~10 chars → segment_1 = [0, 30],
segment_2 = [30, n], overlap_bytes 240 / chars=80 → segment_2's
actual_start collapses to prev_min=0), both chunks get an identical
`#c0` suffix → identical chunk_id → `chunks` PRIMARY KEY violation.
**Fix** (spec §4.4): add `segment_start` to the `chunk_page` return
tuple (3-tuple → 4-tuple `(segment_start, actual_start, chunk_end,
slice)`) and change the suffix of the caller's `per_chunk_hash` to
`segment_start`. `segment_start` is `bounds[seg_idx]` (strictly
increasing after dedup), so every chunk is distinct regardless of the
overlap walk. `SourceSpan::Page.char_start`, used for citation
locality, still keeps the post-overlap `actual_start`.
**chunker_version cascade**: bump `pdf-page-v1` → `pdf-page-v1.1`
(spec §4.4.1 round 1c M-1 decision, a direct application of the design
§9 cascade rule). chunk_ids change for multi-chunk PDF pages (normal
paths such as the 21 blocks / 34 chunks of `metro-korea.pdf` as of
pre-OCR), which secures an explicit user-facing audit trail and
automatic invalidation reporting by the store layer. Because v0.20.0
is on a force-update path, the user cost is zero (a fresh ingest
happens anyway).
**Amends**: spec `docs/superpowers/specs/2026-05-27-v0.20-sub1-bugfix-spec.md`
§4.4. The parent design §4.2 chunk_id recipe itself is unchanged (only
the internal computation of the workaround layer changes). Parent PR
#189 (`feat/pdf-scanned-ocr`, force-update path).
```
- **Acceptance**:
- `grep -c "#c{segment_start}" crates/kebab-chunk/src/pdf_page_v1.rs` ≥ **2** (module doc + the actual call at line 170).
- `grep -c "2026-05-27 — v0.20.0 sub-item 1: chunk_id" tasks/HOTFIXES.md` = **1**.
#### Sub-action B5 — unit regression test `multi_chunk_page_with_aggressive_overlap_produces_unique_chunk_ids`
- **Files affected**: the `#[cfg(test)] mod tests` in `crates/kebab-chunk/src/pdf_page_v1.rs` (currently verified: the `make_pdf_doc(&[&str])` and `default_policy(target, overlap)` helpers already exist, lines 300-371).
- **Action** (test body from spec §4.5, as-is):
```rust
#[test]
fn multi_chunk_page_with_aggressive_overlap_produces_unique_chunk_ids() {
// Trigger shape of Korean OCR text: 10 chars of "가" + ". " + 500 chars of "나".
// → first segment [0, 12), second segment [12, n).
// page_text byte_len = 10*3 + 2 + 500*3 = 1532 > target_bytes=1500
// → multi-chunk. overlap_bytes = min(240, 750) = 240, chars=80
// → the second chunk's actual_start collapses to prev_min=0 → same `#c0`.
//
// default_policy(500, 80): target_tokens=500 → target_bytes=500*3=1500
// (Korean at 3 bytes/char), overlap_tokens=80 → overlap_bytes=min(240, 750)=240.
// Added per verifier round 1 L-3.
let early_seg: String = std::iter::repeat('가').take(10).collect();
let tail: String = std::iter::repeat('나').take(500).collect();
let page_text = format!("{early_seg}. {tail}");
let doc = make_pdf_doc(&[&page_text]);
let policy = default_policy(500, 80); // target=1500 byte, overlap=240 byte
let chunks = PdfPageV1Chunker.chunk(&doc, &policy).unwrap();
assert!(
chunks.len() >= 2,
"expected ≥2 chunks for {} byte page; got {}",
page_text.len(),
chunks.len()
);
let mut ids: Vec<&str> = chunks.iter().map(|c| c.chunk_id.0.as_str()).collect();
ids.sort_unstable();
let total = ids.len();
ids.dedup();
assert_eq!(
ids.len(),
total,
"all chunk_ids must be unique even when overlap walks actual_start back to prev_min"
);
}
```
- **Acceptance**:
- `cargo test -p kebab-chunk multi_chunk_page_with_aggressive_overlap_produces_unique_chunk_ids -j 4` green.
- `cargo test -p kebab-chunk deterministic_chunk_ids_1000 -j 4` green (existing determinism invariant preserved).
- `cargo test -p kebab-chunk overlap_clamped_when_overlap_exceeds_target -j 4` green (existing overlap clamp invariant preserved).
- `cargo test -p kebab-chunk -j 4` fully green.
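The byte arithmetic in the B5 test comment can be checked on its own. The sketch below rebuilds the trigger text and the two budgets; the 3 bytes/token factor and the `target / 2` overlap clamp are read off the test comment (`500*3=1500`, `min(240, 750)`) and are assumptions about the real policy code. All function names are hypothetical.

```rust
/// Builds the trigger page: 10 x '가' + ". " + 500 x '나'.
pub fn trigger_page_text() -> String {
    format!("{}. {}", "가".repeat(10), "나".repeat(500))
}

/// Assumed conversion from the test comment: 3 bytes per token.
pub fn target_bytes(target_tokens: usize) -> usize {
    target_tokens * 3
}

/// Assumed clamp from the test comment: overlap is capped at half
/// of the target byte budget.
pub fn overlap_bytes(overlap_tokens: usize, target_bytes: usize) -> usize {
    (overlap_tokens * 3).min(target_bytes / 2)
}
```

Each Hangul syllable is 3 bytes in UTF-8, so the page is 512 chars but 1532 bytes, which is just over the 1500-byte target and forces at least two chunks.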
#### Commit (all of Step 2)
```
fix(chunk): chunk_id collision under aggressive overlap; bump pdf-page-v1 → pdf-page-v1.1 (Bug #3)
- crates/kebab-chunk/src/pdf_page_v1.rs: chunk_page returns 4-tuple
(segment_start, actual_start, chunk_end, slice); caller per_chunk_hash
suffix uses segment_start (pre-overlap boundary, strictly increasing)
instead of char_start (post-overlap, may collapse to prev_min).
- VERSION_LABEL "pdf-page-v1" → "pdf-page-v1.1" (design §9 cascade,
explicit user-facing audit trail).
- module docs: workaround description updated to segment_start.
- mod tests: multi_chunk_page_with_aggressive_overlap_produces_unique_chunk_ids
regression pin.
- tasks/HOTFIXES.md: 2026-05-27 entry (symptom F2 1580 char OCR,
intra-doc collision root cause, second-iteration patch rationale).
- spec: docs/superpowers/specs/2026-05-27-v0.20-sub1-bugfix-spec.md §4
```
---
### Step 3 (Group C): Bug #3 multi-scanned PDF integration test
spec §4.5 + §4.5.1: a chunk_id collision-free regression at the `kebab-app` integration level. To avoid a real Ollama, use a MockOcrEngine for the `OcrEngine` trait. The existing private MockOcrEngine at `crates/kebab-app/tests/pdf_ocr_apply.rs:20-45` lives in a separate test binary of the same crate and cannot be imported directly, so the executor's first sub-action is to decide the sharing path.
#### Sub-action C1 — MockOcrEngine share decision (executor's dependency-check task)
- **Files affected** (branching by option):
- **Option A (share via `tests/common/`)**, corrected per the actual probe in verifier round 1 H-2:
- `crates/kebab-app/tests/common/mod.rs` **already exists** (172 lines of `TestEnv` infrastructure: `#![allow(dead_code)]`, `pub struct TestEnv`, `pub fn ingest_md`, `pub fn lexical_query`, etc.). Action = **append the single line `pub mod mock_ocr;`** (do not create a new mod.rs).
- `crates/kebab-app/tests/common/mock_ocr.rs` (**new** file: MockOcrEngine lift + per-page ctor).
- Remove the MockOcrEngine struct + impl from the existing `pdf_ocr_apply.rs:20-45`, add the `mod common; use common::mock_ocr::MockOcrEngine;` import, and migrate the ctor call sites (see M-3).
- The new integration test shares it via `mod common; use common::mock_ocr::MockOcrEngine;`.
- **Option B (inline duplicate)**: an inline `struct LocalMockOcr` + `impl OcrEngine for LocalMockOcr` inside the new test file `multi_scanned_pdf_ingest_no_chunk_id_collision.rs` (prioritizes test isolation; does not touch common/mod.rs).
- **Action**:
- **(a) dependency probe**: follow the decision path in spec §4.5.1:
```bash
grep -rn "impl OcrEngine" crates/kebab-parse-image/src/ crates/kebab-app/tests/
# actual result:
# crates/kebab-parse-image/src/ocr.rs:235 — production OllamaVisionOcr.
# crates/kebab-app/tests/pdf_ocr_apply.rs:25 — test-only MockOcrEngine.
ls crates/kebab-app/tests/common/mod.rs
# actual result: -rw-r--r-- ... 172 lines (TestEnv infrastructure already exists).
```
- **(b) executor decision**:
- The existing MockOcrEngine is constructed as `MockOcrEngine { expected_text: String, fail: bool }`. Supporting a different text length per page requires extending the ctor signature (e.g. `expected_text: Vec<String>` + an internal `Mutex<usize>` cursor). The extension is trivial and both tests live in the same crate → **Option A recommended**.
- Under Option A, the MockOcrEngine construction sites in `pdf_ocr_apply.rs` (actual verifier probe = **10 instantiation sites** at lines 140, 170, 193, 210, 242, 284, 311, 334, 359, 399; this corrects the off-by-one "9 → 10" from critic round 1 L-2; the struct definition at line 21 is excluded) migrate to the new ctor signature. For backward compatibility, provide two ctors (`MockOcrEngine::single(text, fail)` + `MockOcrEngine::per_page(texts, fail)`). **Mechanical migration**: at each site, `MockOcrEngine { expected_text: <text>, fail: <bool> }` → `MockOcrEngine::single(<text>, <bool>)` (10 sites × 1 line edit; the actual cost per verifier round 1 M-3).
- Option B (inline) is for the case where the cost of sharing is too high and test isolation is worth more. This plan's first preference is Option A.
- **(c) record the decision**: record the chosen path under §6 open question 1 in the closing summary of the result file (`.omc/reviews/2026-05-27-v0.20-bugfix-plan-drafter-r1c-result.md`). With Option A, the file edits for sub-action C2 = (existing) `common/mod.rs` 1-line append + (new) `common/mock_ocr.rs` + (modify) `pdf_ocr_apply.rs` + (new) `multi_scanned_pdf_ingest_no_chunk_id_collision.rs` = 4 files. With Option B, only 1 new file.
- **Acceptance**:
- probe grep returns ≥ 2 lines (production + existing mock).
- probe ls confirms that `common/mod.rs` exists.
- the executor's decision is stated in OQ-1 of the plan's §6 open questions.
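The two-ctor shape recommended under Option A might look like the sketch below. The `OcrEngine` trait here is a hypothetical, simplified stand-in (the real trait's method name and signature are not shown in this plan), and the clamp-to-last-text behaviour once the cursor runs past the vector is an assumption chosen so that `single` keeps returning the same text on every call.

```rust
use std::sync::Mutex;

/// Hypothetical, simplified stand-in for the real `OcrEngine` trait.
pub trait OcrEngine {
    fn recognize(&self, image: &[u8]) -> Result<String, String>;
}

pub struct MockOcrEngine {
    texts: Vec<String>,
    fail: bool,
    cursor: Mutex<usize>, // index of the next page text to return
}

impl MockOcrEngine {
    /// Backward-compatible ctor: the same text for every call.
    pub fn single(text: impl Into<String>, fail: bool) -> Self {
        Self { texts: vec![text.into()], fail, cursor: Mutex::new(0) }
    }

    /// One text per OCR call, in call order.
    pub fn per_page(texts: Vec<String>, fail: bool) -> Self {
        Self { texts, fail, cursor: Mutex::new(0) }
    }
}

impl OcrEngine for MockOcrEngine {
    fn recognize(&self, _image: &[u8]) -> Result<String, String> {
        if self.fail {
            return Err("mock OCR failure".to_string());
        }
        let mut cursor = self.cursor.lock().unwrap();
        // Clamp to the last configured text once the cursor runs past the end.
        let idx = (*cursor).min(self.texts.len().saturating_sub(1));
        *cursor += 1;
        self.texts.get(idx).cloned().ok_or_else(|| "no text configured".to_string())
    }
}
```

The `Mutex<usize>` keeps `recognize` callable through `&self`, so the mock can sit behind a shared reference the same way a production engine would.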
#### Sub-action C2 — write the integration test (conditional on the C1 decision)
- **Files affected** (assuming Option A is adopted; corrected per verifier round 1 H-2):
- `crates/kebab-app/tests/common/mod.rs` (**existing**, 172 lines; only append the single line `pub mod mock_ocr;`).
- `crates/kebab-app/tests/common/mock_ocr.rs` (**new**: MockOcrEngine lift + per-page ctor).
- `crates/kebab-app/tests/pdf_ocr_apply.rs:20-45` (remove the existing inline impl + add `mod common; use common::mock_ocr::MockOcrEngine;`, i.e. one mod declaration line at the file head) + mechanical migration of the 10 ctor call sites (M-3).
- `crates/kebab-app/tests/multi_scanned_pdf_ingest_no_chunk_id_collision.rs` (**new**): imports via `mod common; use common::mock_ocr::MockOcrEngine;`.
- **Action** (test body from spec §4.5, expanded within this plan's sub-action):
- **fixture**: F1 (`scanned_page1.pdf`, 779 char OCR) + F2 (`scanned_page2.pdf`, 1580 char OCR) + 1 synthetic small-page PDF (300 char): 3 scanned PDFs.
- **MockOcrEngine ctor**: per-page text vec `["text for F1", "1580 char string for F2", "text for synthetic 300 char"]` + `fail: false`.
- **isolated KB**: `tempfile::tempdir()` + override only `data_dir` of `Config::default()` + workspace `[ingest.pdf].enabled = true`.
- **assertion path**:
1. Call `kebab_app::ingest_with_config_opts(&cfg, ...)` (facade).
2. `report.items.iter().filter(|i| i.kind == IngestItemKind::Error).count() == 0`: no `ErrorKind::Storage` row, which a chunk_id collision would produce.
3. `store.get_chunks_count() == sum(per-PDF chunk_counts)`: final row count of the DELETE+INSERT path.
4. `store.get_all_chunk_ids().iter().collect::<HashSet<_>>().len() == chunks_count`: chunk_id global uniqueness.
- **executor degradation path** (spec §4.5.1 conditional downgrade): if sharing under Option A proves too costly or risky and Option B is also impractical (e.g. the ExtractContext / Facade wiring of the integration setup exceeds this sub-action's scope), conditionally downgrade the acceptance of §6 row 7: pin the core Bug #3 regression with the `kebab-chunk` unit-level invariant (Step 2 B5) alone and skip the integration test.
- **Acceptance**:
- `cargo test -p kebab-app multi_scanned_pdf_ingest_no_chunk_id_collision -j 4` green.
- `cargo test -p kebab-app pdf_ocr_apply -j 4` green (0 regressions in existing tests once the 10 `MockOcrEngine { expected_text, fail }` literal struct constructions are migrated to `MockOcrEngine::single(text, fail)`; count per critic round 1 L-2).
- on the downgrade path: record one line in the result file + commit body: "§6 row 7 conditional skip — Bug #3 core regression = kebab-chunk unit B5".
#### Commit (all of Step 3)
```
test(app): multi-scanned PDF chunk_id collision-free integration test (Bug #3 regression)
- crates/kebab-app/tests/common/{mod,mock_ocr}.rs: MockOcrEngine lift
with per-page text ctor (shared by pdf_ocr_apply.rs + new test).
- crates/kebab-app/tests/multi_scanned_pdf_ingest_no_chunk_id_collision.rs:
3 scanned PDF (F1 + F2 + synthetic 300char) ingest via mock OCR,
assert all chunk_ids globally unique + zero ErrorKind::Storage rows.
- spec: docs/superpowers/specs/2026-05-27-v0.20-sub1-bugfix-spec.md §4.5
```
If Option B (inline) or the conditional downgrade is adopted, adjust the commit body and file list accordingly.
---
### Step 4 (Group D): Bug #4 F4 fixture re-generation
spec §5 — `tests/fixtures/_synth/mojibake.py` 의 byte-level `re.sub` + 수작업 startxref edit 를 pikepdf 의 proper PDF surgery (open + delete /ToUnicode + save 자동 xref regen) 로 교체. F4 fixture 자체 (`crates/kebab-parse-pdf/tests/fixtures/mojibake.pdf`) regenerate + 3 신규 invariant test.
#### Sub-action D1 — `tests/fixtures/_synth/mojibake.py` pikepdf rewrite
- **Files affected**: `tests/fixtures/_synth/mojibake.py` (full rewrite; the existing byte-edit pattern is dropped).
- **Action** (the body of spec §5.4 as-is):
- Step 1: synthesize a Korean PDF with reportlab using a Type 0 (CID) font (with a valid ToUnicode CMap included).
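- Step 2: open it with pikepdf, remove the `/ToUnicode` entry from every dictionary, then call `pdf.save(...)` (the xref is regenerated automatically). Note that `allow_overwriting_input=True` is an argument of `pikepdf.open`, not of `save`; it is only needed when saving back over the input path.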
- Step 3: verify the invariants `len(pdf.pages) == 1` and `b"/ToUnicode" not in dst.read_bytes()`.
- On failure, exit with a non-zero code and a stderr message (Step 2 removed count = 0 → exit 2; Step 3 page count mismatch → exit 3; ToUnicode still present → exit 4).
- **Dep install** (executor pre-action):
```bash
pip install --cache-dir /build/cache/pip pikepdf reportlab
python -c "import pikepdf; import reportlab; print(pikepdf.__version__, reportlab.Version)"
# font availability probe (critic round 1 L-3): this path is hardcoded in mojibake.py.
test -f /usr/share/fonts/truetype/dejavu/DejaVuSans.ttf \
|| sudo apt-get install -y fonts-dejavu-core
```
Not reflected in the CI environment: the fixture itself is committed, so generation is a one-off (run locally by the executor in Step 4 D2). Optionally add a one-line pikepdf install hint to `tasks/HOTFIXES.md`.
- **Acceptance**:
- `grep -c "import pikepdf" tests/fixtures/_synth/mojibake.py` = **1**.
- `grep -c "re.sub" tests/fixtures/_synth/mojibake.py` = **0** (confirms the byte-edit pattern is gone).
- `test -f /usr/share/fonts/truetype/dejavu/DejaVuSans.ttf` exit 0 (font probe, critic round 1 L-3 fast failover signal).
- `python tests/fixtures/_synth/mojibake.py /tmp/mojibake_dryrun.pdf && echo OK` exits 0 with empty stderr.
#### Sub-action D2 — F4 fixture binary regenerate + snapshot regen + commit
- **Files affected** (enumerates the 2 consumers found by the actual probe `grep -rn 'fixtures/mojibake.pdf' crates/`, per verifier round 1 H-4 + critic round 1 M-2):
- `crates/kebab-parse-pdf/tests/fixtures/mojibake.pdf` (regenerate).
- `crates/kebab-parse-pdf/tests/snapshots/vector_pdf_canonical.json` (**the snapshot baseline file itself**: delete + auto-regen). Per the verifier round 1 H-4 probe, `text_extractor_regression.rs:59-64` uses a hand-rolled `unwrap_or_else { write baseline }` pattern.
- `crates/kebab-parse-pdf/tests/text_extractor_regression.rs` (existing test: zero code changes, it only triggers the snapshot regen path).
- `crates/kebab-parse-pdf/src/text_quality.rs:96` (the second consumer from verifier round 1 H-4: the unit test/doctest around `let bytes = include_bytes!("../tests/fixtures/mojibake.pdf");` is verified at the same time when the fixture binary changes; zero code changes).
- **Action**:
- **(a) regenerate command**:
```bash
python tests/fixtures/_synth/mojibake.py \
crates/kebab-parse-pdf/tests/fixtures/mojibake.pdf
```
- **(b) manual probe after regeneration**:
```bash
python -c "import pikepdf; pdf = pikepdf.open('crates/kebab-parse-pdf/tests/fixtures/mojibake.pdf'); print(len(pdf.pages))"
# expected: 1
grep -c "/ToUnicode" crates/kebab-parse-pdf/tests/fixtures/mojibake.pdf
# expected: 0 (binary grep: no ToUnicode left in the Pages dict)
```
- **(c) snapshot baseline regen** (the actual mechanic per verifier round 1 H-4 + critic round 1 M-2; closes OQ-2):
- `text_extractor_regression.rs:59-64` uses the hand-rolled pattern `let baseline = std::fs::read_to_string("tests/snapshots/vector_pdf_canonical.json").unwrap_or_else(|_| { std::fs::write(baseline_path, &actual).expect(...); actual.clone() })` (the insta crate is not used).
- Once the fixture binary changes, the next cargo test sees `actual != baseline` and the `assert_eq!` fails.
- executor regen step:
```bash
rm crates/kebab-parse-pdf/tests/snapshots/vector_pdf_canonical.json
cargo test -p kebab-parse-pdf vector_pdf_extract_byte_identical_to_baseline -j 4
# 1st run: snapshot file is absent → the unwrap_or_else write pattern creates a new baseline → assert passes.
cargo test -p kebab-parse-pdf vector_pdf_extract_byte_identical_to_baseline -j 4
# 2nd run: byte-identical to the new baseline → assert passes (regression invariant established).
```
- OQ-2 closure: the insta crate is not used, so the cargo-insta CLI is not needed. This spells out the actual mechanic behind the spec §5.6 item "update the F4 baseline of the existing `text_extractor_regression.rs`".
- **Acceptance**:
- `stat crates/kebab-parse-pdf/tests/fixtures/mojibake.pdf` size > 0.
- `grep -c "/ToUnicode" crates/kebab-parse-pdf/tests/fixtures/mojibake.pdf` = **0**.
- The Python page count probe prints `1`.
- `test -f crates/kebab-parse-pdf/tests/snapshots/vector_pdf_canonical.json` exit 0 (after snapshot regen).
- `cargo test -p kebab-parse-pdf vector_pdf_extract_byte_identical_to_baseline -j 4` green on 2 consecutive runs (regen + verify, H-4 mechanic).
- `cargo test -p kebab-parse-pdf -j 4` fully green: this also verifies the new D3 tests and the second consumer at text_quality.rs:96.
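
The read-or-write baseline mechanic that D2 relies on can be sketched as follows. This is a hedged illustration, not the code in `text_extractor_regression.rs`: the function name `load_or_write_baseline` is hypothetical, and unlike the `unwrap_or_else(|_| ...)` pattern quoted above, this sketch only writes a new baseline when the file is actually missing and propagates other I/O errors.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Read the baseline if it exists; otherwise write `actual` as the new
/// baseline and return it. Other I/O errors are returned to the caller.
fn load_or_write_baseline(path: &Path, actual: &str) -> io::Result<String> {
    match fs::read_to_string(path) {
        Ok(existing) => Ok(existing),
        Err(err) if err.kind() == io::ErrorKind::NotFound => {
            // First run after deletion: create the baseline from `actual`.
            if let Some(parent) = path.parent() {
                fs::create_dir_all(parent)?;
            }
            fs::write(path, actual)?;
            Ok(actual.to_string())
        }
        Err(err) => Err(err),
    }
}
```

With this shape, an existing baseline is never overwritten, which is why the regen step has to delete the stale snapshot file before the first run.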
#### Sub-action D3 — 3 new parse-pdf invariant tests
- **Files affected** (decided from the verifier round 1 H-3 probe):
- Actual `ls crates/kebab-parse-pdf/tests/` output: `common`, `extractor.rs`, `fixtures`, `ocr_e2e.rs`, `page_image.rs`, `snapshots`, `text_extractor_regression.rs`. The plan round 0 primary candidate `text_extractor.rs` **does not exist**.
- **Decision**: append to `crates/kebab-parse-pdf/tests/text_extractor_regression.rs` (locality with the F4 fixture consumer, and the same file as the D2 snapshot regen mechanic). Append 3 new `#[test] fn`s.
- Alternative (if the executor decides to split for file size / cohesion reasons): a new `crates/kebab-parse-pdf/tests/mojibake_invariants.rs`. The plan's first preference is appending to `text_extractor_regression.rs`.
- **Action** (the 3 test bodies from spec §5.5 with paths corrected, per verifier round 1 H-3):
1. `mojibake_fixture_load_yields_one_page`: `let bytes = include_bytes!("fixtures/mojibake.pdf");` (the integration test already lives at the `crates/kebab-parse-pdf/tests/` root, so follow the canonical pattern at `text_extractor_regression.rs:42`; the `"../tests/fixtures/mojibake.pdf"` path in spec §5.5 is wrong, use `"fixtures/mojibake.pdf"` directly). Assert `lopdf::Document::load_mem(bytes).unwrap().get_pages().len() == 1`.
2. `mojibake_fixture_has_no_tounicode_cmap`: avoid the risk of a CWD-relative `std::fs::read("tests/fixtures/mojibake.pdf")` (cargo test may run where CARGO_MANIFEST_DIR ≠ CWD) by using `let bytes = include_bytes!("fixtures/mojibake.pdf");`. Assert `bytes.windows(b"/ToUnicode".len()).filter(|w| *w == b"/ToUnicode").count() == 0`.
3. `pdf_text_extractor_on_mojibake_yields_one_block`: `let bytes = include_bytes!("fixtures/mojibake.pdf");` plus a check of the PdfTextExtractor `1 Block::Paragraph per page` invariant: `canonical.blocks.len() == 1`, and either a `scanned candidate` warning or non-empty text. For the actual ExtractContext setup, the executor uses the existing helper in `text_extractor_regression.rs` (if there is one) or expands the placeholder from spec §5.5.
- **Acceptance**:
- `cargo test -p kebab-parse-pdf mojibake_fixture_load_yields_one_page -j 4` green.
- `cargo test -p kebab-parse-pdf mojibake_fixture_has_no_tounicode_cmap -j 4` green.
- `cargo test -p kebab-parse-pdf pdf_text_extractor_on_mojibake_yields_one_block -j 4` green.
- `cargo test -p kebab-parse-pdf -j 4` fully green.
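
The byte-marker scan used by test 2 can be factored into a small helper, sketched below with the standard library only. The name `count_marker` is hypothetical; the guard on an empty needle is there because `slice::windows(0)` panics.

```rust
/// Count occurrences of `needle` in `haystack` using a sliding window.
fn count_marker(haystack: &[u8], needle: &[u8]) -> usize {
    // `windows(0)` panics, and a needle longer than the haystack cannot match.
    if needle.is_empty() || haystack.len() < needle.len() {
        return 0;
    }
    haystack
        .windows(needle.len())
        .filter(|w| *w == needle)
        .count()
}
```

In the real test the haystack would be the `include_bytes!` fixture and the assertion would be `count_marker(bytes, b"/ToUnicode") == 0`. A raw scan like this only sees dictionaries stored uncompressed, so it complements rather than replaces the pikepdf-side check in `mojibake.py`.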
#### Commit (all of Step 4)
```
fix(parse-pdf): F4 mojibake.pdf via pikepdf surgery; preserve 1-page invariant (Bug #4)
- tests/fixtures/_synth/mojibake.py: full rewrite — replace byte-level
re.sub + manual startxref edit with pikepdf open+del+save (auto xref
regen). Type 0 font + ToUnicode strip via dictionary walk.
- crates/kebab-parse-pdf/tests/fixtures/mojibake.pdf: regenerate.
- crates/kebab-parse-pdf/tests/text_extractor_regression.rs: append 3
invariant tests (lopdf 1-page / no ToUnicode marker / PdfTextExtractor
1-block), per the verifier round 1 H-3 file path decision (same locality
with snapshot regen).
- crates/kebab-parse-pdf/tests/snapshots/vector_pdf_canonical.json:
delete + auto-regen via 2-run cargo test (hand-rolled unwrap_or_else
pattern, verifier round 1 H-4).
- spec: docs/superpowers/specs/2026-05-27-v0.20-sub1-bugfix-spec.md §5
```
Correction per verifier round 1 NIT-2: commit scope `test-fixture` → `parse-pdf` (crate name, the typical conventional-commit scope).
---
### Step 5 (Group E): Workspace verify + commit + PR #189 force-push
The 16-row consolidated acceptance from spec §6 serves as the verifier checklist for this step. Every acceptance command is scriptable.
#### Sub-action E1 — cargo workspace test `-j 1`
- **Files affected**: none (verification only).
- **Action**:
- **(a) conditional cargo clean**, per memory `feedback_cargo_clean_policy` (verifier round 1 NIT-1 correction: force GB units with `-BG` to avoid TB vs GB unit confusion):
```bash
# Read /build avail directly as an integer in GB (strip only the 'G' suffix from the df -BG output).
AVAIL_GB=$(df -BG --output=avail /build | tail -1 | tr -d ' G')
# Likewise read the CARGO_TARGET_DIR size as an integer in GB (du -BG output).
TARGET_GB=$(du -BG -s "${CARGO_TARGET_DIR:-target}" 2>/dev/null | awk '{print $1}' | tr -d 'G')
# /build avail < 500 GB OR target > 200 GB → clean
if [[ "${AVAIL_GB:-9999}" -lt 500 ]] || [[ "${TARGET_GB:-0}" -gt 200 ]]; then
cargo clean
fi
```
If neither threshold is crossed, skip the clean and record one line in the commit body / result file (e.g. "skipped cargo clean — /build avail ${AVAIL_GB}G, target ${TARGET_GB}G").
- **(b) workspace test**:
```bash
cargo test --workspace --no-fail-fast -j 1 2>&1 | tail -100
```
- Check the last 100 lines plus the final summary "test result: ok. N passed; 0 failed".
- **Acceptance**:
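- exit code 0 from `cargo test` itself (the `| tail -100` pipeline reports the exit status of `tail`, so run under `set -o pipefail` or check `${PIPESTATUS[0]}`).
- stdout contains "test result: ok" and "0 failed".
- Satisfies spec §6 row 14 (workspace full test pass).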
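
The clean-or-skip decision in (a) has two boundary conditions that are easy to misread in shell: the comparisons are strict, and an unreadable value falls back to a default that means "do not clean". A minimal sketch of the same logic, with hypothetical helper names `parse_gb` and `should_cargo_clean`:

```rust
/// Parse one `df -BG` / `du -BG` size field such as " 734G" into whole GB.
fn parse_gb(field: &str) -> Option<u64> {
    field.trim().trim_end_matches('G').parse().ok()
}

/// Decide whether to run `cargo clean`, using the same defaults as the
/// shell snippet: unknown avail counts as 9999 GB, unknown target as 0 GB.
fn should_cargo_clean(avail_gb: Option<u64>, target_gb: Option<u64>) -> bool {
    avail_gb.unwrap_or(9999) < 500 || target_gb.unwrap_or(0) > 200
}
```

Exactly 500 GB available or exactly 200 GB of target output does not trigger a clean under this reading.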
#### Sub-action E2 — `cargo clippy --workspace -- -D warnings`
- **Files affected**: none.
- **Action**:
```bash
cargo clippy --workspace --all-targets -j 1 -- -D warnings 2>&1 | tail -50
```
- **Acceptance**:
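- exit code 0 from `cargo clippy` itself (the `| tail -50` pipeline reports the exit status of `tail`, so run under `set -o pipefail` or check `${PIPESTATUS[0]}`).
- Zero occurrences of the "warning" keyword (or `-D warnings` turns them into errors automatically).
- Satisfies spec §6 row 13 (workspace clippy clean).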
#### Sub-action E3 — dogfood re-run (Ollama qwen2.5vl:3b environment)
- **Files affected**: none. `/build/cache/tmp/v0.20-dogfood/` (isolated KB, reusing the same dogfood setup).
- **Action** (follows memory `feedback_pr_workflow` + the `_external/` invariant):
- **(a) release build**:
```bash
cargo build --release -p kebab-cli -j 4 2>&1 | tail -10
"${CARGO_TARGET_DIR:-target}/release/kebab" --version
# expected: kebab 0.20.0
```
- **(b) dogfood KB clean + re-ingest 9 PDFs** (same dogfood environment as spec §1.1).
canonical config path = `/build/cache/tmp/v0.20-dogfood/config.toml` (assumed in §0). The external backup file `/build/cache/tmp/v0.20-dogfood-config.toml` **does not exist**, per the critic round 1 H-1 probe. So use a **back up the config, clean, then restore** path (this keeps the destructive `rm -rf` from deleting the config along with the KB):
```bash
# Step A: temporary backup of the config (preserved before the KB clean).
cp /build/cache/tmp/v0.20-dogfood/config.toml \
/build/cache/tmp/v0.20-dogfood-config.toml.bak
# Step B: clean the whole KB (destructive, config included; the backup preserves it).
rm -rf /build/cache/tmp/v0.20-dogfood/
mkdir -p /build/cache/tmp/v0.20-dogfood/
# Step C: restore the config from the backup.
cp /build/cache/tmp/v0.20-dogfood-config.toml.bak \
/build/cache/tmp/v0.20-dogfood/config.toml
# config.toml contains [ingest.pdf].enabled = true, ollama endpoint =
# http://192.168.0.47:11434, ocr_model = qwen2.5vl:3b
# Step D: ingest.
"$RELEASE_BIN" ingest --config /build/cache/tmp/v0.20-dogfood/config.toml \
--json --force-reingest 2>&1 | tee /build/cache/tmp/v0.20-dogfood-ingest.ndjson
# Step E (optional): backup file cleanup. Avoids accumulating redundant
# backups across dogfood iterations. The config itself stays canonical
# in place at v0.20-dogfood/config.toml; the .bak is transient.
rm /build/cache/tmp/v0.20-dogfood-config.toml.bak
```
(Alternative: a single-step selective delete that keeps the config and destroys everything else):
```bash
find /build/cache/tmp/v0.20-dogfood/ -mindepth 1 -not -name 'config.toml' \
-exec rm -rf {} +
```
The actual procedure defaults to the plain 5-step version for clarity (the executor's default).
- **(c) acceptance**:
- spec §6 row 3: `skipped_size_exceeded == 0` for non-code files across the 9 PDFs (= all 0, since the workspace contains no code files).
- spec §6 row 8: `kind != "Error"` for F1 + F2 (no chunk_id collision).
- spec §6 row 12: the mojibake.pdf ingest item has `block_count: 1`.
- spec §6 row 15: all 9 PDFs ingested, `errors = 2` (encrypted only, identical to the pre-existing dogfood baseline).
- **Fallback when Ollama is unavailable**: if the endpoint is unreachable, this sub-action may be partially skipped. The unit/integration-level evidence from the workspace test (E1) + clippy (E2) satisfies spec §6 rows 1, 2, 4-7, 9-11, 13-14, 16, and the skip of dogfood rows 3, 8, 12, 15 is recorded in one line (commit body + result file).
- **Acceptance**:
- errors = 2 (encrypted only) in the ingest report ndjson.
- For each of F1/F2/mojibake, the `kind` field on the item line is a success path (= `"new"` or `"unchanged"`, not `"Error"`).
- dogfood log path: `/build/cache/tmp/v0.20-dogfood-ingest.ndjson` (referenced in the commit body).
#### Sub-action E4 — commit check + final organize
- **Files affected**: the accumulated changes from all steps.
- **Action**:
- `git status` + `git log --oneline b4d9e60..HEAD`: expect the 4 commits from Steps 1-4 and 0 commits from Step 5 (verification only, no commit).
- If any work-in-progress files remain, reset them.
- Check the `Co-Authored-By:` line in the commit messages (CLAUDE.md gitea-pr workflow).
- **Acceptance**:
- `git log --oneline b4d9e60..HEAD | wc -l` = **4** (1 commit each for Steps 1-4).
- `git status` shows 0 untracked + modified files.
#### Sub-action E5 — PR #189 force-push
- **Files affected**: remote ref `gitea/feat/pdf-scanned-ocr`.
- **Action** (can be invoked directly through the gitea-ops skill):
```bash
git push gitea feat/pdf-scanned-ocr --force-with-lease
```
- `--force-with-lease`: force-pushes only when the remote HEAD matches the local fetch state (protects pushes from other collaborators; a cheap safety net in this single-user environment).
- Update the PR #189 body: add the Bug #2/#3/#4 fix summary + dogfood evidence (gitea API `PATCH /repos/altair823-org/kebab/pulls/189`).
- Per user memory `feedback_pr_workflow`, enter the `gitea-pr-review` skill's review loop (multi-round critic + verifier).
- **Acceptance**:
- The head SHA reported by the `gh`-equivalent (gitea-ops `gitea-pr-status 189`) = local `git rev-parse HEAD`.
- PR #189 commit count = commit count at the previous force-push + 4.
- The final state matches the sequencing summary table (§7: 5 rows = 4 commits + 1 verify-only step).
#### Commit (Step 5)
verification only: 0 git commits. The 4 commits from Steps 1-4 form the final tree.
---
## §4 Verifier checklist (spec §6 16-row 1:1 mapping)
Each row is a scriptable command. All of them can be run cumulatively through Step 5 E1-E3.
| # | Verifier | Bug | step | Command |
|---|---------|-----|------|------|
| 1 | walker bypasses size cap for PDF | #2 | A3 | `cargo test -p kebab-source-fs size_cap_skips_only_code_files -j 4` |
| 2 | walker still skips oversized code files | #2 | A3 | `cargo test -p kebab-source-fs ingest_report_counts_oversized_files_by_bytes -j 4` |
| 3 | 256KB+ PDF/markdown ingest default config | #2 | E3 | dogfood: ingest report from `$RELEASE_BIN ingest ...` shows `skipped_size_exceeded = 0` for non-code |
| 4 | chunker collision regression test | #3 | B5 | `cargo test -p kebab-chunk multi_chunk_page_with_aggressive_overlap_produces_unique_chunk_ids -j 4` |
| 5 | chunker determinism preserved | #3 | B5 | `cargo test -p kebab-chunk deterministic_chunk_ids_1000 -j 4` |
| 6 | chunker overlap clamp preserved | #3 | B5 | `cargo test -p kebab-chunk overlap_clamped_when_overlap_exceeds_target -j 4` |
| 7 | integration: multi-scanned PDF ingest (conditional, §4.5.1) | #3 | C2 | `cargo test -p kebab-app multi_scanned_pdf_ingest_no_chunk_id_collision -j 4` (skip + record on the Option A/B downgrade path) |
| 8 | dogfood: F1 + F2 force-reingest errors=0 | #3 | E3 | dogfood: `$RELEASE_BIN ingest --force-reingest ...` reports errors = 0 (excluding encrypted) |
| 9 | F4 fixture lopdf 1-page invariant | #4 | D3 | `cargo test -p kebab-parse-pdf mojibake_fixture_load_yields_one_page -j 4` |
| 10 | F4 fixture ToUnicode-absent invariant | #4 | D3 | `cargo test -p kebab-parse-pdf mojibake_fixture_has_no_tounicode_cmap -j 4` |
| 11 | F4 PdfTextExtractor 1-block invariant | #4 | D3 | `cargo test -p kebab-parse-pdf pdf_text_extractor_on_mojibake_yields_one_block -j 4` |
| 12 | dogfood: F4 ingest block_count=1 | #4 | E3 | dogfood: mojibake.pdf ingest item has `block_count: 1` |
| 13 | workspace clippy clean | all | E2 | `cargo clippy --workspace --all-targets -j 1 -- -D warnings` |
| 14 | workspace full test pass | all | E1 | `cargo test --workspace --no-fail-fast -j 1` |
| 15 | dogfood end-to-end 9 PDF | all | E3 | dogfood: all 9 PDFs ingested, errors = 2 (encrypted only) |
| 16 | chunker_version cascade final value | #3 | B3 | `grep -nE 'pdf-page-v[0-9.]+' crates/kebab-chunk/src/pdf_page_v1.rs` yields `"pdf-page-v1.1"` |
In the final step (E1-E3) the executor runs all 16 rows as scripts and records pass/fail/skip row by row in the result file.
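
As a quick cross-check of the table, the rows that still close when the Ollama endpoint is unavailable are simply the 16 rows minus the four dogfood rows. A small sketch (the constant and function names are hypothetical):

```rust
/// Rows whose evidence comes from the dogfood run (step E3).
const DOGFOOD_ROWS: [u8; 4] = [3, 8, 12, 15];

/// Rows that can still be verified from unit/integration evidence alone.
fn rows_without_ollama() -> Vec<u8> {
    (1..=16).filter(|row| !DOGFOOD_ROWS.contains(row)).collect()
}
```

This yields 12 rows, matching the "1, 2, 4-7, 9-11, 13-14, 16" list used for the E3 fallback.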
#### Workspace baseline expected test count delta (verifier round 1 M-1 closure)
Delta arithmetic for the expected `test result: ok. N passed` of `cargo test --workspace -j 1` (Step 5 E1), relative to the pre-fix baseline:
| Step | Sub-action | new test name | crate | type |
|---|---|---|---|---|
| 1 | A3 | `size_cap_skips_only_code_files` | kebab-source-fs | unit (in `mod tests`) |
| 2 | B5 | `multi_chunk_page_with_aggressive_overlap_produces_unique_chunk_ids` | kebab-chunk | unit (in `mod tests`) |
| 3 | C2 | `multi_scanned_pdf_ingest_no_chunk_id_collision` | kebab-app | integration (new test binary) |
| 4 | D3 | `mojibake_fixture_load_yields_one_page` | kebab-parse-pdf | unit-style integration |
| 4 | D3 | `mojibake_fixture_has_no_tounicode_cmap` | kebab-parse-pdf | unit-style integration |
| 4 | D3 | `pdf_text_extractor_on_mojibake_yields_one_block` | kebab-parse-pdf | unit-style integration |
- **Option A (full path, C2 active)**: total = **+6 unit/integration test cases + 1 new integration test binary**.
- **Option B (C2 conditional downgrade per §4.5.1)**: total = **+5 test cases + 0 new binary**.
Other: for B3, the assertion content of the existing `chunker_version_is_pdf_page_v1` test is unchanged (it references the VERSION_LABEL const) → test count delta 0. For D2, only the assertion result of the existing `vector_pdf_extract_byte_identical_to_baseline` test changes (fixture change → baseline change) → test count delta 0, with only the snapshot regen action added.
When comparing N in the E1 acceptance, the executor confirms it matches this delta arithmetic (so a regression is detected).
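
Because `cargo test --workspace` prints one `test result:` summary per test binary, N in the comparison above is a sum over all of those lines. The sketch below shows one way to total them and to compute the expected value from a pre-fix baseline. It assumes the usual libtest summary wording ("N passed; M failed; ..."); the function names are hypothetical.

```rust
/// Sum `passed` and `failed` over every "test result:" summary line.
fn sum_test_results(output: &str) -> (u64, u64) {
    let mut passed = 0;
    let mut failed = 0;
    for line in output
        .lines()
        .filter(|l| l.trim_start().starts_with("test result:"))
    {
        let tokens: Vec<&str> = line.split_whitespace().collect();
        // Look at adjacent token pairs such as ("12", "passed;").
        for pair in tokens.windows(2) {
            let count = pair[0].parse::<u64>().unwrap_or(0);
            match pair[1].trim_end_matches(|c: char| c == ';' || c == '.') {
                "passed" => passed += count,
                "failed" => failed += count,
                _ => {}
            }
        }
    }
    (passed, failed)
}

/// Expected total: baseline + 6 when C2 is active, + 5 when it is skipped.
fn expected_passed(baseline: u64, c2_active: bool) -> u64 {
    baseline + if c2_active { 6 } else { 5 }
}
```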
---
## §5 Risks (plan stage)
- **R-1 (MockOcrEngine sharing complexity)**: Option A in spec §4.5.1 (`tests/common/mock_ocr.rs` lift) requires migrating the ctor sites of the existing tests in `pdf_ocr_apply.rs:20-45` (10 instantiation sites per the critic round 1 L-2 probe). This is trivial if 2 backward-compat ctors (single + per_page) are introduced; on failure, downgrade to Option B (inline). A conditional skip of spec §6 row 7 is possible.
- **R-2 (chunker_version bump cascade scope)**: the impact of `pdf-page-v1.1` is that chunk_ids of multi-chunk PDF pages change. parser_version / embedding_version / prompt_template_version / index_version are unchanged; in the 5-version snapshot of `kebab-eval::eval_runs.config_snapshot_json`, only the chunker_version field takes a new value. The cascade rule invariant from parent design §9 is preserved, and re-running the eval baseline is recommended (the user-facing note under spec §7.1 Risk 1).
- **R-3 (F4 fixture binary churn)**: pikepdf's save output has a different PDF object ordering from reportlab+byte-edit → the SHA256 changes and there is git binary diff noise. The `text_extractor_regression.rs` baseline is also updated in the same commit to the actual output of the new fixture; both are handled together in Step 4 D2.
- **R-4 (dogfood Ollama dependency)**: the dogfood acceptance for spec §6 rows 3 + 8 + 12 + 15 calls the real qwen2.5vl:3b at `192.168.0.47:11434`. If the endpoint is unavailable, close partially on the unit/integration evidence (rows 1-2, 4-7, 9-11, 13-14, 16) and record the skip in the commit body / result file.
- **R-5 (pikepdf dependency install)**: `import pikepdf` in the Step 4 D1 mojibake.py requires a pip install into this machine's Python venv. No CI dependency arises (generation is a one-off, after which the fixture is committed).
- **R-6 (no conflict with concurrent parent plan work)**: Step 11 (final verify + PR open) of the parent plan (`2026-05-27-pdf-scanned-ocr-plan.md`, round 1c ACCEPT) is already closed by commit `b4d9e60`. The fix commits of this plan stack on top of that commit, so there are no branch ordering conflicts.
---
## §6 Open questions deferred to executor
- **OQ-1 (MockOcrEngine sharing path)**: choose between Option A (`tests/common/mock_ocr.rs` lift) and Option B (inline) from spec §4.5.1. This is the executor's first action in Step 3 C1: probe with `grep -rn "impl OcrEngine"`, then decide and record the decision in the result file. The plan's first preference is Option A.
- **OQ-2 (F4 baseline snapshot update tool)**: ✅ **CLOSED (round 1c, critic M-2 + verifier H-4)**. The actual pattern at `text_extractor_regression.rs:59-64` is a hand-rolled `unwrap_or_else { write baseline }` (the insta crate is not used). Regen procedure: delete the snapshot file `tests/snapshots/vector_pdf_canonical.json` and run cargo test twice (1st auto-regen, 2nd verify). The cargo-insta CLI is not needed. Details: §3 Step 4 D2 Action (c).
- **OQ-3 (pikepdf install command)**: decide the cache-dir + venv for `pip install`: a global `--user` pip, a venv dedicated to fixture generation, or a conda environment. The plan's default is `pip install --cache-dir /build/cache/pip pikepdf reportlab` (memory `feedback_disk_layout`).
- **OQ-4 (when the endpoint in the dogfood config.toml changes)**: if the `192.168.0.47:11434` Ollama endpoint of this dogfood environment changes, the executor overrides it with an alternative endpoint (e.g. `localhost:11434`) and records that in the result file.
- **OQ-5 (number of rounds in the PR #189 review loop)**: the gitea-pr + review loop from memory `feedback_pr_workflow` may proceed to round 2/2c depending on the round 1 critic + verifier results. This plan is round 0 (drafter); the outcome of the review rounds is out of scope for the plan.
---
## §7 Sequencing summary (logical commit boundaries)
| commit # | step range | logical scope | file count |
|---:|---|---|---:|
| 1 | Step 1 (A1+A2+A3) | `fix(source-fs): apply size limit only to code files; PDF/image/markdown bypass walker cap (Bug #2)` | 2 |
| 2 | Step 2 (B1+B2+B3+B4+B5) | `fix(chunk): chunk_id collision under aggressive overlap; bump pdf-page-v1 → pdf-page-v1.1 (Bug #3)` | 4 (pdf_page_v1.rs + HOTFIXES.md + pdf_pipeline.rs:168 + :368, verifier H-1) |
| 3 | Step 3 (C1+C2) | `test(app): multi-scanned PDF chunk_id collision-free integration test (Bug #3 regression)` | **4 (Option A: existing common/mod.rs append + new common/mock_ocr.rs + modify pdf_ocr_apply.rs + new multi_scanned_pdf_ingest_no_chunk_id_collision.rs, verifier H-2)** / 1 (Option B) |
| 4 | Step 4 (D1+D2+D3) | `fix(parse-pdf): F4 mojibake.pdf via pikepdf surgery; preserve 1-page invariant (Bug #4)` | **5 (mojibake.py + fixtures/mojibake.pdf + snapshots/vector_pdf_canonical.json + text_extractor_regression.rs (D3 append) + src/text_quality.rs:96 consumer verify, verifier H-4 + H-3 + NIT-2)** |
| 5 | Step 5 (E1-E5) | verification only, 0 git commits; final state = PR #189 force-push on top of commits 1-4 | 0 |
Total: 4 commits + 1 verify-only step. After the force-push, the PR #189 head = local HEAD.
---
## §8 Round 1c rewrite changelog (drafter trace)
Applies the combined 21 findings from round 1 critic + verifier (critic 7 + verifier 14). For details, see the §1 traceability matrix in the result file (`.omc/reviews/2026-05-27-v0.20-bugfix-plan-drafter-r1c-result.md`). This §8 summarizes the substantive changes to the plan body.
### Critic r1 (7 finding)
| ID | Severity | Action | Plan section |
|---|---|---|---|
| critic H-1 | HIGH | E3 dogfood config: 5-step back up, clean, restore procedure (reflects the reality that no external backup file exists) | §3 Step 5 E3 (b) |
| critic M-1 | MEDIUM | line 15 "17 sub-action" → "18 sub-action" | §0 prelude line |
| critic M-2 | MEDIUM | D2 snapshot baseline update mechanic spelled out (hand-rolled `unwrap_or_else` pattern, OQ-2 closure) | §3 Step 4 D2 + §6 OQ-2 |
| critic L-1 | LOW | B1 line range "200-289" → "200-204 (doc) + 205-289 (body)" made explicit | §3 Step 2 B1 |
| critic L-2 | LOW | MockOcrEngine ctor count "9 test (existing)" → "10 instantiation site" (actual probe) | §3 Step 3 C1 + C2 |
| critic L-3 | LOW | Add a one-line DejaVuSans.ttf existence probe to the D1 pre-action | §3 Step 4 D1 |
| critic NIT-1 | NIT | "5 logical commit" → "4 commit + 1 verify-only step (= 5 step total, 4 commit boundary)" | §0 prelude line |
### Verifier r1 (14 finding)
| ID | Severity | Action | Plan section |
|---|---|---|---|
| verifier H-1 | HIGH | B3 sub-action: explicitly update the literals at `pdf_pipeline.rs:168` (hard assertion) + `:368` (error message), and tighten the acceptance grep regex (`grep -v 'pdf-page-v1\.1'`) | §3 Step 2 B3 |
| verifier H-2 | HIGH | Step 3 Option A reflects that `common/mod.rs` is existing infrastructure: append one line `pub mod mock_ocr;` + new `common/mock_ocr.rs` + `pdf_ocr_apply.rs` lift + new integration test = 4 file edits | §3 Step 3 C1 + C2 + §7 commit 3 file count |
| verifier H-3 | HIGH | D3 file path corrected: `text_extractor.rs` does not exist → append to `text_extractor_regression.rs` (locality with D2 snapshot regen). `include_bytes!` path also changed from `../tests/fixtures/...` to `fixtures/...` directly, and CWD-relative `std::fs::read` avoided | §3 Step 4 D3 |
| verifier H-4 | HIGH | D2 snapshot regen mechanic: delete the snapshot file `tests/snapshots/vector_pdf_canonical.json` + run cargo test twice (1st auto-regen, 2nd verify) + enumerate the second consumer `src/text_quality.rs:96` | §3 Step 4 D2 + §7 commit 4 file count |
| verifier M-1 | MEDIUM | Add an expected workspace test count delta table after the §4 verifier checklist (+6 unit + 1 integration, Option A / +5 + 0, Option B) | §4 (sub-section) |
| verifier M-2 | MEDIUM | B2 acceptance phrasing updated: "Step 2 commit time" stated explicitly + grep timing spelled out per sub-action | §3 Step 2 B2 acceptance |
| verifier M-3 | MEDIUM | Spell out the "mechanical migration of the existing 10 ctor sites" command for C2 Option A | §3 Step 3 C1 (b) |
| verifier L-1 | LOW | pdf_page_v1.rs line range 200-289 → 205-289 (same edit pass as critic L-1) | §3 Step 2 B1 |
| verifier L-2 | LOW | caller line range 155-185 → 149-186 | §3 Step 2 B2 |
| verifier L-3 | LOW | Flesh out the arithmetic for target=1500 bytes + overlap=240 bytes in the B5 test scenario comment | §3 Step 2 B5 |
| verifier L-4 | LOW | Confirmed the `ingest_pdf_ocr_smoke.rs` grep is safe within the B3 scope (no separate action; the finding itself = no action) | (verified safe) |
| verifier NIT-1 | NIT | Tighten the `df -h` unit handling in E1 → force GB units with `df -BG --output=avail` | §3 Step 5 E1 (a) |
| verifier NIT-2 | NIT | Step 4 commit scope `test-fixture` → `parse-pdf` (crate name) | §3 Step 4 commit |
| verifier NIT-3 | NIT | Single definition of the dogfood config canonical path (in §0), referenced by every acceptance command | §0 pre-flight + §3 Step 5 E3 |
### Summary
- frontmatter `status` `draft (round 0)` → `draft (round 1c)`.
- frontmatter `review_history`: add 3 entries (round 1 critic + verifier + round 1c rewrite).
- plan body line 15: correct 2 tokens in the prelude statement (sub-action count + commit boundary wording).
- §0 pre-flight: add 1 bullet on the assumed dogfood KB layout.
- §3: flesh out the sub-action bodies of the 5 steps (file path / acceptance grep / mechanic / migration cost).
- §4 verifier checklist: add the expected test count delta sub-section.
- §6: mark OQ-2 as closed (✅ CLOSED).
- §7 sequencing summary: update file counts (commit 2: 2→4, commit 3: 3-4→4, commit 4: 3-4→5).
- §8 round 1c rewrite changelog (this section): populated.
Total plan body line change = ~+250 net added (round 0 698 lines → round 1c ~950 lines).