altair823
436fd015a2
fix(chunk): chunk_id collision under aggressive overlap; bump pdf-page-v1 → pdf-page-v1.1 (Bug #3)
Bug #3 (Critical) from the v0.20.0 sub-item 1 dogfood report. Ingesting
scanned_page2.pdf (1580 chars of OCR text) violated the `chunks.chunk_id`
PRIMARY KEY: `per_chunk_hash = #c{char_start}` used the post-overlap
`actual_start`, and the overlap walk floor collapsed to `prev_min`, so
segments 1 and 2 both got `#c0`.
- `crates/kebab-chunk/src/pdf_page_v1.rs`: `chunk_page` returns 4-tuple
(segment_start, actual_start, chunk_end, slice); caller `per_chunk_hash`
suffix uses `segment_start` (pre-overlap boundary, strictly increasing)
instead of `char_start` (post-overlap, may collapse to prev_min).
- VERSION_LABEL `"pdf-page-v1"` → `"pdf-page-v1.1"` (design §9 cascade,
explicit user-facing audit trail). The hardcoded literals at
`crates/kebab-app/tests/pdf_pipeline.rs:168, 368` are also updated to v1.1.
- module docs (`pdf_page_v1.rs:47-60`): update the `#c{char_start}`
reference in the workaround description to `#c{segment_start}`, state the
segment_start invariant explicitly, and cross-reference HOTFIXES.md.
- `pdf_page_v1.rs::tests`: add regression pin
`multi_chunk_page_with_aggressive_overlap_produces_unique_chunk_ids`
(10 chars of "가" + ". " + 500 chars of "나", which yields multiple chunks
and triggers the overlap walk collapse).
- `tasks/HOTFIXES.md`: 2026-05-27 entry (symptom F2 1580 char OCR,
intra-doc collision root cause, second-iteration patch rationale).
spec: docs/superpowers/specs/2026-05-27-v0.20-sub1-bugfix-spec.md (§4)
plan: docs/superpowers/plans/2026-05-27-v0.20-sub1-bugfix-plan.md (Step 2)
prior: d9acda5 (Step 1 Bug #2 walker fix)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 13:32:09 +00:00
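
The collision described in the commit message can be illustrated with a small standalone sketch. This is not the code in `pdf_page_v1.rs`: `Chunk`, `plan_chunks`, `suffixes`, and `has_duplicates` are hypothetical names, and the overlap walk is reduced to "step back by `overlap` chars, but never below the previous chunk's start", which is an assumption about how the `prev_min` floor behaves. Only the `#c{...}` suffix shape and the `segment_start` / `actual_start` distinction come from the message above.

```rust
use std::collections::HashSet;

/// Hypothetical, simplified chunk record; offsets are in chars.
struct Chunk {
    segment_start: usize, // pre-overlap boundary, strictly increasing
    actual_start: usize,  // post-overlap start, may collapse onto the floor
}

/// Assumed overlap walk: step back `overlap` chars from each segment
/// boundary, floored at the previous chunk's start (`prev_min`).
fn plan_chunks(segment_starts: &[usize], overlap: usize) -> Vec<Chunk> {
    let mut prev_min = 0;
    let mut out = Vec::new();
    for &segment_start in segment_starts {
        let actual_start = segment_start.saturating_sub(overlap).max(prev_min);
        out.push(Chunk { segment_start, actual_start });
        prev_min = actual_start;
    }
    out
}

/// Build the `#c{...}` suffix from either offset.
fn suffixes(chunks: &[Chunk], use_segment_start: bool) -> Vec<String> {
    chunks
        .iter()
        .map(|c| {
            let key = if use_segment_start { c.segment_start } else { c.actual_start };
            format!("#c{key}")
        })
        .collect()
}

/// True if any suffix repeats, i.e. a primary-key collision would occur.
fn has_duplicates(ids: &[String]) -> bool {
    let mut seen = HashSet::new();
    ids.iter().any(|id| !seen.insert(id))
}

fn main() {
    // Two segments: a short one at 0, the next at 12 (10 chars + ". ").
    // An overlap larger than the first segment walks the second chunk
    // all the way back to the floor.
    let chunks = plan_chunks(&[0, 12], 500);

    let old = suffixes(&chunks, false); // keyed on actual_start
    let new = suffixes(&chunks, true); // keyed on segment_start

    println!("actual_start  suffixes: {:?} collide={}", old, has_duplicates(&old));
    println!("segment_start suffixes: {:?} collide={}", new, has_duplicates(&new));
}
```

Under these assumptions the `actual_start` suffixes are both `#c0`, while the `segment_start` suffixes stay distinct (`#c0`, `#c12`) because the pre-overlap boundaries are strictly increasing regardless of how far the overlap walk reaches back.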