altair823
241ded59df
test(app): multi-scanned PDF chunk_id collision-free integration test (Bug #3 regression)
v0.20.0 sub-item 1 bugfix Step 3 (Group C) — integration-level regression
for Bug #3 (intra-doc chunk_id collision under aggressive overlap).
- `crates/kebab-app/tests/common/mod.rs`: `pub mod mock_ocr;` appended (1 line).
- `crates/kebab-app/tests/common/mock_ocr.rs` (new): MockOcrEngine lift +
`single` / `per_page` ctor (backward-compat single + per-page cursor).
- `crates/kebab-app/tests/pdf_ocr_apply.rs`: inline MockOcrEngine removed +
`mod common; use common::mock_ocr::MockOcrEngine;` import. 10 ctor call
site migration (`MockOcrEngine { .. }` → `MockOcrEngine::single(...)`).
- `crates/kebab-app/tests/multi_scanned_pdf_ingest_no_chunk_id_collision.rs`
(new): F1 + F2 scanned PDF + Bug #3 trigger shape (10 char "가" + ". " +
500 char "나") via mock OCR. assertion: chunk_id global uniqueness (HashSet
dedup) across F1 + F2; F2 trigger text produces ≥2 chunks (collision shape).
- C1 decision: Option A (share via tests/common/mock_ocr.rs). Facade mock
injection unavailable (OllamaVisionOcr hardcoded) — helper-level chain test
(apply_ocr_to_pdf_pages → PdfPageV1Chunker) adds value beyond unit B5.
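The uniqueness assertion and the trigger text above can be sketched roughly as
follows. This is a minimal illustration, not the test code itself:
`duplicate_chunk_ids` and `trigger_text` are hypothetical helper names, and
chunk IDs are assumed to be comparable as plain strings.

```rust
use std::collections::HashSet;

/// Hypothetical helper: returns each chunk ID that occurs more than once,
/// in the order its first repeat is seen. An empty result means the IDs
/// are globally unique.
fn duplicate_chunk_ids<'a>(ids: impl IntoIterator<Item = &'a str>) -> Vec<&'a str> {
    let mut seen = HashSet::new();
    let mut dups = Vec::new();
    for id in ids {
        // `HashSet::insert` returns false when the value was already present.
        if !seen.insert(id) && !dups.contains(&id) {
            dups.push(id);
        }
    }
    dups
}

/// Hypothetical helper: builds the Bug #3 trigger shape described above,
/// i.e. 10 x "가", then ". ", then 500 x "나".
fn trigger_text() -> String {
    format!("{}. {}", "가".repeat(10), "나".repeat(500))
}
```

In a test, the chunk IDs collected from F1 and F2 would be chained into one
iterator and the result asserted empty, which also names the colliding IDs on
failure instead of only reporting a length mismatch.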
spec: docs/superpowers/specs/2026-05-27-v0.20-sub1-bugfix-spec.md (§4.5)
plan: docs/superpowers/plans/2026-05-27-v0.20-sub1-bugfix-plan.md (Step 3)
prior: 436fd01 (Step 2 Bug #3 chunk_id fix)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-27 13:45:38 +00:00