# v0.20.0: scanned PDF OCR via Ollama vision LLM

The core change in v0.20.0 is OCR ingest for scanned PDFs that have no embedded text (book scans, receipts, camera-captured pages). In the PoC comparison of five engines (Tesseract / EasyOCR / PaddleOCR / gemma4:e4b / qwen2.5vl:3b), qwen2.5vl:3b reached alnum 94.79% (page1) / 81.56% (batchim, i.e. Hangul final consonants) and beat every other engine, so it is the default vision OCR model in this release.

## 1. Enabling OCR (opt-in)

Enable OCR with `enabled = true` in the `[pdf.ocr]` config section, or with the `KEBAB_PDF_OCR_ENABLED=true` env variable. The default is off: OCR costs 45-100s per page (qwen2.5vl:3b on CPU, remote Ollama), which is unsuitable for KBs other than book archives, where OCR is not needed.

```toml
[pdf.ocr]
enabled = true
model = "qwen2.5vl:3b"  # see README for the other defaults
```

Pulling qwen2.5vl:3b with Ollama:

```bash
ollama pull qwen2.5vl:3b   # 3GB Ollama image
```

## 2. Force-reingest of scanned PDFs indexed by v0.19

A KB whose scanned PDFs were ingested with the v0.19 binary does not enter the OCR path automatically, because parser_version "pdf-text-v1" is preserved (a decision to avoid triggering the CLAUDE.md §Versioning cascade, H-4). Upgrading to the v0.20 binary and setting `pdf.ocr.enabled = true` is therefore not enough on its own: the Unchanged path of try_skip_unchanged skips the OCR run. To reprocess explicitly:

```bash
kebab ingest --root /path/to/kb --force
```

## 3. DCTDecode-only v1 scope (handling FlateDecode / CCITTFax pages)

PDF page image extraction in v0.20.0 covers only lopdf image XObjects whose /Filter == DCTDecode (JPEG passthrough). Other encodings (FlateDecode raw pixels, CCITTFaxDecode bilevel, JPXDecode JPEG2000) emit a warning event and the page is skipped. When some pages of a scanned PDF are encoded with FlateDecode or CCITTFax:

```bash
qpdf --object-streams=disable --recompress-flate input.pdf normalized.pdf
```

The v1 scope follows the single-binary principle (no `image` crate introduced). Multi-filter support will be considered in v1.1+ or in a separate sub-item.

## 4. Family asymmetry (image OCR gemma4:e4b vs PDF OCR qwen2.5vl:3b)

The default for image OCR (P6) stays gemma4:e4b (no change). Only PDF OCR (v0.20) uses qwen2.5vl:3b. Users can unify the two with `[image.ocr] model = "qwen2.5vl:3b"`, but the defaults remain asymmetric across families.

## Dogfood + test results

- workspace test: 178 result lines, 0 failures.
- workspace clippy (-D warnings): exit 0.
- alnum e2e (real Ollama, manual invoke):
  - F1 (Korean page1): 94.79% (≥ 0.85 threshold).
  - F2 (batchim-intensive): 81.56% (≥ 0.70 threshold).
- integration smoke + vector PDF regression: pass.

## Changed surface

- new config: [pdf.ocr] (11 fields) + 11 env overrides KEBAB_PDF_OCR_*.
- new wire: IngestEvent::PdfOcrStarted/Finished (additive minor).
- new wire: IngestItem.pdf_ocr_pages/ms_total (additive minor).
- new CLI line: "📷 OCR page N..." / "✓ OCR page N (chars chars, msms via ollama-vision)".
- new module: kebab-parse-pdf::{page_image, text_quality} + kebab-app::pdf_ocr_apply.
- dep: workspace lopdf = "0.32" consolidated.
- fixture: 5 PDFs (F1/F2/F4/F6/F7) under crates/kebab-parse-pdf/tests/fixtures/.

## Unchanged surface (invariant)

- Extractor::extract trait body byte-identical (PR #187).
- PdfTextExtractor body unchanged; OCR is separated out as a post-extract enrichment pattern.
- parser_version "pdf-text-v1" preserved.
- chunker_version "pdf-page-v1" preserved.
- No change to the production dep graph of workspace.dependencies (-e normal baseline preserved).

## 11-commit history of the sub-item

- 9d7faab Step 1: foundation + cargo tree baselines
- aeeff36 Step 2: lopdf /Filter probe + 5 fixture commit (F1/F2/F4/F6/F7)
- fb3952d Step 2 fix: F7 conversion engine record correction
- c2cd3a7 Step 3: page_image + text_quality modules (10 test)
- 8d81bc1 Step 3 fix: clippy pedantic in page_image
- 9f003ef Step 4: pdf_ocr_apply helper (10 test, F7 split + cancel)
- fd918a6 Step 5: [pdf.ocr] config section + PdfOcrOpts doc
- 4672cba Step 5 fix: clippy::bool_assert_comparison in pdf_ocr tests
- b9ee09f Step 6: wire PDF OCR enrichment + cancel propagation
- 4c5ccd5 Step 7: wire schema additive — IngestEvent + IngestItem + skipped
- c9e0594 Step 8: CLI printer activation + ingest_progress test + spec literal
- 4819768 Step 9: integration smoke + vector regression + alnum e2e
- 1d4e301 Step 9 follow-up: Cargo.lock for dev-dep additions
- 90726ab Step 10: docs sync (README + HANDOFF + ARCHITECTURE + SMOKE)

## § Acceptance §9 verifier evidence

All 15 scriptable verifier rows of K5 are green (for the manual real-Ollama rows, the result is reported instead):

- Row #4 (vector PDF byte-identical): pass.
- Row #5 (Extractor::extract trait byte-identical): 0 line diff.
- Row #6 (wire schema additive): jq + diff exit 0.
- Row #7-#8 (clippy / workspace test): exit 0.
- Row #9-#10 (dep graph baseline -e normal): empty diff.
- Row #11 (docs sync): grep evidence.
- Row #12 (version bump): "0.20.0" + Cargo.lock cascade ≥ 22.
- Row #14 (PR #187 invariant): extract_for(&asset.media_type) ≥ 1.
- Row #15 (DCTDecode-only v1, F6/F7 skip): test green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
| title | created | status | target_version | related_specs | related_plans | related_poc | prior_handoff |
|---|---|---|---|---|---|---|---|
| v0.20.0 sub-item 1: PDF scanned OCR Phase C (executor) handoff | 2026-05-27 | ready-for-executor | 0.20.0 | | | | |
# v0.20.0 sub-item 1 executor handoff

This document is a self-contained context for handing the start of Phase C (executor) of v0.20.0 sub-item 1 (PDF scanned OCR) over to a new Claude session. It captures the state at the point where Phase A (spec) and Phase B (plan) have both been ACCEPTed.

## 1. Context summary

### 1.1 Phase progress of v0.20 sub-item 1

Work completed in this session:

| Phase | Step | Model | Latency | Verdict |
|---|---|---|---|---|
| PoC | Korean OCR engine comparison (Tesseract / EasyOCR / PaddleOCR / gemma4 / qwen2.5vl) | direct main | ~1.5 hours | qwen2.5vl:3b adopted |
| Phase A | spec drafter round 0 (opus) | opus | ~10 min | 1127 lines draft |
| Phase A | spec critic round 1 thorough | opus | ~8 min | NEEDS_DISCUSSION HIGH 5 |
| Phase A | spec drafter round 1c rewrite | opus | ~14 min | resolved |
| Phase A | spec critic round 2 closure | opus | ~5 min | ACCEPT |
| Phase B | plan drafter round 0 | opus | ~17 min | 890 lines draft |
| Phase B | plan critic round 1 thorough | opus | ~8 min | NEEDS_DISCUSSION MEDIUM 4 |
| Phase B | plan verifier round 1 | opus | ~8 min | NEEDS_DISCUSSION HIGH 5 + MEDIUM 10 |
| Phase B | plan drafter round 1c rewrite | opus | ~15 min | resolved 19 finding |
| Phase B | plan round 2 closure (critic + verifier combined) | opus | ~8 min | ACCEPT |
| Phase C | executor | opus | pending | ← new session starts here |

### 1.2 The five user-resolved decisions for this sub-item

Decisions the user fixed during the PoC (not open to change):

- PDF page rendering = `pdfium-render` deferred; proceed with lopdf image XObject streams (single-binary principle). During the PoC architecture review the user stated "all functionality in a single Rust binary", so native deps such as Tesseract / pdfium are avoided.
- OCR engine = qwen2.5vl:3b via the Ollama HTTP API. PoC results: page1 alnum 94.79% / batchim alnum 81.56%, ahead of Tesseract (87% / 67%), EasyOCR (90% / 74%), and gemma4:e4b (77% / 27%).
- Default behavior = text-detect first + vision LLM fallback. (The user first chose always-on, then reversed the decision after latency was measured.) It can be overridden with the `pdf.ocr.always_on = true` config.
- Korean OCR mitigation = the vision LLM alone is sufficient (this avoids Tesseract's batchim weakness and preprocessing tuning).
- Testing + tuning, stated explicitly = synthetic fixtures (F1/F2 general + F3 vector + F4 mojibake + F6 FlateDecode + F7 CCITTFax) + a real-world newspaper PoC (Metro Korea, archive.org) + alnum metric measurement (`strsim` Levenshtein + `#[ignore]` ocr_e2e test).
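
The alnum metric in the last decision can be illustrated with a small sketch. The exact definition used by the `ocr_e2e` test is not given in this handoff, so the formula below (filter both strings to alphanumeric characters, then `1 - distance / max(len)`) is an assumption, and the function names are hypothetical. The project itself uses the `strsim` crate for Levenshtein; a std-only implementation is substituted here so the block stays self-contained.

```rust
/// Keep only alphanumeric characters. Hangul syllables count as alphabetic,
/// so Korean text survives the filter while punctuation and spaces do not.
fn alnum_only(s: &str) -> Vec<char> {
    s.chars().filter(|c| c.is_alphanumeric()).collect()
}

/// Two-row Levenshtein distance over char slices
/// (std-only stand-in for the `strsim` crate the project uses).
fn levenshtein(a: &[char], b: &[char]) -> usize {
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    let mut cur = vec![0usize; b.len() + 1];
    for (i, ca) in a.iter().enumerate() {
        cur[0] = i + 1;
        for (j, cb) in b.iter().enumerate() {
            let cost = if ca == cb { 0 } else { 1 };
            cur[j + 1] = (prev[j] + cost) // substitution
                .min(prev[j + 1] + 1)     // deletion
                .min(cur[j] + 1);         // insertion
        }
        std::mem::swap(&mut prev, &mut cur);
    }
    prev[b.len()]
}

/// Hypothetical alnum accuracy in [0, 1]: 1 - distance / max(len).
/// This definition is an assumption, not the project's confirmed formula.
fn alnum_accuracy(expected: &str, actual: &str) -> f64 {
    let e = alnum_only(expected);
    let a = alnum_only(actual);
    let denom = e.len().max(a.len());
    if denom == 0 {
        return 1.0; // two empty strings are treated as a perfect match
    }
    1.0 - levenshtein(&e, &a) as f64 / denom as f64
}
```

Under this definition, a threshold such as the 0.85 used for F1 would be checked with `alnum_accuracy(ground_truth, ocr_output) >= 0.85`.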

### 1.3 Key PoC findings (drivers of the architecture decision)

See docs/superpowers/poc/2026-05-27-pdf-ocr-engine-comparison.md. In short:

- Tesseract = violates the single-binary principle (external deps: libtesseract + libleptonica + tessdata). Batchim alnum 67%, a clear weakness.
- EasyOCR = PyTorch sidecar (~850MB), even less acceptable.
- PaddleOCR = PaddlePaddle 3.0+ PIR/oneDNN runtime bug. No workaround in this environment.
- gemma4:e4b vision = paraphrasing / hallucination. 27% on the batchim fixture, 40 percentage points below Tesseract.
- qwen2.5vl:3b = biased toward faithful transcription in Korean vision OCR. The real-world newspaper test also came out nearly perfect on body text. Batchim alnum 81.56%, page1 94.79%.

Architecture alignment:

- kebab core principle (CLAUDE.md): "single binary `kebab` (LLM excluded)".
- user memory `project_llm_default` = "unify the family with OCR/caption". This aligns naturally because Ollama is the single inference source.
- Downside = latency (qwen2.5vl 45-105s/page). Mitigation = text-detect first + vision fallback, which lowers the average cost for an 800-page book.
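
The text-detect-first mitigation can be sketched as a per-page decision. This is a minimal illustration under stated assumptions: the real `compute_valid_char_ratio` in `text_quality` is not shown in this handoff, so the ratio definition, the `min_ratio` threshold parameter, and the `PageAction` / `decide` names below are all hypothetical. Only the `enabled` and `always_on` switches correspond to config keys named in this document.

```rust
/// Hypothetical stand-in for `compute_valid_char_ratio`: the share of
/// non-whitespace characters that are neither U+FFFD, control characters,
/// nor private-use code points (typical mojibake symptoms).
fn valid_char_ratio(text: &str) -> f64 {
    let mut total = 0usize;
    let mut valid = 0usize;
    for c in text.chars().filter(|c| !c.is_whitespace()) {
        total += 1;
        let private_use = ('\u{E000}'..='\u{F8FF}').contains(&c);
        if c != '\u{FFFD}' && !c.is_control() && !private_use {
            valid += 1;
        }
    }
    // A page with no embedded text at all scores 0.0, so it falls back to OCR.
    if total == 0 { 0.0 } else { valid as f64 / total as f64 }
}

#[derive(Debug, PartialEq)]
enum PageAction {
    KeepEmbeddedText,
    RunVisionOcr,
}

/// Decide per page: OCR runs only when enabled, and then either always
/// (`always_on`) or when the embedded text looks unusable.
fn decide(text: &str, enabled: bool, always_on: bool, min_ratio: f64) -> PageAction {
    if !enabled {
        return PageAction::KeepEmbeddedText;
    }
    if always_on || valid_char_ratio(text) < min_ratio {
        PageAction::RunVisionOcr
    } else {
        PageAction::KeepEmbeddedText
    }
}
```

With a decision of this shape, pages that already carry clean embedded text never pay the 45-105s vision cost, which is where the lower average cost for a long book would come from.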

## 2. Current branch + uncommitted state

### 2.1 branch

`feat/pdf-scanned-ocr` (based on main HEAD = bcd1e37)

main HEAD bcd1e37 = `chore(repo): .omc/ ignore + AGENTS·GEMINI symlinks + release notes 작성 가이드 강화` (the last part reads "strengthen the release notes writing guide"; this is the first commit of this session).

### 2.2 uncommitted files (all new)

- `docs/superpowers/poc/2026-05-27-pdf-ocr-engine-comparison.md` (PoC baseline)
- `docs/superpowers/specs/2026-05-27-pdf-scanned-ocr-spec.md` (Phase A final, 1719 lines, ACCEPT)
- `docs/superpowers/plans/2026-05-27-pdf-scanned-ocr-plan.md` (Phase B final, ~1100 lines, ACCEPT)
- `docs/superpowers/handoffs/2026-05-27-v0.20-sub1-executor-handoff.md` (this document)

The brief + result files under `.omc/reviews/` (pdf-ocr-spec-critic-r1-result.md, pdf-ocr-plan-round-2-closure-result.md, and so on) are ignored automatically by the `.omc/` pattern in .gitignore.

PoC scratch sandbox = `/build/cache/pdf-ocr-poc/` (root disk protection, per the user's machine policy). Nothing in it is to be committed.

### 2.3 commit strategy (plan §7 + user memory)

- The 11 logical commit (per-step) pattern of plan §7.
- user memory `feedback_pr_workflow` (gitea-pr + review loop): a single PR after all steps are complete, the same pattern as sub-items 1/2/3 of this session.
- K6 = the last commit of Step 11: version bump (`workspace.version 0.19.0 → 0.20.0`) + final verify + PR open.

## 3. Starting instructions for Phase C (executor)

### 3.1 The three key reference files

Read order for the new session:

- This handoff (`docs/superpowers/handoffs/2026-05-27-v0.20-sub1-executor-handoff.md`): the overall context.
- plan (`docs/superpowers/plans/2026-05-27-pdf-scanned-ocr-plan.md`): 11 steps / 34 sub-actions / 4-component (file path + RED test + GREEN impl + acceptance command) detail.
- spec (`docs/superpowers/specs/2026-05-27-pdf-scanned-ocr-spec.md`): the ground truth for the change decisions (frozen contract).

Additional references:

- PoC (`docs/superpowers/poc/2026-05-27-pdf-ocr-engine-comparison.md`): quality / latency baseline + architecture rationale.
- prior handoff (`docs/superpowers/handoffs/2026-05-26-v0.20-image-pdf-normalize-handoff.md`): overall v0.20 context + the merged results of sub-items 2/3.
- CLAUDE.md + `docs/ARCHITECTURE.md`: workspace rules.

### 3.2 The executor's 11-step pattern

The step group structure is in plan §2 (A-K, 34 sub-actions). The 4-component breakdown of each step (file path + RED test + GREEN impl + acceptance + commit msg draft) is in plan §3.

Recommended order (plan §1 ordering invariant):

- Step 1 (Group A): Foundation + dep + baseline. spec L-1 cosmetic fix + Cargo.toml dep + capture of `.omc/state/pdf-ocr-*-deps.baseline.txt`.
- Step 2 (Group B): lopdf probe + fixture commit. Commit all of F1/F2/F4/F6/F7 (Phase B verifier H-4 resolution: every fixture commit lands in Step 2). From that commit on, the Step 3-4 tests can go GREEN.
- Step 3 (Group C): page_image + text_quality. `extract_dctdecode_page_image` + `compute_valid_char_ratio`. RED→GREEN cycle.
- Step 4 (Group D): pdf_ocr_apply helper. `apply_ocr_to_pdf_pages(&mut canonical, &dyn OcrEngine, &bytes, &opts)` in `kebab-app::pdf_ocr_apply`. 9 integration tests.
- Step 5 (Group F): Config schema. `pdf.ocr.*` (mirrors the image.ocr pattern).
- Step 6 (Group E): Ingest wiring. Eager init + signature update + post-extract enrichment call + E4 cancel handle propagation (resolution of a round 1 critic finding).
- Step 7 (Group G): Wire schema additive. `IngestEvent::PdfOcrStarted/PdfOcrFinished` + `IngestItem.pdf_ocr_pages/pdf_ocr_ms_total`. Update the JSON Schema.
- Step 8 (Group H): In-tree consumer. CLI printer + snapshot regenerate (cargo insta).
- Step 9 (Group I): Integration smoke + regression + ocr_e2e. `ingest_pdf_ocr_smoke.rs` (3 steps: ingest + search + cancel) + `text_extractor_regression.rs` (vector PDF byte-identical) + `ocr_e2e.rs` (`#[ignore]` alnum accuracy test).
- Step 10 (Group J): Docs sync. README + HANDOFF + ARCHITECTURE + SMOKE + RELEASE_NOTES (the J0 pre-flight decides the path).
- Step 11 (Group K): Version bump + final verify + PR. K1 (Cargo.toml 0.19→0.20) + K2 (release build, `$RELEASE_BIN` alias) + K3 (workspace test, -j 1) + K4 (clippy) + K5 (§ Acceptance §9 15-row mapping) + K6 (commit + PR via gitea-pr).
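
For the config work in Step 5, the relationship between a `pdf.ocr.*` config value and its `KEBAB_PDF_OCR_*` env override can be sketched as below. This is only an illustration: `resolve_enabled` is a hypothetical name, and the accepted spellings (`true` / `false`, case-insensitive) and the fallback on unparseable values are assumptions rather than the behavior of `kebab-config`.

```rust
/// Hypothetical resolver: an env value, when present and parseable,
/// overrides the value read from the config file.
fn resolve_enabled(config_value: bool, env_value: Option<&str>) -> bool {
    match env_value.map(|v| v.trim().to_ascii_lowercase()) {
        Some(v) if v == "true" => true,
        Some(v) if v == "false" => false,
        // Unset or unparseable: fall back to the config file value (assumption).
        _ => config_value,
    }
}
```

A caller would feed it the process environment, for example `resolve_enabled(cfg_enabled, std::env::var("KEBAB_PDF_OCR_ENABLED").ok().as_deref())`, where `cfg_enabled` is the value parsed from `[pdf.ocr]`.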

### 3.3 Key executor invariants (plan §1 + §9)

- Extractor trait byte-identical: no changes to `crates/kebab-core/src/traits.rs` (acceptance row #5).
- PR #187 polymorphic dispatch preserved: keep `app.extract_for(&asset.media_type, &ctx, &bytes)` (line 1778). Step 6 E3 calls the post-extract enrichment right after it (acceptance row #14).
- single-binary principle: no `image` crate / `pdfium-render` / `libtesseract`. Only lopdf + base64 + reqwest (acceptance row #9).
- vector PDF byte-identical: the F3 fixture result is byte-identical before and after the plan's changes are applied (acceptance row #4).
- OCR stays inactive for text PDFs: `pdf.ocr.enabled = false` default + `parser_version "pdf-text-v1"` kept (the try_skip_unchanged path of v0.19 dogfood KBs is unchanged).
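
The single-binary invariant is what limits v1 to JPEG passthrough: without an image decoding crate, only streams that are already JPEG can be forwarded to the vision model. A minimal sketch of that dispatch follows. It deliberately avoids the lopdf API and takes the filter name and stream bytes as plain arguments; `PageImage` and `classify_image_stream` are hypothetical names, and the SOI-marker sanity check is an added assumption, not something this handoff specifies.

```rust
#[derive(Debug, PartialEq)]
enum PageImage<'a> {
    /// JPEG bytes that can be handed to the OCR engine unchanged.
    Jpeg(&'a [u8]),
    /// The page is skipped; `reason` would feed the warning event.
    Skipped { reason: String },
}

/// Classify one image XObject stream by its /Filter name.
/// Only DCTDecode (JPEG) is passed through; everything else is skipped.
fn classify_image_stream<'a>(filter: &str, data: &'a [u8]) -> PageImage<'a> {
    match filter {
        // JPEG files start with the SOI marker 0xFF 0xD8.
        "DCTDecode" if data.starts_with(&[0xFF, 0xD8]) => PageImage::Jpeg(data),
        "DCTDecode" => PageImage::Skipped {
            reason: "DCTDecode stream lacks JPEG SOI marker".to_string(),
        },
        other => PageImage::Skipped {
            reason: format!("unsupported /Filter {other}"),
        },
    }
}
```

In this shape, the F6 (FlateDecode) and F7 (CCITTFax) fixtures would both land in the `Skipped` arm, matching the skip path that acceptance row #15 checks.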

### 3.4 Remaining risks (plan §5)

R-1 ~ R-10 (see plan §5):

- R-1: the F1/F2 fixtures may turn out not to be DCTDecode. Branch on the Step 2 probe result and re-synthesize.
- R-2: reliability of synthesizing the F4 mojibake fixture. Fallback chain (reportlab → fpdf2 → hand-built with lopdf). Last resort: downgrade to a row skip.
- R-3: location of the sub-item 2 `normalize_provenance_timestamps` helper. Confirm after PR #186 is merged.
- R-4: `IngestEvent` enum location (`crates/kebab-app/src/ingest_progress.rs:58`), as stated in plan G1.
- R-7: availability of qwen2.5vl:3b on the Ollama host (`http://192.168.0.47:11434`). MockOcrEngine covers the integration smoke automatically; only the dogfood smoke calls the real Ollama.
- R-9: -j 1 workspace test time (15-30 min). K3 / K4 are explicitly sequential (no background runs).

### 3.5 Remaining open questions (plan §6)

All of OQ-E1 ~ OQ-E10 (plan §6) are explicit deliverables of an executor first sub-action:

- OQ-E6: serde discriminant consistency of the IngestEvent enum (Step 7.1 first sub-action).
- OQ-E7: deciding the RELEASE_NOTES.md path (Step 10 J0 pre-flight).
- OQ-E10: pdf_ocr_engine + cancel propagation in the ingest dispatch loop (Step 6.2 first sub-action).

## 4. OMC workflow pattern (user memory)

Key memory feedback the new session must follow:

### 4.1 feedback-teammate-spawn-mode: omc-teams tmux pane

When spawning an agent / teammate, use `/oh-my-claudecode:omc-teams` to spawn it in a separate tmux pane instead of the in-process Agent tool. This lets the user monitor worker progress in real time.

### 4.2 feedback-omc-teams-usage: exact usage for 4.14.1

Known limitation in this environment: `one_team_per_leader_session = true` (default), so only one active team per leader session. No config override path has been found, which means a sequential single-team workflow.

spawn command:

```bash
omc team 1:claude[:role] --no-decompose "task description (first 30 char = team slug)"
```

brief pattern (self-contained file):

- Write the brief file: `.omc/reviews/<date>-<task-id>-brief.md`
- Keep the task description in the spawn command short: `"Task X. Read <brief-file-absolute-path> and execute as specified. Output result to <result-file-absolute-path>"`
- worker → result file → the main session reads it.

shutdown: `omc team shutdown <slug> --force` (the non-force variant often fails with a cli.cjs error).

### 4.3 feedback-worker-completion-polling: automatic completion detection

Start a background polling shell immediately after spawning a worker. With run_in_background=true, the task notification alerts the main session automatically:

```bash
team_slug="<team-slug>"  # replace with the actual team slug
while true; do
  # Extract the phase=... field from the first status line that names the team.
  phase=$(omc team status "$team_slug" 2>/dev/null | grep "team=" | head -1 | grep -oE "phase=[a-z]+" | cut -d= -f2)
  if [[ "$phase" == "completed" || "$phase" == "failed" ]]; then
    echo "TEAM_DONE: phase=$phase team=$team_slug"
    break
  fi
  sleep 20
done
```

### 4.4 feedback-teammate-model-routing: opus/sonnet policy (not applied)

Stated policy: executor + initial draft + round 1 review = opus; closure verify / micro-patch = sonnet. In practice, the omc team CLI has no model flag, the worker default is opus, and no way to force sonnet has been found. All workers in this sub-item 1 use opus. Retry if a future OMC version supports a model flag.

### 4.5 feedback-user-review-gates: skip policy

Skip the user confirm gates of brainstorming + writing-plans. Do only a self-review of the spec / plan and move straight to the next stage. Use AskUserQuestion only for key trade-off decisions.

### 4.6 feedback-serial-build-only: serial cargo

Never run cargo build / test / clippy concurrently in the background; run them serially. Use -j 4 by default, and -j 1 for workspace test/clippy to avoid memory contention at link time.

### 4.7 feedback-pr-workflow: gitea-pr + review loop

The default workflow for every task. The PR open path for Step 11 K6 = `gitea-pr --title ... --head feat/pdf-scanned-ocr --base main --body "$(cat <<EOF ... EOF)"`.

### 4.8 feedback-readme-sync-rule

When a user-visible surface changes (config schema, CLI flag, wire schema), the implementation PR updates README + HANDOFF + docs/ARCHITECTURE together in the same PR. Step 10 J is this deliverable.

### 4.9 feedback-no-caveman

No caveman-style speech. All responses are written in natural Korean prose.

## 5. First steps of Phase C (entry point for the new session)

```bash
# 1. Check the branch state
git status
git log --oneline -5
git branch -vv | grep feat/pdf-scanned-ocr
# 2. Check the uncommitted files (all new)
ls docs/superpowers/{poc,specs,plans,handoffs}/2026-05-27-*
# 3. Read the plan + spec (the ground truth the executor follows)
head -100 docs/superpowers/plans/2026-05-27-pdf-scanned-ocr-plan.md
head -100 docs/superpowers/specs/2026-05-27-pdf-scanned-ocr-spec.md
# 4. Check the fixtures / scripts in the PoC scratch (source for Phase C Step 2)
ls /build/cache/pdf-ocr-poc/{fixtures,ground-truth,scripts,images,ocr-out}/
# 5. Check memory
cat ~/.claude/projects/-home-altair823-kebab/memory/MEMORY.md
```

Then ask the user:

- "Starting the Phase C executor: shall we proceed from Step 1 (Foundation + dep + baseline)?"
- Or: "Run the executor as a separate omc-teams worker (sequential single-team), or directly in the main session?"

### 5.1 executor spawn pattern (recommended)

Following memory feedback-teammate-spawn-mode, spawn a separate worker through omc-teams. The core of the brief:

```markdown
# executor brief — v0.20.0 sub-item 1 Step N
read: plan §3 Step N + spec §M (ground truth for the change decisions) + current code baseline.
follow: TDD (RED test → GREEN impl) + plan's 4-component acceptance + commit msg draft.
single commit per step (plan §7 11-commit table).
report: `.omc/reviews/2026-05-27-pdf-ocr-executor-step-N-report.md`.
```

Spawn one worker per step. Because of the single-team constraint, only one step runs at a time: after a step completes, shut the worker down and spawn the next one.

### 5.2 main session direct execution (alternative)

Alternative: the main session runs the 11 steps directly, without the latency / overhead of omc-teams. This is more efficient for small steps. Only large steps (for example, the 9 tests of Step 4) are delegated to a worker.

Decide according to the user's preference.

## 6. Release plan

After this sub-item 1 is complete:

- workspace.version 0.19.0 → 0.20.0 (Step 11 K1).
- v0.20.0 release notes (Step 10 J4): follow the friendly-explanation rule of the CLAUDE.md §Release procedure (NOT a bare list of commit subjects; written so that users can understand it).
- `gitea-release v0.20.0` (after Step 11 K6, or separately).
- sub-item 2 (TODO #2 multi-region image) / sub-item 3 (TODO #3 PDF normalize) / sub-item 4 (TODO #4 PDF figure/table) are separate sub-items: follow-up sub-items of v0.20, or v0.21.

## 7. Verification invariants for the new session

Checks after the executor finishes and before the PR is opened (the 15 acceptance rows of plan §9):

- Zero workspace test regressions (baseline + 27~28 new tests).
- wire schema additive only (verified with jq + diff).
- No design contract changes, or the frozen task spec is updated at the same time.
- workspace.version minor bump (0.19 → 0.20).
- dogfood smoke: 6 steps green (including the v0.19 → v0.20 force-reingest scenario).
- `cargo clippy --workspace --all-targets -j 1 -- -D warnings` clean.
- `cargo build --release -p kebab-cli -j 4` clean.
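
## 8. Purpose of this handoff

Reading this file alone is enough for a new Claude session to start the Phase C executor. It is self-contained and preserves all the context of this main session.

```bash
cat /home/altair823/kebab/docs/superpowers/handoffs/2026-05-27-v0.20-sub1-executor-handoff.md
```

After that, read the plan + spec + PoC in that order → confirm with the user that the first step should proceed → start from Step 1.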

## 9. Key file / path reference

### Work products of this sub-item (uncommitted)

- `docs/superpowers/poc/2026-05-27-pdf-ocr-engine-comparison.md`: PoC baseline (evidence for adopting qwen2.5vl)
- `docs/superpowers/specs/2026-05-27-pdf-scanned-ocr-spec.md`: frozen spec (ACCEPT)
- `docs/superpowers/plans/2026-05-27-pdf-scanned-ocr-plan.md`: executor follow-able plan (ACCEPT)
- `docs/superpowers/handoffs/2026-05-27-v0.20-sub1-executor-handoff.md`: this handoff

### Code to be changed in Phase C (see the plan's file paths)

- `crates/kebab-parse-pdf/src/lib.rs`: PdfTextExtractor body unchanged (invariant), new module exports (`page_image`, `text_quality`).
- `crates/kebab-parse-pdf/src/page_image.rs` (new): `extract_dctdecode_page_image`.
- `crates/kebab-parse-pdf/src/text_quality.rs` (new): `compute_valid_char_ratio`.
- `crates/kebab-parse-pdf/Cargo.toml`: dep additions (`kebab-parse-image` is NOT added, to avoid a parser cross-dependency; the caller carries the OcrEngine trait).
- `crates/kebab-app/src/pdf_ocr.rs` (new): `apply_ocr_to_pdf_pages` (post-extract enrichment helper).
- `crates/kebab-app/src/lib.rs`: ingest_one_pdf_asset wiring at lines 1696-1850 (post-extract call right after `app.extract_for(...)` at line 1778); mirror the image OCR build pattern at lines 338-347.
- `crates/kebab-app/src/ingest_progress.rs` line 58: new variants of the `IngestEvent` enum.
- `crates/kebab-config/src/lib.rs`: `pdf.ocr.*` config (mirrors the image.ocr pattern).
- `crates/kebab-cli/src/main.rs`: new event kind mapping in the ingest stdout printer.

### New Phase C test fixtures (plan Step 2 B2)

- `crates/kebab-parse-pdf/tests/fixtures/scanned_page1.pdf` (F1, DCTDecode JPEG-wrapped)
- `crates/kebab-parse-pdf/tests/fixtures/scanned_page2.pdf` (F2, batchim-intensive DCTDecode)
- `crates/kebab-parse-pdf/tests/fixtures/mojibake.pdf` (F4, custom font, no ToUnicode CMap)
- `crates/kebab-parse-pdf/tests/fixtures/flate_raw.pdf` (F6, FlateDecode skip path)
- `crates/kebab-parse-pdf/tests/fixtures/ccittfax.pdf` (F7, CCITTFax skip path)
- `tests/fixtures/_synth/mojibake.py` (F4 synthesis script, reproducible)
- `tests/fixtures/_synth/flate_ccittfax.sh` (F6/F7 synthesis script)

### PoC scratch (not to be committed, reference only)

- `/build/cache/pdf-ocr-poc/fixtures/` (page1-clean.png, page2-clean.png, etc.)
- `/build/cache/pdf-ocr-poc/ground-truth/` (page1.txt, page2-batchim.txt)
- `/build/cache/pdf-ocr-poc/scripts/` (make_image.py, scanned_sim.py, preprocess.py, compare.py, vision_ocr.py, etc.)
- `/build/cache/pdf-ocr-poc/ocr-out/` (per-engine results)
- `/build/cache/pdf-ocr-poc/RESULTS.md` (source of the PoC doc; same content as `docs/superpowers/poc/2026-05-27-pdf-ocr-engine-comparison.md`)
- `/build/cache/pdf-ocr-poc/tessdata-best/` (tessdata_best models; unrelated to Phase C, PoC comparison baseline only)

### Phase B review outputs (.omc/reviews/, gitignored)

The executor can read these for reference:

- `.omc/reviews/2026-05-27-pdf-ocr-spec-critic-r1-result.md` (25 findings of the spec round 1 critic)
- `.omc/reviews/2026-05-27-pdf-ocr-spec-rewrite-report.md` (spec 1c rewrite traceability)
- `.omc/reviews/2026-05-27-pdf-ocr-spec-critic-r2-result.md` (spec ACCEPT)
- `.omc/reviews/2026-05-27-pdf-ocr-plan-critic-r1-result.md` (10 findings of the plan round 1 critic)
- `.omc/reviews/2026-05-27-pdf-ocr-plan-verifier-r1-result.md` (20 findings of the plan round 1 verifier)
- `.omc/reviews/2026-05-27-pdf-ocr-plan-rewrite-report.md` (plan 1c rewrite traceability)
- `.omc/reviews/2026-05-27-pdf-ocr-plan-round-2-closure-result.md` (plan ACCEPT)

Each brief file is also under `.omc/reviews/` (the self-contained source behind each worker's task description).

## 10. Environment summary (user machine)

- working directory: `/home/altair823/kebab`
- main HEAD = `bcd1e37`, branch `feat/pdf-scanned-ocr`
- `CARGO_TARGET_DIR=/build/out/cargo-target/target` (user default, root disk protection)
- build output: `${CARGO_TARGET_DIR:-target}/release/kebab` (the `$RELEASE_BIN` alias in the K2 acceptance)
- Ollama remote: `http://192.168.0.47:11434` (qwen2.5vl:3b + gemma4:e4b + gemma4:26b + bge-m3 + nomic-embed-text active). localhost is not listening.
- tessdata_best download: `/build/cache/pdf-ocr-poc/tessdata-best/` (PoC reference only)
- Every sudo command in the user's environment can only be run manually by the user (`!` prefix).
- tmux session: classic `$TMUX` set, in-place pane split possible. The next session will be inside the same tmux session (confirmed by the user).