kebab/docs/superpowers/handoffs/2026-05-27-v0.20-sub1-executor-handoff.md
altair823 b4d9e60816 chore(release): bump version 0.19.0 → 0.20.0 — v0.20.0 sub-item 1 scanned PDF OCR
# v0.20.0 — scanned PDF OCR via Ollama vision LLM

The core change in v0.20.0 is OCR ingest for scanned PDFs that carry no
embedded text (book scans, receipts, camera-captured pages). In the PoC
comparison of five engines (Tesseract / EasyOCR / PaddleOCR / gemma4:e4b /
qwen2.5vl:3b), qwen2.5vl:3b scored alnum 94.79% (page1) / 81.56% (batchim,
i.e. final-consonant fixture) and beat every other engine, so it is this
release's default vision OCR.

## 1. Enabling OCR (opt-in)

Enable it with `enabled = true` in the `[pdf.ocr]` config section or with the
`KEBAB_PDF_OCR_ENABLED=true` env variable. The default is off: OCR costs
45-100s per page (qwen2.5vl:3b on CPU, remote Ollama), which is a poor fit for
any KB that is not a book archive and does not need OCR.

```toml
[pdf.ocr]
enabled = true
model = "qwen2.5vl:3b"
# see the README for the other defaults
```
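
The precedence between the config file and the env override can be sketched as follows. This is a minimal illustration, not kebab's actual config loader: the function names and the accepted spellings (`true`/`false`/`1`/`0`) are assumptions, and only `KEBAB_PDF_OCR_ENABLED` comes from the text above.

```rust
/// Hypothetical helper: resolves the effective `pdf.ocr.enabled` value.
/// A recognized env value wins over the config file value.
fn resolve_ocr_enabled(config_enabled: bool, env_value: Option<&str>) -> bool {
    match env_value.map(|v| v.trim().to_ascii_lowercase()) {
        Some(v) if v == "true" || v == "1" => true,
        Some(v) if v == "false" || v == "0" => false,
        // Unset or unrecognized (assumed behavior): fall back to the config file.
        _ => config_enabled,
    }
}

/// Reads the real process environment and applies the rule above.
fn ocr_enabled_from_process_env(config_enabled: bool) -> bool {
    let raw = std::env::var("KEBAB_PDF_OCR_ENABLED").ok();
    resolve_ocr_enabled(config_enabled, raw.as_deref())
}
```

With this rule, a KB whose config leaves OCR off can still be switched on for a single run by exporting the env variable.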

Pull qwen2.5vl:3b through Ollama:

```bash
ollama pull qwen2.5vl:3b   # 3GB Ollama image
```

## 2. Force-reingesting scanned PDFs indexed under v0.19

A KB whose scanned PDFs were ingested with the v0.19 binary does not enter the
OCR path automatically, because parser_version "pdf-text-v1" is preserved (a
decision to avoid triggering the CLAUDE.md §Versioning cascade, H-4). So if
you only upgrade to the v0.20 binary and set `pdf.ocr.enabled = true`, the
Unchanged path of try_skip_unchanged skips OCR. Reprocess explicitly:

```bash
kebab ingest --root /path/to/kb --force
```
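
Why `--force` is needed can be sketched with a simplified skip decision. The types and the `decide` function below are hypothetical, and the assumption that the check compares a content hash plus parser_version is ours; the real try_skip_unchanged may look at more fields.

```rust
/// Hypothetical, simplified outcome of the skip check.
#[derive(Debug, PartialEq)]
enum IngestDecision {
    Unchanged,
    Reprocess,
}

/// What the index remembers about a previously ingested asset (assumed fields).
struct IndexedAsset<'a> {
    content_hash: &'a str,
    parser_version: &'a str,
}

fn decide(
    indexed: Option<&IndexedAsset<'_>>,
    content_hash: &str,
    parser_version: &str,
    force: bool,
) -> IngestDecision {
    if force {
        return IngestDecision::Reprocess;
    }
    match indexed {
        // Same bytes and same parser_version: nothing looks different, so the
        // asset is skipped and OCR never runs, even if OCR was just enabled.
        Some(prev)
            if prev.content_hash == content_hash
                && prev.parser_version == parser_version =>
        {
            IngestDecision::Unchanged
        }
        _ => IngestDecision::Reprocess,
    }
}
```

Because v0.20 keeps parser_version at "pdf-text-v1", a v0.19-indexed PDF lands in the `Unchanged` arm unless `force` is set.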

## 3. DCTDecode-only v1 scope (handling FlateDecode / CCITTFax pages)

PDF page image extraction in v0.20.0 covers only lopdf image XObjects whose
/Filter == DCTDecode (JPEG passthrough). Other encodings (FlateDecode raw
pixels, CCITTFaxDecode bilevel, JPXDecode JPEG2000) emit a warning event and
the page is skipped.
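
The filter rule above amounts to a single branch on the /Filter name. The sketch below shows only that branch; it does not call lopdf, and the enum and function names are hypothetical.

```rust
/// Hypothetical result of inspecting one page image's /Filter name.
#[derive(Debug, PartialEq)]
enum PageImageAction {
    /// The stream bytes are already a JPEG and can be forwarded unchanged.
    JpegPassthrough,
    /// Unsupported in v1: carry a warning message and skip the page.
    SkipWithWarning(String),
}

fn classify_filter(filter_name: &str) -> PageImageAction {
    match filter_name {
        "DCTDecode" => PageImageAction::JpegPassthrough,
        other => PageImageAction::SkipWithWarning(format!(
            "unsupported image filter {other}; page skipped"
        )),
    }
}
```

Keeping the decision this narrow is what lets v1 avoid any image decoding dependency: only bytes that are already JPEG are ever handed on.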

If some pages of a scanned PDF are encoded with FlateDecode or CCITTFax:

```bash
qpdf --object-streams=disable --recompress-flate input.pdf normalized.pdf
```

The v1 intent is the single-binary principle (no image crate is introduced).
Multi-filter support will be considered in v1.1+ or in a separate sub-item.

## 4. Family asymmetry (image OCR gemma4:e4b vs PDF OCR qwen2.5vl:3b)

The default for image OCR (P6) stays gemma4:e4b (no change). Only PDF OCR
(v0.20) uses qwen2.5vl:3b. Users can unify the two by setting [image.ocr]
model = "qwen2.5vl:3b", but the defaults keep the family asymmetry.

## Dogfood + test results

- workspace test: 178 result lines, 0 failure.
- workspace clippy (-D warnings): exit 0.
- alnum e2e (real Ollama, manual invoke):
  - F1 (Korean page1): 94.79% (≥ 0.85 threshold).
  - F2 (batchim-intensive): 81.56% (≥ 0.70 threshold).
- integration smoke + vector PDF regression: pass.

## Changed surface

- new config: [pdf.ocr] (11 field) + 11 env override KEBAB_PDF_OCR_*.
- new wire: IngestEvent::PdfOcrStarted/Finished (additive minor).
- new wire: IngestItem.pdf_ocr_pages/ms_total (additive minor).
- new CLI line: "📷 OCR page N..." / "✓ OCR page N ({chars} chars, {ms}ms via ollama-vision)".
- new module: kebab-parse-pdf::{page_image, text_quality} + kebab-app::pdf_ocr_apply.
- dep: lopdf = "0.32" unified at the workspace level.
- fixture: 5 PDF (F1/F2/F4/F6/F7) under crates/kebab-parse-pdf/tests/fixtures/.
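
To make the two new CLI lines concrete, here is a sketch that renders them from a started/finished event pair. The enum and its field names are hypothetical (the real variants are IngestEvent::PdfOcrStarted/Finished), and reading the second template's placeholders as a character count and a duration in milliseconds is an assumption.

```rust
use std::fmt;

/// Hypothetical stand-in for the two PDF OCR ingest events.
enum PdfOcrEvent {
    Started { page: u32 },
    Finished { page: u32, chars: usize, ms: u64 },
}

impl fmt::Display for PdfOcrEvent {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            PdfOcrEvent::Started { page } => write!(f, "📷 OCR page {page}..."),
            PdfOcrEvent::Finished { page, chars, ms } => write!(
                f,
                "✓ OCR page {page} ({chars} chars, {ms}ms via ollama-vision)"
            ),
        }
    }
}
```

A printer would call `to_string()` (or `println!("{event}")`) once per event as the ingest progresses.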

## Unchanged surface (invariants)

- Extractor::extract trait body byte-identical (PR #187).
- PdfTextExtractor body unchanged; OCR is split out as a post-extract enrichment pattern.
- parser_version "pdf-text-v1" preserved.
- chunker_version "pdf-page-v1" preserved.
- No change to the production dep graph in workspace.dependencies (-e normal baseline preserved).

## 11-commit history of the sub-item

9d7faab Step 1: foundation + cargo tree baselines
aeeff36 Step 2: lopdf /Filter probe + 5 fixture commit (F1/F2/F4/F6/F7)
fb3952d Step 2 fix: F7 conversion engine record correction
c2cd3a7 Step 3: page_image + text_quality modules (10 test)
8d81bc1 Step 3 fix: clippy pedantic in page_image
9f003ef Step 4: pdf_ocr_apply helper (10 test, F7 split + cancel)
fd918a6 Step 5: [pdf.ocr] config section + PdfOcrOpts doc
4672cba Step 5 fix: clippy::bool_assert_comparison in pdf_ocr tests
b9ee09f Step 6: wire PDF OCR enrichment + cancel propagation
4c5ccd5 Step 7: wire schema additive — IngestEvent + IngestItem + skipped
c9e0594 Step 8: CLI printer activation + ingest_progress test + spec literal
4819768 Step 9: integration smoke + vector regression + alnum e2e
1d4e301 Step 9 follow-up: Cargo.lock for dev-dep additions
90726ab Step 10: docs sync (README + HANDOFF + ARCHITECTURE + SMOKE)

## § Acceptance §9 verifier evidence

All 15 rows of the K5 scriptable verifier are green (or, for the manual real-Ollama rows, the result is reported):
- Row #4 (vector PDF byte-identical): pass.
- Row #5 (Extractor::extract trait byte-identical): 0 line diff.
- Row #6 (wire schema additive): jq + diff exit 0.
- Row #7-#8 (clippy / workspace test): exit 0.
- Row #9-#10 (dep graph baseline -e normal): empty diff.
- Row #11 (docs sync): grep evidence.
- Row #12 (version bump): "0.20.0" + Cargo.lock cascade ≥ 22.
- Row #14 (PR #187 invariant): extract_for(&asset.media_type) ≥ 1.
- Row #15 (DCTDecode-only v1, F6/F7 skip): test green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 11:03:44 +00:00


title: v0.20.0 sub-item 1 - PDF scanned OCR Phase C (executor) handoff
created: 2026-05-27
status: ready-for-executor
target_version: 0.20.0
related_specs: docs/superpowers/specs/2026-05-27-pdf-scanned-ocr-spec.md (ACCEPT, 1719 lines)
related_plans: docs/superpowers/plans/2026-05-27-pdf-scanned-ocr-plan.md (ACCEPT, ~1100 lines)
related_poc: docs/superpowers/poc/2026-05-27-pdf-ocr-engine-comparison.md
prior_handoff: docs/superpowers/handoffs/2026-05-26-v0.20-image-pdf-normalize-handoff.md (starting point for all of v0.20 sub-items 1/2/3)

v0.20.0 sub-item 1 executor handoff

This document is a self-contained context that hands the start of Phase C (executor) of v0.20.0 sub-item 1 (PDF scanned OCR) over to a new Claude session. It reflects the state at which Phase A (spec) and Phase B (plan) were ACCEPTed.


1. Context summary

1.1 Phase progress of v0.20 sub-item 1

Work completed in this session:

| Phase | Step | Model | Latency | Verdict |
|---|---|---|---|---|
| PoC | Korean OCR engine comparison (Tesseract / EasyOCR / PaddleOCR / gemma4 / qwen2.5vl) | direct main | ~1.5 hour | qwen2.5vl:3b adopted |
| Phase A | spec drafter round 0 (opus) | opus | ~10 min | 1127 lines draft |
| Phase A | spec critic round 1 thorough | opus | ~8 min | NEEDS_DISCUSSION HIGH 5 |
| Phase A | spec drafter round 1c rewrite | opus | ~14 min | resolved |
| Phase A | spec critic round 2 closure | opus | ~5 min | ACCEPT |
| Phase B | plan drafter round 0 | opus | ~17 min | 890 lines draft |
| Phase B | plan critic round 1 thorough | opus | ~8 min | NEEDS_DISCUSSION MEDIUM 4 |
| Phase B | plan verifier round 1 | opus | ~8 min | NEEDS_DISCUSSION HIGH 5 + MEDIUM 10 |
| Phase B | plan drafter round 1c rewrite | opus | ~15 min | resolved 19 findings |
| Phase B | plan round 2 closure (critic + verifier combined) | opus | ~8 min | ACCEPT |
| Phase C | executor | opus | pending | ← starts in the new session |

1.2 The 5 user-resolved decisions for this sub-item

Decisions the user fixed during the PoC stage (not open to change):

  1. PDF page rendering: pdfium-render is shelved; proceed with lopdf image XObject streams (single-binary principle). During the PoC architecture re-review the user stated "every feature in one rust binary", so native deps such as Tesseract / pdfium are avoided.
  2. OCR engine: qwen2.5vl:3b via the Ollama HTTP API. PoC results were page1 alnum 94.79% / batchim alnum 81.56%, beating Tesseract (87% / 67%), EasyOCR (90% / 74%), and gemma4:e4b (77% / 27%).
  3. Default behavior: text-detect first + vision LLM fallback. (The user first chose always-on, then reversed that after the latency measurement.) It can be overridden with pdf.ocr.always_on = true in the config.
  4. Korean OCR mitigation: the vision LLM alone is sufficient (this avoids Tesseract's batchim weakness and preprocessing tuning).
  5. Explicit testing + tuning: synthetic fixtures (F1/F2 general + F3 vector + F4 mojibake + F6 FlateDecode + F7 CCITTFax) + a real-world newspaper PoC (Metro Korea, archive.org) + alnum metric measurement (strsim Levenshtein + #[ignore] ocr_e2e test).
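
One plausible reading of the alnum metric in decision 5 is a Levenshtein similarity computed over alphanumeric characters only. The sketch below is an assumption about the metric's definition, not the project's test code, and it hand-rolls the edit distance instead of calling the strsim crate so that it stays self-contained.

```rust
/// Keeps only alphanumeric characters (Hangul syllables count as alphabetic).
fn alnum_chars(s: &str) -> Vec<char> {
    s.chars().filter(|c| c.is_alphanumeric()).collect()
}

/// Classic two-row Levenshtein edit distance over char slices.
fn levenshtein(a: &[char], b: &[char]) -> usize {
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    for (i, ca) in a.iter().enumerate() {
        let mut cur = vec![i + 1];
        for (j, cb) in b.iter().enumerate() {
            let cost = if ca == cb { 0 } else { 1 };
            let best = (prev[j] + cost).min(prev[j + 1] + 1).min(cur[j] + 1);
            cur.push(best);
        }
        prev = cur;
    }
    prev[b.len()]
}

/// Assumed definition: 1 - distance / longer length, after alnum filtering.
fn alnum_accuracy(expected: &str, actual: &str) -> f64 {
    let e = alnum_chars(expected);
    let a = alnum_chars(actual);
    let longest = e.len().max(a.len());
    if longest == 0 {
        return 1.0;
    }
    1.0 - levenshtein(&e, &a) as f64 / longest as f64
}
```

Under this definition, whitespace and punctuation differences do not affect the score, which suits OCR output where layout is rarely reproduced exactly.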

1.3 Key PoC findings (architecture decision drivers)

See docs/superpowers/poc/2026-05-27-pdf-ocr-engine-comparison.md. In short:

  • Tesseract: violates the single-binary principle (external deps libtesseract + libleptonica + tessdata). Batchim alnum is 67%, a clear weakness.
  • EasyOCR: a PyTorch sidecar (~850MB), even less viable.
  • PaddleOCR: PaddlePaddle 3.0+ PIR/oneDNN runtime bug. No environment workaround was possible.
  • gemma4:e4b vision: paraphrases / hallucinates. It scored 27% on the batchim fixture, 40 percentage points below Tesseract.
  • qwen2.5vl:3b: biased toward transcription in Korean vision OCR. In the real-world newspaper test the body text was also nearly perfect. Batchim alnum 81.56%, page1 94.79%.

Architecture alignment:

  • kebab core principle (CLAUDE.md): "single binary kebab (LLM excluded)".
  • user memory project_llm_default = "unify the family with OCR/caption". Since Ollama is the single inference source, this aligns naturally.
  • Drawback: latency (qwen2.5vl 45-105s/page). Mitigation: text-detect first + vision fallback, which lowers the average cost for an 800-page book.
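
The text-detect-first mitigation can be sketched as a per-page gate: run the vision model only when the embedded text looks unusable. The ratio definition and the `min_ratio` threshold below are illustrative assumptions; the project's compute_valid_char_ratio may be defined differently.

```rust
/// Hypothetical quality score: the fraction of non-whitespace characters that
/// are alphanumeric or ASCII punctuation. Empty text scores 0.0.
fn valid_char_ratio(text: &str) -> f64 {
    let mut total = 0usize;
    let mut valid = 0usize;
    for c in text.chars().filter(|c| !c.is_whitespace()) {
        total += 1;
        if c.is_alphanumeric() || c.is_ascii_punctuation() {
            valid += 1;
        }
    }
    if total == 0 {
        0.0
    } else {
        valid as f64 / total as f64
    }
}

/// OCR runs when always-on is set, or when the embedded text scores below the
/// threshold (missing text, or mojibake such as U+FFFD runs).
fn needs_vision_ocr(embedded_text: &str, always_on: bool, min_ratio: f64) -> bool {
    always_on || valid_char_ratio(embedded_text) < min_ratio
}
```

The gate matters because of the arithmetic: at 45-105s per page, sending all 800 pages of a book through the vision model would take 36,000-84,000s, roughly 10 to 23 hours, whereas pages with usable embedded text cost nothing extra.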

2. Current branch + uncommitted state

2.1 branch

feat/pdf-scanned-ocr (based on main HEAD = bcd1e37)

main HEAD bcd1e37 = chore(repo): .omc/ ignore + AGENTS·GEMINI symlinks + release notes 작성 가이드 강화 (release-notes writing guide tightened). This is the first commit of this session.

2.2 Uncommitted files (all new)

docs/superpowers/poc/2026-05-27-pdf-ocr-engine-comparison.md   (PoC baseline)
docs/superpowers/specs/2026-05-27-pdf-scanned-ocr-spec.md       (Phase A final, 1719 lines, ACCEPT)
docs/superpowers/plans/2026-05-27-pdf-scanned-ocr-plan.md       (Phase B final, ~1100 lines, ACCEPT)
docs/superpowers/handoffs/2026-05-27-v0.20-sub1-executor-handoff.md  (this document)

The brief + result files under .omc/reviews/ (pdf-ocr-spec-critic-r1-result.md, pdf-ocr-plan-round-2-closure-result.md, etc.) are ignored automatically by the .omc/ pattern in .gitignore.

PoC scratch sandbox = /build/cache/pdf-ocr-poc/ (protects the root disk, per the user's machine policy). Nothing there is committed.

2.3 Commit strategy (plan §7 + user memory)

  • The 11 logical commits (per-step) pattern of plan §7.
  • user memory feedback_pr_workflow (gitea-pr + review loop): a single PR after all steps are complete, the same pattern this session used for sub-items 1/2/3.
  • K6 = the last commit, covering Step 11's version bump (workspace.version 0.19.0 → 0.20.0) + final verify + PR open.

3. Starting instructions for Phase C (executor)

3.1 The 3 key reference files

Read order for the new session:

  1. This handoff (docs/superpowers/handoffs/2026-05-27-v0.20-sub1-executor-handoff.md): the full context.
  2. plan (docs/superpowers/plans/2026-05-27-pdf-scanned-ocr-plan.md): 11 steps / 34 sub-actions / 4-component detail (file path + RED test + GREEN impl + acceptance command).
  3. spec (docs/superpowers/specs/2026-05-27-pdf-scanned-ocr-spec.md): the ground truth for the change decisions (frozen contract).

Additional references:

  • PoC (docs/superpowers/poc/2026-05-27-pdf-ocr-engine-comparison.md): quality / latency baseline + architecture rationale.
  • prior handoff (docs/superpowers/handoffs/2026-05-26-v0.20-image-pdf-normalize-handoff.md): overall v0.20 context + the merged results of sub-items 2/3.
  • CLAUDE.md + docs/ARCHITECTURE.md: workspace rules.

3.2 The executor's 11-step progression pattern

plan §2 defines the step group structure (A-K, 34 sub-actions). Each step's 4-component detail (file path + RED test + GREEN impl + acceptance), plus a commit msg draft, is in plan §3.

Recommended order (plan §1 ordering invariant):

  1. Step 1 (Group A), Foundation + dep + baseline: spec L-1 cosmetic fix + Cargo.toml dep + capture of .omc/state/pdf-ocr-*-deps.baseline.txt.
  2. Step 2 (Group B), lopdf probe + fixture commit: commit all of F1/F2/F4/F6/F7 (Phase B verifier H-4 resolution: every fixture commit lands inside Step 2). The Step 3-4 tests can then go GREEN from the moment they are committed.
  3. Step 3 (Group C), page_image + text_quality: extract_dctdecode_page_image + compute_valid_char_ratio. RED→GREEN cycle.
  4. Step 4 (Group D), pdf_ocr_apply helper: kebab-app::pdf_ocr_apply, apply_ocr_to_pdf_pages(&mut canonical, &dyn OcrEngine, &bytes, &opts). 9 integration tests.
  5. Step 5 (Group F), Config schema: pdf.ocr.* (mirrors the image.ocr pattern).
  6. Step 6 (Group E), Ingest wiring: eager init + signature update + post-extract enrichment call + E4 cancel handle propagation (resolution of a round 1 critic finding).
  7. Step 7 (Group G), Wire schema additive: IngestEvent::PdfOcrStarted / PdfOcrFinished + IngestItem.pdf_ocr_pages / pdf_ocr_ms_total. Update the JSON Schema.
  8. Step 8 (Group H), In-tree consumer: CLI printer + snapshot regenerate (cargo insta).
  9. Step 9 (Group I), Integration smoke + regression + ocr_e2e: ingest_pdf_ocr_smoke.rs (ingest + search + cancel, 3 steps) + text_extractor_regression.rs (vector PDF byte-identical) + ocr_e2e.rs (#[ignore] alnum accuracy test).
  10. Step 10 (Group J), Docs sync: README + HANDOFF + ARCHITECTURE + SMOKE + RELEASE_NOTES (the J0 pre-flight decides the path).
  11. Step 11 (Group K), Version bump + final verify + PR: K1 (Cargo.toml 0.19→0.20) + K2 (release build, $RELEASE_BIN alias) + K3 (workspace test, -j 1) + K4 (clippy) + K5 (§ Acceptance §9 15-row mapping) + K6 (commit + PR via gitea-pr).
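
The shape of the Step 4 helper and the Step 6 cancel propagation can be sketched with a trait object and a mock engine. Everything below is hypothetical: the real apply_ocr_to_pdf_pages takes `(&mut canonical, &dyn OcrEngine, &bytes, &opts)`, while this sketch uses a simplified `Page` type, an assumed `recognize` method, and an `AtomicBool` as the cancel handle.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Assumed minimal OCR interface; the real OcrEngine trait may differ.
trait OcrEngine {
    fn recognize(&self, jpeg: &[u8]) -> Result<String, String>;
}

/// Test double that returns a fixed transcription without any network call.
struct MockOcrEngine {
    reply: String,
}

impl OcrEngine for MockOcrEngine {
    fn recognize(&self, _jpeg: &[u8]) -> Result<String, String> {
        Ok(self.reply.clone())
    }
}

/// Simplified page: extracted text plus an optional DCTDecode (JPEG) image.
struct Page {
    text: String,
    jpeg: Option<Vec<u8>>,
}

/// Post-extract enrichment: fills in text for image-only pages and returns
/// how many pages were OCR'd. Stops early once `cancel` is set.
fn apply_ocr_to_pages(
    pages: &mut [Page],
    engine: &dyn OcrEngine,
    cancel: &AtomicBool,
) -> Result<usize, String> {
    let mut done = 0;
    for page in pages.iter_mut() {
        if cancel.load(Ordering::Relaxed) {
            break;
        }
        // Pages without a usable image, or with text already, are left alone.
        let Some(jpeg) = page.jpeg.as_deref() else { continue };
        if !page.text.trim().is_empty() {
            continue;
        }
        page.text = engine.recognize(jpeg)?;
        done += 1;
    }
    Ok(done)
}
```

Taking `&dyn OcrEngine` is what allows the integration smoke to run against a mock while only the dogfood smoke talks to a real Ollama host.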

3.3 Key executor invariants (plan §1 + §9)

  • Extractor trait byte-identical: no change to crates/kebab-core/src/traits.rs (acceptance row #5).
  • PR #187 polymorphic dispatch preserved: keep app.extract_for(&asset.media_type, &ctx, &bytes) (line 1778). Step 6 E3 calls the post-extract enrichment right after it (acceptance row #14).
  • Single-binary principle: no image crate / pdfium-render / libtesseract. Only lopdf + base64 + reqwest (acceptance row #9).
  • Vector PDF byte-identical: the F3 fixture result is byte-identical before and after the plan's changes are applied (acceptance row #4).
  • No OCR on text PDFs: pdf.ocr.enabled = false by default + parser_version "pdf-text-v1" kept (the try_skip_unchanged path of v0.19 dogfood KBs is unchanged).
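
The lopdf + base64 + reqwest path boils down to: take the JPEG bytes, base64-encode them, and post them to Ollama. The sketch below only builds the JSON body for Ollama's `/api/generate` endpoint (`model`, `prompt`, `images`, `stream` fields). Whether kebab uses this endpoint or `/api/chat` is an assumption, the prompt text is a placeholder, and the encoder is hand-rolled here only to stay dependency-free; the project itself uses the base64 crate.

```rust
const B64: &[u8; 64] = b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/// Standard base64 with '=' padding.
fn base64_encode(input: &[u8]) -> String {
    let mut out = String::with_capacity((input.len() + 2) / 3 * 4);
    for chunk in input.chunks(3) {
        let b0 = chunk[0] as u32;
        let b1 = *chunk.get(1).unwrap_or(&0) as u32;
        let b2 = *chunk.get(2).unwrap_or(&0) as u32;
        let n = (b0 << 16) | (b1 << 8) | b2;
        out.push(B64[(n >> 18) as usize & 63] as char);
        out.push(B64[(n >> 12) as usize & 63] as char);
        out.push(if chunk.len() > 1 { B64[(n >> 6) as usize & 63] as char } else { '=' });
        out.push(if chunk.len() > 2 { B64[n as usize & 63] as char } else { '=' });
    }
    out
}

/// Escapes the characters JSON strings cannot contain verbatim.
fn json_escape(s: &str) -> String {
    let mut out = String::new();
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Builds a non-streaming generate request carrying one page image.
fn ollama_generate_body(model: &str, prompt: &str, jpeg: &[u8]) -> String {
    format!(
        r#"{{"model":"{}","prompt":"{}","images":["{}"],"stream":false}}"#,
        json_escape(model),
        json_escape(prompt),
        base64_encode(jpeg)
    )
}
```

Because the DCTDecode stream is already a JPEG, no decoding or re-encoding step sits between the PDF and this request body.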

3.4 Residual risks (plan §5)

R-1 through R-10 (see plan §5):

  • R-1: the F1/F2 fixtures turn out not to be DCTDecode. Branch on the decision after the Step 2 probe + re-synthesize.
  • R-2: reliability of synthesizing the F4 mojibake fixture. Fallback chain (reportlab → fpdf2 → hand-built with lopdf). Last resort: row-skip downgrade.
  • R-3: location of sub-item 2's normalize_provenance_timestamps helper. Confirm after PR #186 merges.
  • R-4: IngestEvent enum location (crates/kebab-app/src/ingest_progress.rs:58). Plan G1 states it explicitly.
  • R-7: availability of qwen2.5vl:3b on the Ollama host (http://192.168.0.47:11434). MockOcrEngine covers the integration smoke automatically; only the dogfood smoke calls the real Ollama.
  • R-9: duration of the -j 1 workspace test (15-30 min). K3 / K4 are explicitly sequential (no background runs).

3.5 Remaining open questions (plan §6)

OQ-E1 through OQ-E10 (plan §6) are all explicit deliverables of the executor's first sub-action:

  • OQ-E6: serde discriminant consistency of the IngestEvent enum (Step 7.1 first sub-action).
  • OQ-E7: decide the RELEASE_NOTES.md path (Step 10 J0 pre-flight).
  • OQ-E10: pdf_ocr_engine + cancel propagation in the ingest dispatch loop (Step 6.2 first sub-action).

4. OMC workflow pattern (user memory)

Key memory feedback the new session must follow:

4.1 feedback-teammate-spawn-mode — omc-teams tmux pane

When spawning an agent / teammate, spawn it in a separate tmux pane through /oh-my-claudecode:omc-teams instead of the in-process Agent tool, so the user can monitor worker progress in real time.

4.2 feedback-omc-teams-usage: exact usage for 4.14.1

Known limitation of this environment: one_team_per_leader_session = true (default), so each leader session can have only one active team. No config override path was found, hence a sequential single-team workflow.

spawn command:

omc team 1:claude[:role] --no-decompose "task description (first 30 char = team slug)"

brief pattern (self-contained file):

  • brief file write: .omc/reviews/<date>-<task-id>-brief.md
  • keep the spawn command's task description short: "Task X. Read <brief-file-absolute-path> and execute as specified. Output result to <result-file-absolute-path>"
  • worker → result file → read by the main session.

shutdown: omc team shutdown <slug> --force (without --force it often fails with a cli.cjs error).

4.3 feedback-worker-completion-polling: automatic completion detection

Start a background polling shell immediately after spawning a worker. With run_in_background=true, the task notification alerts the main session automatically:

while true; do
  phase=$(omc team status <team-slug> 2>/dev/null | grep "team=" | head -1 | grep -oE "phase=[a-z]+" | cut -d= -f2)
  if [[ "$phase" == "completed" || "$phase" == "failed" ]]; then
    echo "TEAM_DONE: phase=$phase team=<team-slug>"
    break
  fi
  sleep 20
done

4.4 feedback-teammate-model-routing: opus/sonnet policy (not applied)

Stated policy: executor + initial draft + round 1 review = opus; closure verify / micro-patch = sonnet. In practice, the omc team CLI has no model flag, the worker default is opus, and no way to force sonnet was found. Sub-item 1 uses opus for every worker. Retry once a future OMC version supports a model flag.

4.5 feedback-user-review-gates: skip policy

Skip the user confirm gates of brainstorming + writing-plans. Self-review the spec / plan only, then move straight to the next stage. Use AskUserQuestion only for key trade-off decisions.

4.6 feedback-serial-build-only: serial cargo

Never run cargo build / test / clippy concurrently in the background. Run them serially. Use -j 4 by default and -j 1 for workspace test/clippy, to avoid memory contention at link time.

4.7 feedback-pr-workflow: gitea-pr + review loop

The default workflow for every task. The PR open path for Step 11 K6 = gitea-pr --title ... --head feat/pdf-scanned-ocr --base main --body "$(cat <<EOF ... EOF)".

4.8 feedback-readme-sync-rule

When a user-visible surface changes (config schema, CLI flag, wire schema), the implementation PR updates the three documents README + HANDOFF + docs/ARCHITECTURE together in the same PR. Step 10 J is this deliverable.

4.9 feedback-no-caveman

Do not use caveman-style speech. Every response is natural Korean prose.


5. First steps of Phase C (entry point for the new session)

# 1. check the branch state
git status
git log --oneline -5
git branch -vv | grep feat/pdf-scanned-ocr

# 2. check the uncommitted files (all new)
ls docs/superpowers/{poc,specs,plans,handoffs}/2026-05-27-*

# 3. read plan + spec (the ground truth the executor follows)
head -100 docs/superpowers/plans/2026-05-27-pdf-scanned-ocr-plan.md
head -100 docs/superpowers/specs/2026-05-27-pdf-scanned-ocr-spec.md

# 4. check the PoC scratch fixtures / scripts (source for Phase C Step 2)
ls /build/cache/pdf-ocr-poc/{fixtures,ground-truth,scripts,images,ocr-out}/

# 5. check memory
cat ~/.claude/projects/-home-altair823-kebab/memory/MEMORY.md

Then ask the user:

  • "Starting the Phase C executor. Proceed from Step 1 (Foundation + dep + baseline)?"
  • or "Split the executor out into an omc-teams worker (sequential single-team), or run it directly in the main session?"

5.1 executor spawn pattern (recommended)

Following memory feedback-teammate-spawn-mode, spawn a separate worker through omc-teams. The core of the brief:

# executor brief — v0.20.0 sub-item 1 Step N

read: plan §3 Step N + spec §M (ground truth for the change decisions) + current code baseline.
follow: TDD (RED test → GREEN impl) + the plan's 4-component acceptance + commit msg draft.
single commit per step (plan §7 11-commit table).
report: `.omc/reviews/2026-05-27-pdf-ocr-executor-step-N-report.md`.

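Spawn one worker per step: the single-team constraint allows only one step at a time. After a step completes, shut the worker down and spawn the next step.

5.2 main session direct execution (alternative)

Alternative: the main session runs the 11 steps directly, without the latency / overhead of omc-teams. This is more efficient for small steps. Delegate only large steps (e.g. the 9 tests of Step 4) to a worker.

Decide according to the user's preference.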


6. Release plan

After sub-item 1 is complete:

  • workspace.version 0.19.0 → 0.20.0 (Step 11 K1).
  • v0.20.0 release notes (Step 10 J4): follow the friendly-explanation rule of the CLAUDE.md §Release procedure (NOT a plain list of commit subjects; written so that users can understand it).
  • gitea-release v0.20.0 (after Step 11 K6, or separately).
  • sub-item 2 (TODO #2 multi-region image) / sub-item 3 (TODO #3 PDF normalize) / sub-item 4 (TODO #4 PDF figure/table) are separate sub-items, either follow-ups within v0.20 or part of v0.21.

7. Verification invariants for the new session

Verification after the executor finishes and before the PR is opened (the 15 acceptance rows of plan §9):

  • No workspace test regressions (baseline + 27~28 new tests).
  • wire schema additive only (verified with jq + diff).
  • No design contract change, or the frozen task spec is updated at the same time.
  • workspace.version minor bump (0.19 → 0.20).
  • dogfood smoke 6 steps green (including the v0.19 → v0.20 force-reingest scenario).
  • cargo clippy --workspace --all-targets -j 1 -- -D warnings clean.
  • cargo build --release -p kebab-cli -j 4 clean.

8. Intent of this handoff

Reading this file alone is enough for a new Claude session to start the Phase C executor. It is self-contained and preserves all the context of this main session.

cat /home/altair823/kebab/docs/superpowers/handoffs/2026-05-27-v0.20-sub1-executor-handoff.md

After that, read plan + spec + PoC in that order → confirm the first step with the user → proceed from Step 1.


9. Key file / path reference

Work products of this sub-item (uncommitted)

  • docs/superpowers/poc/2026-05-27-pdf-ocr-engine-comparison.md: PoC baseline (evidence for adopting qwen2.5vl)
  • docs/superpowers/specs/2026-05-27-pdf-scanned-ocr-spec.md: frozen spec (ACCEPT)
  • docs/superpowers/plans/2026-05-27-pdf-scanned-ocr-plan.md: executor follow-able plan (ACCEPT)
  • docs/superpowers/handoffs/2026-05-27-v0.20-sub1-executor-handoff.md: this handoff

Code to change in Phase C (see the plan's file paths)

  • crates/kebab-parse-pdf/src/lib.rs: PdfTextExtractor body unchanged (invariant); new module exports (page_image, text_quality).
  • crates/kebab-parse-pdf/src/page_image.rs (new): extract_dctdecode_page_image.
  • crates/kebab-parse-pdf/src/text_quality.rs (new): compute_valid_char_ratio.
  • crates/kebab-parse-pdf/Cargo.toml: dep additions (kebab-parse-image is NOT added, to avoid a parser cross-dependency; the caller carries the OcrEngine trait).
  • crates/kebab-app/src/pdf_ocr.rs (new): apply_ocr_to_pdf_pages (post-extract enrichment helper).
  • crates/kebab-app/src/lib.rs: ingest_one_pdf_asset wiring at lines 1696-1850 (post-extract call right after app.extract_for(...) at line 1778); mirrors the image OCR build pattern at lines 338-347.
  • crates/kebab-app/src/ingest_progress.rs line 58: new variants of the IngestEvent enum.
  • crates/kebab-config/src/lib.rs: pdf.ocr.* config (mirrors the image.ocr pattern).
  • crates/kebab-cli/src/main.rs: new event kind mapping in the ingest stdout printer.

New Phase C test fixtures (plan Step 2 B2)

  • crates/kebab-parse-pdf/tests/fixtures/scanned_page1.pdf (F1, DCTDecode JPEG-wrapped)
  • crates/kebab-parse-pdf/tests/fixtures/scanned_page2.pdf (F2, batchim-intensive DCTDecode)
  • crates/kebab-parse-pdf/tests/fixtures/mojibake.pdf (F4, custom font no ToUnicode CMap)
  • crates/kebab-parse-pdf/tests/fixtures/flate_raw.pdf (F6, FlateDecode skip path)
  • crates/kebab-parse-pdf/tests/fixtures/ccittfax.pdf (F7, CCITTFax skip path)
  • tests/fixtures/_synth/mojibake.py (F4 synthesis script, reproducible)
  • tests/fixtures/_synth/flate_ccittfax.sh (F6/F7 synthesis script)

PoC scratch (nothing to commit, reference only)

  • /build/cache/pdf-ocr-poc/fixtures/ (page1-clean.png, page2-clean.png, etc.)
  • /build/cache/pdf-ocr-poc/ground-truth/ (page1.txt, page2-batchim.txt)
  • /build/cache/pdf-ocr-poc/scripts/ (make_image.py, scanned_sim.py, preprocess.py, compare.py, vision_ocr.py, etc.)
  • /build/cache/pdf-ocr-poc/ocr-out/ (per-engine results)
  • /build/cache/pdf-ocr-poc/RESULTS.md (source of the PoC doc; same content as docs/superpowers/poc/2026-05-27-pdf-ocr-engine-comparison.md)
  • /build/cache/pdf-ocr-poc/tessdata-best/ (tessdata_best models; unrelated to Phase C, PoC comparison baseline only)

Phase B review artifacts (.omc/reviews/, gitignored)

The executor can read these for reference:

  • .omc/reviews/2026-05-27-pdf-ocr-spec-critic-r1-result.md (25 findings from the spec round 1 critic)
  • .omc/reviews/2026-05-27-pdf-ocr-spec-rewrite-report.md (spec 1c rewrite traceability)
  • .omc/reviews/2026-05-27-pdf-ocr-spec-critic-r2-result.md (spec ACCEPT)
  • .omc/reviews/2026-05-27-pdf-ocr-plan-critic-r1-result.md (10 findings from the plan round 1 critic)
  • .omc/reviews/2026-05-27-pdf-ocr-plan-verifier-r1-result.md (20 findings from the plan round 1 verifier)
  • .omc/reviews/2026-05-27-pdf-ocr-plan-rewrite-report.md (plan 1c rewrite traceability)
  • .omc/reviews/2026-05-27-pdf-ocr-plan-round-2-closure-result.md (plan ACCEPT)

Each brief file is also under .omc/reviews/ (the self-contained source for the worker's task description).


10. Environment summary (user machine)

  • working directory: /home/altair823/kebab
  • main HEAD = bcd1e37, branch feat/pdf-scanned-ocr
  • CARGO_TARGET_DIR=/build/out/cargo-target/target (user default, protects the root disk)
  • build output: ${CARGO_TARGET_DIR:-target}/release/kebab (the $RELEASE_BIN alias used by K2 acceptance)
  • Ollama remote: http://192.168.0.47:11434 (qwen2.5vl:3b + gemma4:e4b + gemma4:26b + bge-m3 + nomic-embed-text active). localhost is not listening.
  • tessdata_best download: /build/cache/pdf-ocr-poc/tessdata-best/ (PoC reference only)
  • Every sudo command in the user's environment can only be run manually by the user (! prefix).
  • tmux session: classic $TMUX set, in-place pane split possible. The next session will also be inside the same tmux session (confirmed by the user).