Handoff document for the next session after the v0.19.0 release, plus a post-hoc backfill.
- docs/superpowers/handoffs/2026-05-26-v0.20-image-pdf-normalize-handoff.md (540 lines, 9 sections)
  - sub-item 1/2/3 merge results + dogfooding baseline (1781 docs / 9050 chunks) + user memory + OMC workflow + build environment
  - current implementation state (v0.19.0, image + pdf): exact file:line + struct/fn signatures + flow
  - 8 TODOs in detail (problem + scope + affected files + risk + trigger condition)
  - priorities + recommended sequencing + suggested first steps for the new session
- docs/superpowers/specs/2026-05-26-extractor-dispatch-unification-spec.md (sub-item 3 spec)
- docs/superpowers/plans/2026-05-26-extractor-dispatch-unification-plan.md (sub-item 3 plan)

When PR #187 was merged, only the source code went in and the spec/plan were left out, so the reference links in that PR return 404 on main. This commit backfills them.

Assisted-by: Claude Code
| title | created | status | target_version | related_specs |
|---|---|---|---|---|
| v0.20.0: Image + PDF normalize integration handoff | 2026-05-26 | ready-for-spec | 0.20.0 | |
v0.20.0 Handoff: Image + PDF normalize integration
This document is a self-contained context for handing the v0.20.0 work to a new Claude session. It captures the state as of the PR #185 / #186 / #187 merges and the completed dogfooding run.
1. Context summary (what the new session needs to know)
1.1 The three merged v0.19.0 sub-item PRs
| PR | sub-item | scope | impact |
|---|---|---|---|
| #185 | 1 | Removed the kebab-parse-code dep from kebab-source-fs, dropping the drag of 9 tree-sitter grammars. Moved 4 leaf helpers to kebab-source-fs::code_meta | Dep graph cleanup. No change to the user-visible surface. |
| #186 | 2 | Absorbed kebab-normalize + kebab-parse-types into kebab-parse-md. 24 → 22 crates. Rewrote design §3.7b as 4 paragraphs. Bumped workspace.version 0.18.0 → 0.19.0 | Frozen design contract changed (new "Future work / deferred" section in tasks/INDEX.md). |
| #187 | 3 | Extractor dispatch polymorphism: App.extractors: Vec<Box<dyn Extractor + Send + Sync>> registry + App::extract_for(...) helper. 11 hardcoded callsites + 9 AST arms → 1 callsite + 4 arms | Base for sub-item 4. The registry is the starting point for all dispatch-unification work in v0.20.0. |
main HEAD = c1e82cc (after the PR #187 merge). Branches are cut from main.
1.2 Dogfooding results (v0.19.0)
- Combined KB: 1781 docs / 9050 chunks (existing 1770 + 11 new fixtures). This KB is the v0.20.0 dogfooding baseline.
- Location: /home/altair823/KnowledgeBase/ (user workspace root) + new fixture _dogfood-v0.19.0/
- markdown 1004 + code 7 lang (rust 13 + python 13 + go 10 + java 10 + kotlin 8 + ts 6 + js 5)
- image: 770 files (existing KB), OCR + caption disabled (config default)
- pdf: 0 files
- workspace.version 0.19.0 cascade applied
1.3 User memory (must follow)
Check ~/.claude/projects/-home-altair823-kebab/memory/MEMORY.md:
- PR workflow: gitea-pr + review-loop mode by default (do not ask about single-shot mode)
- Phase priorities: P8 audio on hold, P9 UI first (books + PDF focus)
- LLM default: gemma4 family (gemma4:e4b local + gemma4:26b remote)
- Docs split: update the three siblings README / HANDOFF / ARCHITECTURE together (implementation PRs)
- Ranking bias: automatic heuristics deferred (re-brainstorm after 1+ week of real use)
- Skip user review gates: skip the brainstorming / writing-plans confirm steps. Use AskUserQuestion only for key trade-offs.
- Serial cargo, -j 4 default: never run cargo concurrently in the background. -j 4 default, -j 8 fast mode, -j 1 as OOM fallback only
- No caveman style: natural Korean prose
- Teammate model routing: executor + initial draft + round 1 review = opus, closure verify / micro-patch round = sonnet
1.4 OMC team workflow pattern
Sub-items 1/2/3 all used the following pattern:
- Phase A (spec): planner (opus) drafts the spec → critic (opus) round 1 thorough → critic (sonnet) round 2+ closure verify → APPROVE
- Phase B (plan): planner (opus) decomposes the plan → critic-plan + verifier-plan (opus) round 1 in parallel → round 2+ sonnet closure verify → ACCEPT
- Phase C (executor): executor (opus) implements the plan steps exactly + 1 clean commit
- Phase D (PR): team-lead creates the gitea-pr + round 1 self-review APPROVE
Convergence-failure detection: if a round N finding reappears at the same location in the round N+1 closure, mark it NEEDS_DISCUSSION.
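The convergence rule can be made concrete with a small sketch. Everything here is hypothetical and not part of the kebab codebase: the `Finding` shape (file + line as the "location"), the `Verdict` enum, and `check_convergence` are illustrative names only.

```rust
use std::collections::HashSet;

/// Hypothetical review finding, keyed by the location where it was reported.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct Finding {
    file: String,
    line: u32,
}

#[derive(Debug, PartialEq)]
enum Verdict {
    Continue,
    NeedsDiscussion,
}

/// Flags non-convergence when any round N location shows up again in round N+1.
fn check_convergence(round_n: &[Finding], round_n_plus_1: &[Finding]) -> Verdict {
    let previous: HashSet<&Finding> = round_n.iter().collect();
    if round_n_plus_1.iter().any(|f| previous.contains(f)) {
        Verdict::NeedsDiscussion
    } else {
        Verdict::Continue
    }
}
```

A finding that moves to a different location counts as new under this sketch; whether that matches the intended rule is a judgment call for the team-lead.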
1.5 Build / dogfooding environment
- working directory: /home/altair823/kebab
- branch convention: refactor/<sub-item-name> or feat/<feature-name>
- CARGO_TARGET_DIR=/build/out/cargo-target/target
- release binary: /build/out/cargo-target/target/release/kebab (v0.19.0)
- dogfood KB: /home/altair823/KnowledgeBase/ (user XDG default workspace)
- merged config (extended include): /tmp/kebab-dogfood-merged.toml
- Ollama: http://localhost:11434 (gemma3:4b + gemma4:e4b local), http://192.168.0.47:11434 (remote, user config default)
- dogfooding fixture: /home/altair823/KnowledgeBase/_dogfood-v0.19.0/ (4 markdown + 7 code lang)
1.6 Current implementation state (as of v0.19.0, image + pdf)
Use this as a reference before the new session starts making changes. Every file:line is relative to main HEAD = c1e82cc.
1.6.1 Image: implemented features
crates/kebab-parse-image/src/lib.rs:
- `pub const PARSER_VERSION: &str = "image-meta-v1";` (line 47)
- `pub const MAX_DECODE_DIM: u32 = 16_384;` (line 51)
- `pub struct ImageExtractor;` (line 55, unit struct)
- `impl ImageExtractor` (line 57-61): `pub fn new() -> Self { Self }`
- `impl Default for ImageExtractor` (line 64)
- `impl Extractor for ImageExtractor` (line 69-120):
  - `fn supports(&self, m: &MediaType) -> bool { matches!(m, MediaType::Image(_)) }`
  - `fn parser_version(&self) -> ParserVersion { ParserVersion("image-meta-v1".to_string()) }`
  - `fn extract(&self, ctx: &ExtractContext<'_>, bytes: &[u8]) -> Result<CanonicalDocument>`:
    - `dims::probe(bytes)`: extracts width / height / format
    - `exif_extract::extract_whitelisted(bytes)`: whitelisted fields only (DateTimeOriginal / Make / Model / GPS etc.), for privacy
    - `SourceSpan::Region { x: 0, y: 0, w: width, h: height }` span
    - doc_id generated with `id_for_doc(&asset.workspace_path, &asset.asset_id, &parser_version)`
    - returns 1 image-meta block + (warnings = corrupt EXIF etc.)
crates/kebab-parse-image/src/ocr.rs:
- `pub fn apply_ocr(...)` (line 82): calls the vision LLM to turn an image into OCR text, then appends an OCR text block to the doc (mutating helper).
- `pub struct OllamaVisionOcr`: Ollama vision endpoint binding. Built with `OllamaVisionOcr::new(&app.config)` (lib.rs:340).
- `pub trait OcrEngine`: trait for future OCR backends (Tesseract / PaddleOCR). Currently only `OllamaVisionOcr` implements it.
crates/kebab-parse-image/src/caption.rs:
- `pub fn apply_caption(...)` (line 162): calls the vision LLM to turn an image into a descriptive caption, then appends a caption block to the doc.
- `pub fn caption_image(...)`: internal helper.
crates/kebab-parse-image/src/lib.rs re-exports:
- `pub mod caption;` (line 31)
- `pub mod ocr;` (line 32)
- `pub use caption::{apply_caption, caption_image};` (line 34)
- `pub use ocr::{OcrEngine, OllamaVisionOcr, apply_ocr};` (line 35)
crates/kebab-app/src/lib.rs (image ingest wiring):
- `use kebab_parse_image::{OllamaVisionOcr, apply_caption, apply_ocr};` (line 51)
- `let ocr_engine: Option<OllamaVisionOcr> = if app.config.image.ocr.enabled { Some(OllamaVisionOcr::new(&app.config)?) } else { None };` (line 338-347)
- `let image_pipeline = ImagePipeline { ocr_engine: ocr_engine.as_ref(), caption_llm: caption_llm.as_ref() };` (line 354-361)
- `struct ImagePipeline<'a> { ocr_engine: Option<&'a OllamaVisionOcr>, caption_llm: Option<...> }` (line 756-758)
  - After PR #187 the ImagePipeline.extractor field was removed (Option c of sub-item 3)
- `fn ingest_one_image_asset(..., image_pipeline: &ImagePipeline<'_>) -> ...` (line 1227)
- ingest flow (line 1283-1336):
  - `let ctx = ExtractContext { asset: &asset, ... };` (1283)
  - `let mut canonical = app.extract_for(&asset.media_type, &ctx, &bytes)?;` (1296, PR #187 applied)
  - `if image_pipeline.ocr_engine.is_some() { apply_ocr(&mut canonical, ocr_engine, &bytes)? }` (1308)
  - `if image_pipeline.caption_llm.is_some() { apply_caption(&mut canonical, caption_llm, &bytes)? }` (1326)
  - chunker (`kebab-chunk::image_meta_v1`) → embedder
crates/kebab-app/src/app.rs:227:
- `Box::new(ImageExtractor::new()),`: `App.extractors` registry entry 1 of 11 (the polymorphic dispatch from PR #187).
crates/kebab-config/src/lib.rs (image config):
- `image.ocr.enabled` (default `false`, opt-in), assertion at line 1194
- `image.ocr.engine` = `"ollama-vision"` (default), line 1195
- `image.ocr.model`: `gemma4:e4b` (presumed user config default; env override `KEBAB_IMAGE_OCR_MODEL`)
- `image.ocr.languages` = `["eng", "kor"]` (default)
- `image.ocr.max_pixels` = `1600` (default)
- `image.caption.enabled` (default `false`, opt-in), line 1406
- `image.caption.max_pixels` = `768` (default), line 1407
- `image.caption.prompt_template_version` = `"caption-v1"`
crates/kebab-chunk/src/image_meta_v1.rs (chunker):
- chunker_version = `"image-meta-v1"`
- converts 1 image-meta block + (optional) 1 OCR text block + (optional) 1 caption block into chunks
1.6.2 PDF: implemented features
crates/kebab-parse-pdf/src/lib.rs:
- `pub const PARSER_VERSION: &str = "pdf-text-v1";` (line 32)
- `pub struct PdfTextExtractor;` (line 37, unit struct)
- `impl PdfTextExtractor` (line 39-43): `pub fn new() -> Self { Self }`
- `impl Default for PdfTextExtractor` (line 45)
- `impl Extractor for PdfTextExtractor` (line 51-200+):
  - `fn supports(&self, m: &MediaType) -> bool { matches!(m, MediaType::Pdf) }`
  - `fn parser_version(&self) -> ParserVersion { ParserVersion("pdf-text-v1".to_string()) }`
  - `fn extract(&self, ctx: &ExtractContext<'_>, bytes: &[u8]) -> Result<CanonicalDocument>`:
    - `lopdf::Document::load_mem(bytes)?`: catastrophic decode guard (rejects corrupt PDFs)
    - rejects encrypted PDFs: `anyhow::bail!("encrypted PDF; remove encryption (e.g. qpdf --decrypt) before ingest")`
    - `info::extract_info(&pdf_doc)`: PDF metadata such as Title / Author / Subject / Keywords
    - `pdf_doc.get_pages()` BTreeMap (1-based, deterministic ordering)
    - per-page text extraction + per-page `ProvenanceEvent`
    - returns `Vec<Block>` (1 page = 1 block, text-only)
crates/kebab-parse-pdf/src/info.rs:
- `pub fn extract_info(pdf_doc: &lopdf::Document) -> Map<String, Value>`: PDF metadata extraction helper.
crates/kebab-app/src/lib.rs (pdf ingest wiring):
- `use kebab_parse_pdf::PdfTextExtractor;` (line 54)
- `fn ingest_one_pdf_asset(...)` (line ~1699)
- ingest flow (line 1777-1850):
  - `let mut canonical = app.extract_for(&asset.media_type, &ctx, &bytes)?;` (1783, PR #187 applied)
  - chunker (`kebab-chunk::pdf_page_v1`) → embedder
- OCR / caption / image extraction / table extraction for PDF are all unimplemented (the targets of TODO #1, #2, #4)
crates/kebab-app/src/app.rs:228:
- `Box::new(PdfTextExtractor::new()),`: `App.extractors` registry entry 2 of 11 (the polymorphic dispatch from PR #187).
crates/kebab-chunk/src/pdf_page_v1.rs (chunker):
- chunker_version = `"pdf-page-v1"` (verified: matches `ParserVersion("pdf-text-v1".into())` at line 303)
- per-page chunks (1 page = 1 chunk; a large page can be split by token budget)
1.6.3 Polymorphic dispatch for Image + PDF (PR #187 applied)
crates/kebab-app/src/app.rs (registry init):
// crates/kebab-app/src/app.rs:225-240 (Step 3 of PR #187 plan)
let extractors: Vec<Box<dyn Extractor + Send + Sync>> = vec![
Box::new(ImageExtractor::new()), // entry 1
Box::new(PdfTextExtractor::new()), // entry 2
Box::new(RustAstExtractor::new()), // entry 3
Box::new(PythonAstExtractor::new()), // entry 4
Box::new(TypescriptAstExtractor::new()), // entry 5
Box::new(JavascriptAstExtractor::new()), // entry 6
Box::new(GoAstExtractor::new()), // entry 7
Box::new(JavaAstExtractor::new()), // entry 8
Box::new(KotlinAstExtractor::new()), // entry 9
Box::new(CAstExtractor::new()), // entry 10
Box::new(CppAstExtractor::new()), // entry 11
];
crates/kebab-app/src/app.rs (App.extract_for helper):
// pub(crate) fn extract_for(...)
pub(crate) fn extract_for(
&self,
media: &MediaType,
ctx: &ExtractContext<'_>,
bytes: &[u8],
) -> anyhow::Result<CanonicalDocument> {
self.extractors
.iter()
.find(|e| e.supports(media))
.ok_or_else(|| anyhow::anyhow!("no matching extractor for media {:?}", media))?
.extract(ctx, bytes)
}
crates/kebab-app/src/app.rs (in-crate unit test, mod tests_extractor_dispatch):
- registry length = 11 test
- mutually-exclusive supports() grid 16 sample test
- extract_for "no matching extractor" error path (Audio(Wav) MediaType)
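The registry-plus-helper pattern above can be exercised in isolation. The following is a minimal sketch under stated assumptions, not the kebab code: `MediaType` is reduced to four unit variants, `extract` returns a `String` instead of `CanonicalDocument`, errors are plain `String` rather than `anyhow::Error`, and `claim_count` is a hypothetical helper standing in for the mutually-exclusive `supports()` check.

```rust
// Simplified stand-ins; the real types live in kebab-core and differ in detail.
#[derive(Debug, Clone, PartialEq)]
enum MediaType {
    Markdown,
    Pdf,
    Image,
    Audio,
}

trait Extractor {
    fn supports(&self, media: &MediaType) -> bool;
    fn extract(&self, bytes: &[u8]) -> Result<String, String>;
}

struct ImageExtractor;
struct PdfTextExtractor;

impl Extractor for ImageExtractor {
    fn supports(&self, media: &MediaType) -> bool {
        matches!(media, MediaType::Image)
    }
    fn extract(&self, bytes: &[u8]) -> Result<String, String> {
        Ok(format!("image-meta-v1:{}", bytes.len()))
    }
}

impl Extractor for PdfTextExtractor {
    fn supports(&self, media: &MediaType) -> bool {
        matches!(media, MediaType::Pdf)
    }
    fn extract(&self, bytes: &[u8]) -> Result<String, String> {
        Ok(format!("pdf-text-v1:{}", bytes.len()))
    }
}

struct App {
    extractors: Vec<Box<dyn Extractor + Send + Sync>>,
}

impl App {
    fn new() -> Self {
        let extractors: Vec<Box<dyn Extractor + Send + Sync>> =
            vec![Box::new(ImageExtractor), Box::new(PdfTextExtractor)];
        Self { extractors }
    }

    // The first extractor whose supports() matches wins; no match is an error.
    fn extract_for(&self, media: &MediaType, bytes: &[u8]) -> Result<String, String> {
        self.extractors
            .iter()
            .find(|e| e.supports(media))
            .ok_or_else(|| format!("no matching extractor for media {:?}", media))?
            .extract(bytes)
    }

    // Hypothetical helper: how many extractors claim a media type.
    // A count of 0 or 1 for every type keeps first-match dispatch unambiguous.
    fn claim_count(&self, media: &MediaType) -> usize {
        self.extractors.iter().filter(|e| e.supports(media)).count()
    }
}
```

Because `find` stops at the first match, registry order only matters if two extractors claim the same media type, which is what the mutually-exclusive grid test guards against.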
1.6.4 Wire schema for Image + PDF (current v0.19.0)
- `ingest_progress.v1`: the `media` field states `"image"` / `"pdf"`. The `asset_started` / `asset_finished` events state the chunk count.
- `ingest_report.v1.IngestItem`: carries `parser_version` (`"image-meta-v1"` / `"pdf-text-v1"`), `chunker_version` (`"image-meta-v1"` / `"pdf-page-v1"`), `block_count`, `chunk_count`, and `warnings`.
- `search_response.v1.SearchHit.chunker_version`: exact dispatch (`"image-meta-v1"` / `"pdf-page-v1"`).
- `citation.v1.citation`: `kind: "image"` / `"pdf"`, `path`, `line_start` / `line_end` (page number for PDF).
- `chunk_inspection.v1.canonical_document`: dumps `parser_version`, `last_chunker_version`, `last_embedding_version`, `blocks`, and `provenance.events`.
1.6.5 MediaType / dispatch boundary
crates/kebab-core/src/media.rs:
- `pub enum MediaType { Markdown, Pdf, Image(ImageType), Audio(AudioType), Code(String), Other }`
- `ImageType` variants: `Jpeg / Png / Webp / Heic / Gif / Bmp / Tiff / Svg` etc. (needs checking against the real enum)
- `AudioType`: P8 deferred, currently 0 production callers
crates/kebab-source-fs/src/media.rs (file extension → MediaType detection):
- `image_extensions` (jpg / jpeg / png / webp / heic / heif / gif / bmp / tiff / svg)
- `pdf_extension` (pdf only)
- other extensions dispatch to markdown / code / other
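As an illustration of the extension-based boundary, here is a minimal sketch. It is not the kebab-source-fs implementation: `MediaKind`, `IMAGE_EXTENSIONS`, and `detect_media` are hypothetical names, the `md` mapping and case-insensitive matching are assumptions, and code-language detection is left out entirely.

```rust
use std::path::Path;

// Hypothetical, flattened stand-in for MediaType (no ImageType / Code payloads).
#[derive(Debug, PartialEq)]
enum MediaKind {
    Markdown,
    Pdf,
    Image,
    Other,
}

// The image extension list as stated in this document.
const IMAGE_EXTENSIONS: [&str; 10] = [
    "jpg", "jpeg", "png", "webp", "heic", "heif", "gif", "bmp", "tiff", "svg",
];

/// Maps a path to a media kind by its extension.
/// Assumption: matching is case-insensitive and "md" means markdown.
fn detect_media(path: &Path) -> MediaKind {
    let ext = match path.extension().and_then(|e| e.to_str()) {
        Some(e) => e.to_ascii_lowercase(),
        None => return MediaKind::Other,
    };
    if IMAGE_EXTENSIONS.contains(&ext.as_str()) {
        MediaKind::Image
    } else if ext == "pdf" {
        MediaKind::Pdf
    } else if ext == "md" {
        MediaKind::Markdown
    } else {
        MediaKind::Other
    }
}
```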
1.6.6 Three dead structs: future surface (kept after the PR #186 merge)
crates/kebab-parse-md/src/types.rs (after the sub-item 2 absorption):
- `pub struct ParsedImageRegion;` (line ~85, 0 production callers; future surface for TODO #2 multi-region)
- `pub struct ParsedPdfPage { ... }` (line ~88, 0 production callers; future surface for TODO #3 PDF normalize integration)
- `pub struct ParsedAudioSegment { ... }` (line ~94, 0 production callers; future surface for the P8 audio parser)
All three struct definitions are preserved intact inside kebab-parse-md. Once the first production caller appears in TODO #2 / #3 of v0.20.0, the raison d'être described in spec §11 is reactivated.
2. TODOs for v0.20.0 (in priority order)
2.1 TODO #1: PDF scanned OCR path (HIGHEST PRIORITY, aligned with the user's P9 books + PDF focus)
Problem: The current PdfTextExtractor (crates/kebab-parse-pdf/src/lib.rs) is lopdf-based and text-only. A scanned PDF with no embedded text (book scans / paper scans / receipt scans) yields empty chunks, so search returns 0 results.
Scope:
- Scanned PDF detection: classify a page as scanned when its extracted text falls below a threshold (e.g. 50 chars / page)
- Render scanned pages to images (evaluate dependencies such as pdfium-render or mupdf-rs)
- OCR the rendered image with `OllamaVisionOcr` → text block
- The pdf-page-v1 chunker also covers OCR text
Affected files:
- `crates/kebab-parse-pdf/src/lib.rs`: scanned-detection logic in PdfTextExtractor
- `crates/kebab-parse-pdf/Cargo.toml`: add the pdfium-render or mupdf dependency
- `crates/kebab-parse-pdf/src/scanned_ocr.rs` (new): render + OCR pipeline
- `crates/kebab-app/src/lib.rs:1696-1850`: a scanned-path branch in ingest_one_pdf_asset, or integration inside PdfTextExtractor
- config: `pdf.ocr.enabled` + `pdf.ocr.engine` (siblings of image.ocr)
Risk: binary size and cross-platform support of the PDF rendering dependency (pdfium etc.).
Trigger condition: a book / paper PDF has been indexed but search returns 0 results (measured during user dogfooding).
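The detection step in the scope can be sketched as a pure function over already-extracted page text. This is a sketch under assumptions, not a decided design: `PageKind`, `classify_page`, and `pages_needing_ocr` are hypothetical names, and counting only non-whitespace characters is one possible reading of "50 chars / page".

```rust
/// Per-page classification for the hypothetical scanned-PDF path.
#[derive(Debug, PartialEq)]
enum PageKind {
    Text,
    Scanned,
}

/// Example threshold from the scope: 50 characters per page.
const MIN_CHARS_PER_PAGE: usize = 50;

/// Classifies a page by counting non-whitespace characters in its extracted text.
/// Assumption: whitespace-only output from the text layer counts as "no text".
fn classify_page(extracted_text: &str, min_chars: usize) -> PageKind {
    let visible = extracted_text.chars().filter(|c| !c.is_whitespace()).count();
    if visible < min_chars {
        PageKind::Scanned
    } else {
        PageKind::Text
    }
}

/// Returns the 1-based page numbers that would need render + OCR.
fn pages_needing_ocr(pages: &[(u32, String)], min_chars: usize) -> Vec<u32> {
    pages
        .iter()
        .filter(|(_, text)| classify_page(text, min_chars) == PageKind::Scanned)
        .map(|(n, _)| *n)
        .collect()
}
```

A per-page decision like this lets mixed PDFs (text pages plus scanned inserts) send only the scanned pages through the slower render + OCR path. Legitimately short pages such as title pages would also be flagged, so the threshold probably needs tuning against the dogfooding KB.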
2.2 TODO #2: Multi-region image dispatch (medium priority)
Problem: The current ImageExtractor maps one image to 1 image-meta block + (optional) 1 OCR text block + (optional) 1 caption block. It does not separate multiple regions inside an image (distinct text bboxes / face bboxes / figure bboxes, etc.).
Scope:
- `ParsedImageRegion` struct (already defined at `crates/kebab-parse-md/src/types.rs:85`, 0 production callers)
- Change `ImageExtractor::extract` to emit `Vec<ParsedImageRegion>` (multi-region detection)
- Source for region separation: OCR bounding boxes (Tesseract or PaddleOCR layout detection)
- New lift `kebab-parse-md::build_canonical_document_from_image_regions(...)`
- Per-region chunks → finer search granularity
Affected files:
- `crates/kebab-parse-image/src/lib.rs`: change what ImageExtractor emits
- `crates/kebab-parse-image/src/regions.rs` (new): multi-region detection logic
- `crates/kebab-parse-md/src/normalize.rs`: new `build_canonical_document_from_image_regions`
- `crates/kebab-chunk/src/image_region_v1.rs` (new): per-region chunker
- `crates/kebab-app/src/lib.rs:1209-1336`: changes to ingest_one_image_asset
Risk: false positives in region detection (overlapping regions detected multiple times, so the chunker needs dedup).
Trigger condition: a real multi-region image use case (e.g. searching the name region and the email region of a scanned business card separately).
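One way to handle the overlap risk is a greedy intersection-over-union (IoU) filter. This is a suggestion only, since this document does not pick a dedup technique; the `Region` struct mirrors the `x / y / w / h` fields of `SourceSpan::Region`, and `iou`, `dedup_regions`, and the threshold are hypothetical.

```rust
/// Axis-aligned box in pixels, mirroring the x / y / w / h of SourceSpan::Region.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Region {
    x: u32,
    y: u32,
    w: u32,
    h: u32,
}

/// Intersection-over-union of two regions, in the range 0.0 to 1.0.
fn iou(a: &Region, b: &Region) -> f64 {
    // Widen to u64 so x + w cannot overflow.
    let (ax, ay, aw, ah) = (a.x as u64, a.y as u64, a.w as u64, a.h as u64);
    let (bx, by, bw, bh) = (b.x as u64, b.y as u64, b.w as u64, b.h as u64);
    let left = ax.max(bx);
    let top = ay.max(by);
    let right = (ax + aw).min(bx + bw);
    let bottom = (ay + ah).min(by + bh);
    if right <= left || bottom <= top {
        return 0.0; // no overlap (also covers zero-area boxes)
    }
    let inter = ((right - left) * (bottom - top)) as f64;
    let union = (aw * ah + bw * bh) as f64 - inter;
    inter / union
}

/// Greedy dedup: keep a region only if it overlaps every kept region
/// by less than `threshold`. Earlier regions win ties.
fn dedup_regions(regions: &[Region], threshold: f64) -> Vec<Region> {
    let mut kept: Vec<Region> = Vec::new();
    for r in regions {
        if kept.iter().all(|k| iou(k, r) < threshold) {
            kept.push(*r);
        }
    }
    kept
}
```

Whether dedup belongs in the extractor or in the region chunker is an open design choice; the risk line above names the chunker.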
2.3 TODO #3: PDF normalize integration (medium priority)
Problem: The current PdfTextExtractor emits a CanonicalDocument directly (bypassing normalize). This makes cross-page references and doc-level aggregation of page-level metadata difficult.
Scope:
- `ParsedPdfPage` struct (already defined at `crates/kebab-parse-md/src/types.rs:88`, 0 production callers)
- Change `PdfTextExtractor::extract` to emit `Vec<ParsedPdfPage>`
- New lift `kebab-parse-md::build_canonical_document_from_pdf_pages(...)`: per-page provenance + cross-page reference graph + page-level doc-summary
- The `pdf-page-v1` chunker also covers page-level metadata
Affected files:
- `crates/kebab-parse-pdf/src/lib.rs`: change what PdfTextExtractor emits
- `crates/kebab-parse-md/src/normalize.rs`: new `build_canonical_document_from_pdf_pages`
- `crates/kebab-chunk/src/pdf_page_v1.rs`: multi-page handling in the chunker
- `crates/kebab-app/src/lib.rs:1696-1850`: lift path in ingest_one_pdf_asset
Risk: complexity of the cross-page reference graph (page N referring to a figure on page M). False positives would leave spurious entries in the doc-summary.
Trigger condition: when search/ask needs PDF cross-page navigation (e.g. "Figure 3 on page 7 is explained on page 12").
2.4 TODO #4: Per-page image / table extraction (low-medium priority)
Problem: Embedded images and tables inside a PDF are not extracted at all. The flow is text-only.
Scope:
- Extract figures / tables from PDF pages (lopdf image stream extraction + table detection)
- Figure: convert to an image block → `ParsedImageRegion` (depends on TODO #2)
- Table: convert to a structured Block (e.g. a markdown table or CSV-like form)
- `pdf-table-v1` chunker (new)
Affected files:
- `crates/kebab-parse-pdf/src/figure.rs` + `table.rs` (new)
- `crates/kebab-chunk/src/pdf_table_v1.rs` (new)
Risk: PDF figures / tables come in many encodings (vector vs raster, merged table cells), so robust detection is hard.
Trigger condition: figure / table search is needed in paper / report PDFs (consistent with the book-PDF focus of the user's current P9 priority).
2.5 TODO #5: Fold OCR / caption into the Extractor trait (low priority)
Problem: apply_ocr / apply_caption (crates/kebab-parse-image/src/{ocr,caption}.rs) are currently separate free functions that do not use the Extractor trait. app.extract_for(image, ...) calls only the base ImageExtractor, and OCR/caption run as post-processing at the callsite (ingest_one_image_asset).
Scope (option A): Model OCR / caption as a separate Enricher trait. Add an App.enrichers: Vec<Box<dyn Enricher + Send + Sync>> registry and a polymorphic app.enrich_for(doc, &media_type, &bytes) call.
Scope (option B): Fold OCR / caption into Extractor as well. OcrEnricherExtractor + CaptionEnricherExtractor dispatch through Extractor::supports(), and Extractor::extract() changes only part of the doc. This may violate the spec contract.
Affected files:
- `crates/kebab-parse-image/src/ocr.rs` + `caption.rs`: add Enricher trait impls
- `crates/kebab-core/src/traits.rs`: Enricher trait definition (option A)
- `crates/kebab-app/src/app.rs`: `enrichers` field + `enrich_for` helper
- `crates/kebab-app/src/lib.rs:1209-1336`: enrichment dispatch in ingest_one_image_asset
Risk: a new trait requires updating design §7.2 → frozen contract change → triggers a workspace.version bump.
Trigger condition: when new enrichments are added (e.g. embedding-based image classification, depth estimation, OCR + translate) and a polymorphic registry becomes a real need.
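Option A could look roughly like the sketch below. It is a sketch under assumptions, not a proposed final API: the `Enricher` method names, the `Doc` stand-in for `CanonicalDocument`, the two-variant `MediaType`, and the stub enrichers (which push a fixed label instead of calling a vision LLM) are all hypothetical.

```rust
// Simplified stand-ins for kebab-core types.
#[derive(Debug, Clone, PartialEq)]
enum MediaType {
    Image,
    Pdf,
}

#[derive(Debug, Default, PartialEq)]
struct Doc {
    blocks: Vec<String>,
}

/// Hypothetical Enricher trait: mutates an already-extracted document.
trait Enricher {
    fn supports(&self, media: &MediaType) -> bool;
    fn enrich(&self, doc: &mut Doc, bytes: &[u8]) -> Result<(), String>;
}

struct OcrEnricher;
struct CaptionEnricher;

impl Enricher for OcrEnricher {
    fn supports(&self, media: &MediaType) -> bool {
        matches!(media, MediaType::Image)
    }
    fn enrich(&self, doc: &mut Doc, _bytes: &[u8]) -> Result<(), String> {
        doc.blocks.push("ocr-text".to_string()); // stub: no vision LLM call here
        Ok(())
    }
}

impl Enricher for CaptionEnricher {
    fn supports(&self, media: &MediaType) -> bool {
        matches!(media, MediaType::Image)
    }
    fn enrich(&self, doc: &mut Doc, _bytes: &[u8]) -> Result<(), String> {
        doc.blocks.push("caption".to_string()); // stub: no vision LLM call here
        Ok(())
    }
}

struct App {
    enrichers: Vec<Box<dyn Enricher + Send + Sync>>,
}

impl App {
    fn with_image_enrichers() -> Self {
        let enrichers: Vec<Box<dyn Enricher + Send + Sync>> =
            vec![Box::new(OcrEnricher), Box::new(CaptionEnricher)];
        Self { enrichers }
    }

    /// Runs every matching enricher in registry order and returns how many ran.
    fn enrich_for(&self, doc: &mut Doc, media: &MediaType, bytes: &[u8]) -> Result<usize, String> {
        let mut applied = 0;
        for enricher in self.enrichers.iter().filter(|e| e.supports(media)) {
            enricher.enrich(doc, bytes)?;
            applied += 1;
        }
        Ok(applied)
    }
}
```

Unlike `extract_for`, which stops at the first match, this sketch runs all matching enrichers, so registry order determines block order (OCR before caption, as in the current callsite). The config gating (`image.ocr.enabled`, `image.caption.enabled`) is not modeled here.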
2.6 TODO #6: Add MarkdownExtractor (low priority, sub-item 3 §11 future-work)
Problem: Markdown is currently the only free-function path (parse_frontmatter + parse_blocks + build_canonical_document). The other parsers (pdf / image / code) have Extractor impls. Only markdown calls free functions from the markdown arm of the outer 4-arm match.
Scope:
- `kebab-parse-md::MarkdownExtractor` struct + Extractor trait impl
- `supports(MediaType::Markdown)`
- `extract(ctx, bytes) -> CanonicalDocument` calls `parse_frontmatter` + `parse_blocks` + `build_canonical_document` internally
- 12th entry in the `App.extractors` registry (adding markdown), so that `extract_for` dispatches markdown too
Challenge (found by the round 1 critic of sub-item 3): warning-channel handover on the markdown path. Today the Vec<Warning> from parse_frontmatter + parse_blocks is captured separately inside ingest_one_markdown_asset and flows into IngestItem.warnings (wire ingest_report.v1). The signature MarkdownExtractor::extract(&ctx, &bytes) -> Result<CanonicalDocument> has no separate channel, so the wire schema may be affected.
Affected files:
- `crates/kebab-parse-md/src/extractor.rs` (new)
- `crates/kebab-parse-md/src/lib.rs`: `pub mod extractor;` + `pub use crate::extractor::MarkdownExtractor;`
- `crates/kebab-app/src/app.rs:225-240`: add the registry entry
- `crates/kebab-app/src/lib.rs:1083-1170`: measure precisely how far the helper signatures on the markdown ingest path must change
Risk: the wire schema (ingest_report.v1.IngestItem.warnings) may change. The options are an additive minor wire-schema bump (v1.x) or a trait signature change (v2).
Trigger condition: a use case where polymorphic dispatch for markdown becomes a real need (e.g. new markdown variants such as kramdown / CommonMark + extensions).
2.7 TODO #7: Chunker dispatch unification (low priority, sub-item 3 §11 future-work)
Problem: The Chunker trait currently has no supports() method. The chunker dispatch in kebab-app/src/lib.rs (parser_version 11-arm + chunker_version 11-arm + tier3_fallback_cv 2-arm + chunk dispatch 13-arm) is implemented as inner matches.
Scope:
- Add a `supports(media: &MediaType) -> bool` or `supports(chunker_version: &str) -> bool` method to the `Chunker` trait
- `App.chunkers: Vec<Box<dyn Chunker + Send + Sync>>` registry
- `app.chunk_for(...)` polymorphic call
- Remove the four inner matches
Affected files:
- `crates/kebab-core/src/traits.rs`: add the method to the Chunker trait
- every Chunker impl (15 places in `kebab-chunk`: md-heading / pdf-page / 9 code-ast / code-text-paragraph / manifest-file / dockerfile-file / k8s-manifest-resource): add supports()
- `crates/kebab-app/src/app.rs`: chunkers field + chunk_for helper
- `crates/kebab-app/src/lib.rs:1935-2128`: remove the 4 inner matches
Risk: requires updating design §7.2 (frozen contract change, workspace.version bump).
Trigger condition: when new chunker variants are added (e.g. a semantic chunker or a sentence-level chunker) and a polymorphic registry becomes a real need.
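The `supports(chunker_version: &str)` variant can be sketched as follows. This is an illustration under assumptions rather than the planned trait: the default `supports` body, the `chunk(&str) -> Vec<String>` signature, and the toy chunking rules (whole text as one chunk, form-feed page split) are hypothetical. Only the version strings `"image-meta-v1"` and `"pdf-page-v1"` come from this document.

```rust
/// Hypothetical Chunker trait with version-keyed dispatch.
trait Chunker {
    fn chunker_version(&self) -> &'static str;

    /// Default: a chunker supports exactly its own version string.
    fn supports(&self, chunker_version: &str) -> bool {
        self.chunker_version() == chunker_version
    }

    fn chunk(&self, text: &str) -> Vec<String>;
}

struct ImageMetaChunker;
struct PdfPageChunker;

impl Chunker for ImageMetaChunker {
    fn chunker_version(&self) -> &'static str {
        "image-meta-v1"
    }
    // Toy rule: the whole text becomes one chunk.
    fn chunk(&self, text: &str) -> Vec<String> {
        vec![text.to_string()]
    }
}

impl Chunker for PdfPageChunker {
    fn chunker_version(&self) -> &'static str {
        "pdf-page-v1"
    }
    // Toy rule: pages are separated by a form feed character.
    fn chunk(&self, text: &str) -> Vec<String> {
        text.split('\u{000C}').map(str::to_string).collect()
    }
}

struct App {
    chunkers: Vec<Box<dyn Chunker + Send + Sync>>,
}

impl App {
    fn new() -> Self {
        let chunkers: Vec<Box<dyn Chunker + Send + Sync>> =
            vec![Box::new(ImageMetaChunker), Box::new(PdfPageChunker)];
        Self { chunkers }
    }

    /// First chunker that supports the version wins; no match is an error.
    fn chunk_for(&self, chunker_version: &str, text: &str) -> Result<Vec<String>, String> {
        self.chunkers
            .iter()
            .find(|c| c.supports(chunker_version))
            .map(|c| c.chunk(text))
            .ok_or_else(|| format!("no matching chunker for version {:?}", chunker_version))
    }
}
```

Keying on the version string keeps the registry lookup aligned with the `chunker_version` already carried on the wire. Keying on `MediaType` would instead mirror `extract_for`, but it needs a rule for media types that map to several chunkers (the nine code-ast variants, for example).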
2.8 TODO #8: Unify the outer 4-arm match (low priority, sub-item 3 §11)
Problem: The 4-arm match &asset.media_type (markdown / pdf / image / code) in ingest_one_asset (lib.rs:961-1040) calls hardcoded helpers (ingest_one_markdown_asset / ingest_one_pdf_asset / ingest_one_image_asset / ingest_one_code_asset).
Scope:
- Introduce a unified signature for the per-medium helpers, using a context object such as `IngestEnv`
- `app.ingest_one(...)` polymorphic call
Risk: each helper has different enrichment (OCR/caption for image, lopdf for pdf, language detection for code), which is hard to unify and implies a large refactor.
Trigger condition: when a new medium is added (e.g. audio in P8 or video in P+) and unified dispatch becomes a real need.
3. Priorities + recommended sequencing
| Order | TODO | Effect | Dependencies |
|---|---|---|---|
| 1 | TODO #1 (PDF scanned OCR) | Directly unblocks the user's P9 books + PDF use case | OllamaVisionOcr (already implemented) |
| 2 | TODO #3 (PDF normalize integration) | Enables cross-page references + doc-summary | Can be done independently of TODO #1 |
| 3 | TODO #2 (multi-region image) | Finer image search granularity | Needs OCR bbox detection |
| 4 | TODO #4 (PDF figure/table) | Figure/table search in paper/report PDFs | TODO #2 + #3 |
| 5 | TODO #5 (OCR/caption Enricher trait) | Architecture cleanup | After the enrichment in TODO #1, #2 is stable |
| 6 | TODO #6 (new MarkdownExtractor) | Dispatch unification | Needs a wire schema change |
| 7 | TODO #7 (Chunker dispatch) | Dispatch unification | Design §7.2 update |
| 8 | TODO #8 (outer 4-arm unification) | Full polymorphic dispatch | TODO #6 + #7 |
Recommendation: start with TODO #1 + #3 as the first two sub-items of v0.20.0 (they directly unblock the user's use case). TODO #2 + #4 become sub-items 3 and 4 of v0.20.0. Defer TODO #5 through #8 to v0.21+ (the §11 future-work of sub-item 3 already defers them to separate PRs).
4. First steps for the new session (suggested)
- Check the state:
  - `git status` + `git log --oneline -5` (HEAD = `c1e82cc` or later)
  - `cat ~/.claude/projects/-home-altair823-kebab/memory/MEMORY.md`
  - `cat docs/superpowers/handoffs/2026-05-26-v0.20-image-pdf-normalize-handoff.md` (this document)
- Confirm the first sub-item with the user:
  - "Start with TODO #1 (PDF scanned OCR)?" (the one most directly tied to the user's use case)
  - or "A different priority?"
- Draft the spec for the chosen TODO:
  - create a new team (following the branch convention of the `gitea-pr` workflow)
  - spawn the planner (opus)
  - follow the spec/plan pattern of sub-items 1/2/3
  - critic round 1 = opus, round 2+ = sonnet (model routing policy)
- Proceed through spec → plan → executor → PR + review loop (same pattern as the earlier sub-items).
5. Key files / resources / dependency locations
Codebase locations (change targets)
- `crates/kebab-parse-image/src/lib.rs`: ImageExtractor (line 55)
- `crates/kebab-parse-image/src/ocr.rs`: OllamaVisionOcr + apply_ocr
- `crates/kebab-parse-image/src/caption.rs`: apply_caption
- `crates/kebab-parse-pdf/src/lib.rs`: PdfTextExtractor (line 37)
- `crates/kebab-parse-md/src/types.rs`: ParsedImageRegion + ParsedPdfPage + ParsedAudioSegment (the three dead structs, future surface)
- `crates/kebab-parse-md/src/normalize.rs`: build_canonical_document (line 60)
- `crates/kebab-core/src/traits.rs`: Extractor + Chunker traits (line 115-132)
- `crates/kebab-app/src/app.rs:225-240`: App.extractors registry (the 11 entries from sub-item 3)
- `crates/kebab-app/src/lib.rs:961-1040`: ingest_one_asset outer 4-arm match
- `crates/kebab-app/src/lib.rs:1209-1336`: ingest_one_image_asset (OCR / caption flow)
- `crates/kebab-app/src/lib.rs:1696-1850`: ingest_one_pdf_asset
- `crates/kebab-chunk/src/pdf_page_v1.rs`: PDF chunker
- `crates/kebab-config/src/lib.rs:861-1430`: image.ocr / image.caption config
Spec / plan / doc locations (reference targets)
- `docs/superpowers/specs/2026-04-27-kebab-final-form-design.md`: frozen design contract (§3.7b 4-paragraph rewrite, §7.2 trait, §8 dep graph)
- `docs/superpowers/specs/2026-05-26-normalize-absorption-spec.md`: §11 future-work of sub-item 2
- `docs/superpowers/specs/2026-05-26-extractor-dispatch-unification-spec.md`: §11 future-work of sub-item 3
- `docs/superpowers/plans/2026-05-26-*-plan.md`: plans for the 3 sub-items
- `tasks/INDEX.md`: "Future work / deferred" section (added when sub-item 2 merged)
- `tasks/HOTFIXES.md`: design deviation entry (sub-item 2)
- `docs/SMOKE.md`: dogfooding procedure
- `docs/ARCHITECTURE.md`: crate dependency graph + locked-in decisions
- `HANDOFF.md`: phase-level progress dashboard
- `README.md`: end-user facing surface
External dependency candidates (PDF rendering for TODO #1)
- `pdfium-render` crate: Chrome PDFium binding, robust
- `mupdf` crate: MuPDF binding, lighter
- image stream extraction with `lopdf` (already in use) (TODO #4)
6. Model routing policy (per the user's decision)
Applies to every teammate spawned in the new session:
| Phase | Role | Model |
|---|---|---|
| Phase A | planner (spec drafter) | opus |
| Phase A | critic round 1 (thorough) | opus |
| Phase A | critic round 2+ (closure verify) | sonnet |
| Phase B | planner (plan decompose) | opus |
| Phase B | critic-plan + verifier-plan round 1 | opus |
| Phase B | critic-plan + verifier-plan round 2+ | sonnet |
| Phase C | executor | opus |
| Phase D | team-lead (self) | — |
See the memory entry feedback_teammate_model_routing.
7. Release plan
- The current v0.19.0 binary is built from main HEAD `c1e82cc` (after PR #185 + #186 + #187 merged). Dogfooding is complete.
- Running `gitea-release v0.19.0` is recommended (covering sub-items 1/2/3). If the user has not explicitly released, cut the release tag before starting the v0.20.0 sub-items in this session.
- Review the next release after TODO #1 and #3 of v0.20.0 are merged.
8. Verification invariants (acceptance for every v0.20.0 sub-item)
- 0 workspace test regressions (baseline = 1316, plus the new tests each sub-item adds)
- 0 wire schema changes, or a minor additive bump (`*.v1`)
- on a design contract change, update every frozen task spec or add a HOTFIXES live source
- v0.20.0 minor bump (when the frozen design contract changes) or patch bump (refactor only)
- byte-identical results on the dogfooding KB are preserved (success path)
- `cargo clippy --workspace --all-targets -- -D warnings` clean
- `cargo build --release -p kebab-cli -j 4` clean
9. Location and intent of this handoff document
docs/superpowers/handoffs/2026-05-26-v0.20-image-pdf-normalize-handoff.md
A new Claude session can start a sub-item after reading this file alone. All context from this session (sub-item 1/2/3 results + dogfooding + user memory + OMC workflow) is self-contained here.
When starting the new session, run:
cat /home/altair823/kebab/docs/superpowers/handoffs/2026-05-26-v0.20-image-pdf-normalize-handoff.md
Then confirm the priority of the first sub-item with the user.