refactor(app): extract dispatch polymorphism — App.extract_for(...) + 11 Extractor registry #187

Merged
altair823 merged 1 commits from refactor/extractor-dispatch-unification into main 2026-05-26 21:07:22 +00:00
Owner

Summary

Unifies the 11 hardcoded *Extractor::new().extract(...) callsites in kebab-app, plus the 9-AST-arm match code_lang branch, into a single polymorphic call, App::extract_for(&MediaType, &ExtractContext, &[u8]). This resolves the dead polymorphism of the Extractor trait (the trait existed, but vtable dispatch was never used).

Scope of this PR = AST 9-arm + image + pdf extract callsites only. A new MarkdownExtractor, the Tier 2/3 free functions, the outer 4-arm match &asset.media_type, the inner 4-arm match (parser_version / chunker_version / tier3_fallback_cv / chunk dispatch), and Chunker dispatch unification are all deferred to separate future PRs (spec §11).

Design: docs/superpowers/specs/2026-05-26-extractor-dispatch-unification-spec.md (approved after 2 rounds: round 1 opus thorough + round 2 sonnet closure verify)
Plan: docs/superpowers/plans/2026-05-26-extractor-dispatch-unification-plan.md (approved after 3 rounds: round 1 opus + rounds 2/3 sonnet)

Key changes

  • New App.extractors: Vec<Box<dyn Extractor + Send + Sync>> field plus a pub(crate) fn extract_for(...) helper method. The registry initialized in App::open_with_config holds 11 Extractors (image + pdf + 9 AST: rust / python / typescript / javascript / go / java / kotlin / c / cpp).
  • ImagePipeline.extractor field removed (Option c, atomic block 1): the extractor: &'a ImageExtractor field, the local image_extractor at lib.rs:356, and the alias at lib.rs:1235 are all removed. ImagePipeline now carries only ocr_engine + caption_llm.
  • 3 callsite migrations:
    • lib.rs:1289 image extract: image_extractor.extract(...) → app.extract_for(&asset.media_type, &ctx, &bytes)?
    • lib.rs:1777 pdf extract: PdfTextExtractor::new().extract(...) → app.extract_for(...)
    • lib.rs:2012-2047 9 AST arm hoist: 9 *Extractor::new().extract(...) callsites → 1 callsite (12 arms, explicit + wildcard, → 4 arms: 9 AST grouped + 7 manifest grouped + 1 shell + 1 other-bail)
  • In-crate unit tests (3 classes) in the #[cfg(test)] mod tests_extractor_dispatch block of crates/kebab-app/src/app.rs:
    • registry length = 11
    • mutually exclusive supports() grid over 16 samples
    • extract_for "no matching extractor" error path (Audio(Wav) MediaType)
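
The registry-plus-helper shape described above can be sketched roughly as follows. This is a minimal illustration, not the code from app.rs: the MediaType variants, ExtractContext, ExtractError, the String return type, and the two toy extractors are all hypothetical stand-ins, and only the field type Vec<Box<dyn Extractor + Send + Sync>> and the extract_for argument order are taken from this PR.

```rust
use std::fmt;

// Hypothetical stand-ins: the real MediaType, ExtractContext and Extractor
// live in kebab-core and are not shown in this PR.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MediaType {
    Image,
    Pdf,
    RustSource,
    Audio,
}

pub struct ExtractContext;

#[derive(Debug)]
pub struct ExtractError(pub String);

impl fmt::Display for ExtractError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(&self.0)
    }
}

impl std::error::Error for ExtractError {}

// Assumed trait shape: a capability check plus the extraction call.
pub trait Extractor {
    fn supports(&self, media_type: &MediaType) -> bool;
    fn extract(&self, ctx: &ExtractContext, bytes: &[u8]) -> Result<String, ExtractError>;
}

pub struct ImageExtractor;

impl Extractor for ImageExtractor {
    fn supports(&self, media_type: &MediaType) -> bool {
        matches!(media_type, MediaType::Image)
    }
    fn extract(&self, _ctx: &ExtractContext, bytes: &[u8]) -> Result<String, ExtractError> {
        Ok(format!("image: {} bytes", bytes.len()))
    }
}

pub struct RustAstExtractor;

impl Extractor for RustAstExtractor {
    fn supports(&self, media_type: &MediaType) -> bool {
        matches!(media_type, MediaType::RustSource)
    }
    fn extract(&self, _ctx: &ExtractContext, bytes: &[u8]) -> Result<String, ExtractError> {
        Ok(format!("rust: {} bytes", bytes.len()))
    }
}

pub struct App {
    // Same field type as in this PR; the real registry holds 11 entries.
    extractors: Vec<Box<dyn Extractor + Send + Sync>>,
}

impl App {
    pub fn open() -> Self {
        Self {
            extractors: vec![Box::new(ImageExtractor), Box::new(RustAstExtractor)],
        }
    }

    // Pick the first extractor that claims the media type and dispatch
    // through the trait object; report an error when none matches.
    pub fn extract_for(
        &self,
        media_type: &MediaType,
        ctx: &ExtractContext,
        bytes: &[u8],
    ) -> Result<String, ExtractError> {
        let extractor = self
            .extractors
            .iter()
            .find(|e| e.supports(media_type))
            .ok_or_else(|| ExtractError(format!("no matching extractor for {media_type:?}")))?;
        extractor.extract(ctx, bytes)
    }
}
```

With this shape a callsite only needs the media type, e.g. App::open().extract_for(&MediaType::Image, &ExtractContext, b"abc"), and an unregistered type such as MediaType::Audio falls through to the error path.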

OMC review history

Phase A (spec, 2 rounds):

  • planner (spec drafter, analyst-style 12-step investigation): 611-line initial draft
  • critic round 1 (opus thorough): 22 findings (2 CRITICAL + 7 MAJOR + 4 MINOR + 2 NIT + 4 missing + 2 ambiguity + 3 OQ). Topics: inaccurate BodyHints 4 fields / overstated byte-identical claim / Pattern β warning channel handover risk → Option (ii), defer MarkdownExtractor (the key decision) / parser_version dual-source / ImagePipeline.extractor lifecycle / code_lang count corrected to 17 / arm count corrected / kebab-parse-md re-export
  • critic round 2 (sonnet closure verify): APPROVE (21 fully closed + 1 partial, MAJOR #6 arm count, to be handled in the plan phase)

Phase B (plan, 3 rounds):

  • planner (11-step decomposition): 587 → 920 → 935 lines (round 1 → 2 → 3)
  • critic-plan + verifier-plan round 1 (opus thorough): 1 CRITICAL + 5 MAJOR + 6 MINOR + 1 NIT + 4 missing + 1 ambiguity (critic) + 0 BLOCKER + 3 MAJOR + 6 MINOR (verifier). Topics: Step 6 unit test cannot access pub(crate) items (it was placed as an integration test) / workspace_root lifetime in test code / missing test_fixtures / retroactive fix to Step 4 / missing wire baseline cmd / missing destructure in Step 3 / error.v1 wire scope
  • critic-plan + verifier-plan round 2 (sonnet closure verify): REQUEST_CHANGES. 3 false-negative findings identified (planner v2 → v3 race condition; the reviewers misread against the v2 baseline), plus spec §3.7 arm count needing a 13 → 12 correction
  • critic-plan + verifier-plan round 3 (sonnet micro verify): ACCEPT (verifier) + REQUEST_CHANGES (critic, 1 partial; direct verification showed spec §3.7 had already been corrected to 12, so this finding was also a false negative)
  • Final: 21 findings closed + 3 false-negative findings (v2 → v3 race condition) + spec §3.7 corrected

Implementation (executor):

  • 11 steps + 1 clean commit (2c05dbd)
  • 2 files changed (app.rs +186/-2, lib.rs +16/-43)
  • all 11 verification gates PASS
  • workspace: 1316 tests PASS (baseline 1313 + 3 new, delta = +3)
  • release binary clean (0 warnings)

Verification

11 acceptance gates (plan §4 + Step 11 exit gate)

  • parser-* sources unchanged: git diff main..HEAD -- crates/kebab-parse-* = 0 lines (0 changes to trait surface + parser source)
  • App.extractors + extract_for shape: 1 field + 1 method in app.rs, as specified
  • registry 11 entries: Box::new(...Extractor::new(...)) matches × 11 (image + pdf + 9 AST)
  • callsite count:
    • app.extract_for(...) in lib.rs = 3 (image + pdf + AST grouped)
    • remaining 9 AST *Extractor::new().extract = 0
    • remaining image_extractor = 0
    • remaining image_pipeline.extractor = 0
  • code dispatch arm count: 12 arms (11 explicit + 1 wildcard) → 4 arms (9 AST grouped + 7 manifest grouped + 1 shell + 1 other-bail)
  • ImagePipeline.extractor field removed: remaining extractor: &'a ImageExtractor = 0
  • cargo build --workspace -j 4: clean
  • cargo clippy --workspace --all-targets -j 1 -- -D warnings: clean
  • cargo test --workspace --no-fail-fast -j 1: 1316 PASS / 0 failed (baseline 1313 + 3 new in-crate unit tests, delta = +3)
  • cargo tree kebab-parse-* count: 4 (md / pdf / image / code all retained)
  • workspace.members count: 22 (state after PR #186 preserved)
  • cargo build --release -p kebab-cli -j 4: clean
  • wire diff success path: ingest_report.v1 + search_response.v1 byte-identical (diff = 0 after a jq filter strips indexed_at / duration_ms)
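
The "code dispatch arm count" gate is easier to picture with a small sketch of the grouped match. This is an illustration under assumptions, not the code at lib.rs:2012-2047: the CodeLang enum and the Route result are hypothetical, and the manifest group shows only two invented variants (Toml, Json) because the 7 manifest languages are not listed in this PR. The 9 AST languages are the ones named above.

```rust
// Hypothetical language tag; the real code_lang type is not shown in this PR.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum CodeLang {
    Rust,
    Python,
    TypeScript,
    JavaScript,
    Go,
    Java,
    Kotlin,
    C,
    Cpp,
    Toml, // invented manifest example
    Json, // invented manifest example
    Shell,
    Other,
}

// Which extraction path a language takes after the hoist.
#[derive(Debug, PartialEq, Eq)]
enum Route {
    Ast,
    Manifest,
    Shell,
}

fn route(lang: CodeLang) -> Result<Route, String> {
    match lang {
        // Arm 1: all 9 AST languages share one arm, and therefore one
        // extract_for callsite instead of nine per-language callsites.
        CodeLang::Rust
        | CodeLang::Python
        | CodeLang::TypeScript
        | CodeLang::JavaScript
        | CodeLang::Go
        | CodeLang::Java
        | CodeLang::Kotlin
        | CodeLang::C
        | CodeLang::Cpp => Ok(Route::Ast),
        // Arm 2: manifest formats grouped together.
        CodeLang::Toml | CodeLang::Json => Ok(Route::Manifest),
        // Arm 3: shell keeps its own arm.
        CodeLang::Shell => Ok(Route::Shell),
        // Arm 4: everything else bails out with an error.
        other => Err(format!("unsupported code_lang: {other:?}")),
    }
}
```

Grouping with or-patterns keeps the arm count at 4 regardless of how many AST languages are added, since the per-language choice moves into the registry's supports() check.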

Wire schema changes: 0 (success path)

  • ingest_report.v1 / search_response.v1 / answer.v1 remain byte-identical
  • The wording change to the internal context string in error.v1.message is accepted as a risk under spec §5.5 (error.v1.code + error.v1.schema_version are preserved; the message text is outside the defined user-visible surface)
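
Why the message wording can change while the code stays fixed can be sketched with a plain std::error::Error chain. This is a std-only approximation under assumptions: the project uses an anyhow chain, and RootCause, WithContext, error_code and the code strings below are hypothetical names that do not come from the kebab sources. The idea is that the code is derived from the innermost error, so the outer context text is free to vary.

```rust
use std::error::Error;
use std::fmt;

// Hypothetical root-cause type; the real error variants are not shown here.
#[derive(Debug)]
enum RootCause {
    UnsupportedMedia,
    Io,
}

impl fmt::Display for RootCause {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            RootCause::UnsupportedMedia => f.write_str("unsupported media type"),
            RootCause::Io => f.write_str("i/o failure"),
        }
    }
}

impl Error for RootCause {}

// An outer layer that only adds a human-readable context string.
#[derive(Debug)]
struct WithContext {
    context: String,
    source: RootCause,
}

impl fmt::Display for WithContext {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}: {}", self.context, self.source)
    }
}

impl Error for WithContext {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(&self.source)
    }
}

// Derive a stable code from the innermost error in the chain, ignoring
// whatever context strings were layered on top of it.
fn error_code(err: &(dyn Error + 'static)) -> &'static str {
    let mut current = err;
    while let Some(next) = current.source() {
        current = next;
    }
    match current.downcast_ref::<RootCause>() {
        Some(RootCause::UnsupportedMedia) => "unsupported_media",
        Some(RootCause::Io) => "io",
        None => "internal",
    }
}
```

Two errors that wrap the same root cause with different context strings render different messages but map to the same code, which is the property the risk acceptance relies on.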

v2 → v3 race condition (post-mortem)

Three findings from the plan round 2 critic + verifier (CRITICAL #1 test location / MAJOR #5 lib.rs:1235 / RawAsset field) were false negatives: the reviewers were spawned while the planner was revising v2 (754 lines) → v3 (920 lines), and they misread against the v2 baseline. A round 3 grep cross-check confirmed that the actual v3 content had already closed them.

Lesson from this race condition: in future spec/plan revisions, state the latest line count + version snapshot explicitly before spawning reviewers.

Wire / no changes

  • wire schema (success path): 0 changes (ingest_report.v1 / search_response.v1 / answer.v1 byte-identical)
  • wire schema (error path): error.v1.message internal context wording changed (spec §5.5 risk acceptance, error.v1.code preserved)
  • CLI / TUI / MCP user-facing surface: 0 changes
  • Cargo workspace.package.version: no bump needed (refactor; this PR changes nothing here). main HEAD is already at 0.19.0 (PR #186 cascade)
  • trait source (kebab-core/src/traits.rs): 0 changes
  • parser source (kebab-parse-md / -pdf / -image / -code): 0 changes
  • design contract (docs/superpowers/specs/2026-04-27-kebab-final-form-design.md): 0 changes (contract_sections: [])
  • HOTFIXES.md / HANDOFF.md / README.md / ARCHITECTURE.md / tasks/INDEX.md: 0 changes (no user-visible surface change + no design change)
  • Cargo features: 0 changes
  • parser_version cascade: 0 changes

Test plan

  • cargo test --workspace --no-fail-fast -j 1 → 1316 PASS (baseline 1313 + 3 new)
  • cargo build --release -p kebab-cli -j 4 → green
  • cargo clippy --workspace --all-targets -j 1 -- -D warnings → clean
  • cargo tree -p kebab-app -e normal | grep "kebab-parse-" | wc -l → 4
  • 3 callsite-count grep: app.extract_for ≥ 3 / 9 AST *Extractor::new().extract = 0 / image_extractor = 0
  • ingest happy path smoke (markdown / pdf / image / 1+ code lang): kebab ingest --json output schema preserved

Assisted-by: Claude Code

altair823 added 1 commit 2026-05-26 17:46:09 +00:00
Unify kebab-app's hardcoded extract dispatch (the 11 `::new().extract(…)` callsites of `ImageExtractor` + `PdfTextExtractor` + 9 AST `*Extractor`, plus the 9-AST-arm match) into a single polymorphic call, `App::extract_for(&MediaType, &ExtractContext, &[u8])`. 0 trait changes, 0 parser source changes, 0 wire schema changes (success path).

Key changes:
- App struct gains a `pub(crate) extractors: Vec<Box<dyn Extractor + Send + Sync>>` field + `pub(crate) fn extract_for(...)` helper method.
- Registry init in App::open_with_config = 11 Extractors (image + pdf + 9 AST).
- Remove the `extractor: &'a ImageExtractor` field from the ImagePipeline struct + delete the lib.rs:356 local + the lib.rs:1235 alias (atomic block).
- 9 AST arms (the 12 arms at lib.rs:2012-2047 = 11 explicit + 1 wildcard) → 4 arms (9 AST grouped + 7 manifest + 1 shell + 1 other-bail).
- In-crate unit tests (`mod tests_extractor_dispatch` in app.rs), 3 classes: registry length 11 / mutually exclusive supports() grid (16 sample MediaTypes) / extract_for error path (Audio).
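
The mutual-exclusivity check behind the supports() grid can be sketched as below. This is only an illustration of the idea, assuming a simplified trait: the MediaType variants, the name() method, the Single helper struct, and the 4-entry registry are hypothetical, while the real tests run against the 11-entry registry and 16 sample MediaTypes.

```rust
// Hypothetical, simplified media types for the sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum MediaType {
    Image,
    Pdf,
    RustSource,
    PythonSource,
    Audio,
}

// Assumed trait shape; name() exists only to make failures readable.
trait Extractor {
    fn name(&self) -> &'static str;
    fn supports(&self, media_type: &MediaType) -> bool;
}

// A toy extractor that accepts exactly one media type.
struct Single {
    name: &'static str,
    accepts: MediaType,
}

impl Extractor for Single {
    fn name(&self) -> &'static str {
        self.name
    }
    fn supports(&self, media_type: &MediaType) -> bool {
        *media_type == self.accepts
    }
}

fn registry() -> Vec<Box<dyn Extractor + Send + Sync>> {
    vec![
        Box::new(Single { name: "image", accepts: MediaType::Image }),
        Box::new(Single { name: "pdf", accepts: MediaType::Pdf }),
        Box::new(Single { name: "rust", accepts: MediaType::RustSource }),
        Box::new(Single { name: "python", accepts: MediaType::PythonSource }),
    ]
}

// Names of every extractor that claims `media_type`. The grid test expects
// exactly one claimant for supported types and none for unsupported ones.
fn claimants(
    registry: &[Box<dyn Extractor + Send + Sync>],
    media_type: &MediaType,
) -> Vec<&'static str> {
    registry
        .iter()
        .filter(|e| e.supports(media_type))
        .map(|e| e.name())
        .collect()
}
```

If two extractors ever claimed the same media type, first-match dispatch would silently depend on registry order, which is why a grid like this is worth asserting.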

scope = AST 9-arm + image + pdf extract callsites only. MarkdownExtractor / Tier 2/3 / outer 4-arm / inner 4 match / Chunker dispatch are all deferred (separate PRs, spec §11).

Wire schema (success path): 0 changes. ingest_report.v1 / search_response.v1 / answer.v1 byte-identical (verified by a 4-medium SMOKE comparison). The wording change to the internal context string in error.v1.message (e.g. `kb-parse-image::ImageExtractor::extract` → `kb-app::extract_for (image)`) is accepted as a risk under spec §5.5: `error.v1.code` + `error.v1.schema_version` are preserved, and the message is outside the user-visible surface. Cargo workspace.version bump: 0.

Refs:
- docs/superpowers/specs/2026-05-26-extractor-dispatch-unification-spec.md (2 round APPROVE)
- docs/superpowers/plans/2026-05-26-extractor-dispatch-unification-plan.md (3 round APPROVE)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
claude-reviewer-01 approved these changes 2026-05-26 17:47:03 +00:00
claude-reviewer-01 left a comment
Member

Round 1: review of the Extractor dispatch polymorphism refactor.

Every closure from OMC team extractor-dispatch-unification's 2-round spec APPROVE (planner spec drafter + opus thorough critic round 1 + sonnet closure verify round 2) and 3-round plan ACCEPT (planner → 3 critic-plan + 3 verifier-plan) is accurately reflected in this PR's code. All 11 verification gates are green, the workspace has 1316 tests (baseline 1313 + 3 new in-crate unit tests, delta = +3), and the wire schema success path is byte-identical.

Praise (prose):

  1. Precise alignment of the Pattern β scope: the spec round 1 critic's MAJOR #2 finding (the Pattern β warning channel handover risk, which could break the wire schema diff = 0 guarantee) was closed by adopting Option (ii) (defer creating MarkdownExtractor). The PR scope is aligned exactly to "AST 9-arm + image + pdf extract callsite only": 0 changes to the markdown path, and the warning channel handover is avoided. As a result, ingest_report.v1.IngestItem.warnings stays byte-identical.

  2. Single-owner pattern for the App.extractors registry + extract_for helper: Option A was adopted (a field on App rather than a separate ExtractorRegistry struct). The 11 stateless Extractor entries have zero init cost, and the migration path to a future plugin system (an OnceLock wrapper, or wrapping only the Vec elements) is preserved. pub(crate) visibility conforms to the facade rule and still allows access from in-crate unit tests.

  3. Ordering invariant of atomic block 1: the 4 edits of the image dispatch migration in Steps 3-4-5-6 (delete the lib.rs:1235 alias → delete the lib.rs:356 local → remove the ImagePipeline.extractor field → replace the lib.rs:1289 callsite) are handled as one atomic block. The plan round 1 critic MAJOR #5 + verifier GAP #4 finding (the obligation to explicitly delete the lib.rs:1235 alias was not stated) was reflected in round 2 and is now stated precisely in Step 3 (b). No build-red intermediate state from the executor phase enters git history (single clean commit).

  4. Atomic-unit verification of the 9 AST arm hoist: the 12 arms at lib.rs:2012-2047 (11 explicit + 1 wildcard) → 4 arms (9 AST grouped + 7 manifest grouped + 1 shell + 1 other-bail). The arm count partial closure from plan round 2 (spec §3.7 "13 → 12" correction) is consistent across plan and spec. The Tier1 fail → Tier3 fallback control flow is traced as preserved (adding an outer context to the anyhow chain has no effect on root cause variant matching).

  5. Closure of pub(crate) access for in-crate unit tests: the plan round 1 critic CRITICAL #1 finding (pub(crate) fields are inaccessible from the integration test location tests/extract_for_dispatch.rs) was closed by adopting Option α (in-crate #[cfg(test)] mod tests_extractor_dispatch in app.rs). All 3 test classes PASS: registry length 11 / mutually exclusive supports() grid over 16 samples / extract_for "no matching extractor" error path (Audio(Wav) MediaType).

  6. RawAsset 8 fields byte-identical: the plan round 1 critic MAJOR #2 + verifier Blocker 2 finding (missing test fixture + content_hash/ContentHash misnaming + missing stored field) was closed by matching the 8 fields of the actual crates/kebab-core/src/asset.rs:63-73 (asset_id / source_uri / workspace_path / media_type / byte_len / checksum / discovered_at / stored) byte for byte. The executor, acting on the delegation at plan line 600, aligned this to AssetStorage::Copied { path }.

  7. Wire schema stability verification: the Step 11 wire diff verify shows ingest_report.v1 + search_response.v1 byte-identical (diff = 0 after a jq filter strips indexed_at / duration_ms / scope.workspace_root). The wording change to the internal context string in error.v1.message is accepted as a risk under spec §5.5 (error.v1.code + error.v1.schema_version are preserved; the message text is outside the defined user-visible surface).

  8. Dead polymorphism resolved (partially): vtable dispatch on the Extractor trait is used for the first time. 11 hardcoded *Extractor::new().extract(...) callsites + 9 AST arm match → 1 app.extract_for(...) callsite + 1 grouped AST arm. 0 trait surface changes + 0 parser source changes + the existing 11 Extractor impls are retained. Spec §6.5 states the self-aware partial polymorphism explicitly (outer 4-arm + Tier 2/3 + Chunker are deferred to separate future PRs).

  9. Frozen contract preserved: contract_sections: [] (0 design changes) + git diff main..HEAD --name-only | grep "^tasks/p" = 0 lines (the ~25 referencing task specs stay frozen) + 0 wire schema changes + 0 workspace.version bump. The cleanest pattern for a minimal-surface refactor.

  10. v2 → v3 race condition lesson: the 3 findings from the plan round 2 critic + verifier were false negatives (reviewer spawn race during the planner's v2 → v3 revision). A round 3 grep cross-check confirmed that the actual v3 content had already closed them. Recommendation: state the latest line count + version snapshot explicitly before spawning reviewers in future spec/plan revisions.

No further actionable items. 2 files changed (app.rs +186/-2, lib.rs +16/-43), workspace 1316 tests PASS, release binary clean, wire schema success path stable. Next step = dogfooding (direct verification by the user: byte-identical results for markdown / pdf / image / code ingest) → if dogfooding succeeds, proceed with gitea-release v0.19.0 (combining sub-item 2's normalize-absorption and this sub-item 3's extractor-dispatch into a single release tag).

OK to merge. Recommend dogfooding after the merge, then cutting the release tag.

altair823 merged commit c1e82cca92 into main 2026-05-26 21:07:22 +00:00
altair823 deleted branch refactor/extractor-dispatch-unification 2026-05-26 21:07:23 +00:00
Reference: altair823-org/kebab#187