kebab/docs/superpowers/plans/2026-05-28-v0.20-ingest-log-plan.md


title: v0.20.x ingest log feature - plan
date: 2026-05-28
status: DRAFT (round 0)
phase: B4 (plan drafter)
target_spec: ../specs/2026-05-28-v0.20-ingest-log-spec.md
parent_task: ../../../tasks/p10/p10-1A-5-ingest-failure-log.md
plan_for_version: 0.20.x
target_branch: feat/pdf-scanned-ocr
step_count: 6
commit_count: 5
estimated_loc_delta: +650 / -25

v0.20.x ingest log feature — plan

§0 Overview

This plan decomposes the accepted spec (docs/superpowers/specs/2026-05-28-v0.20-ingest-log-spec.md, 491 lines) into 6 steps and 5 commits. The acceptance criteria AC-1 through AC-10 from spec §5 are mapped to a verifier at each step boundary.

Core deliverables:

  1. kebab-config: [logging] section (2 fields: ingest_log_enabled / ingest_log_dir).
  2. kebab-app/src/ingest_log.rs: new module (IngestLogWriter + LogEvent enum, 5 kinds).
  3. PdfOcrProgress::Finished + IngestEvent::PdfOcrFinished: 4 additive fields (image_byte_size / image_width / image_height / failure_reason) → additive minor cascade on wire schema ingest_progress.v1.
  4. Integration of the 5 emit hooks in kebab-app (init / flush / OCR / parse_error+skip / fatal error).
  5. Integration test ingest_log_smoke.rs (AC-9).
  6. Workspace test + clippy + dogfood smoke (AC-8).

Estimated size: +650 LOC (300 new module, 250 hook + config, 100 test) / -25 LOC (PdfOcrProgress callsite refactor). No branch change, no doc-only commits.

Spec-driven invariant:

  • wire schema = additive minor (4 optional fields added, required unchanged, no regression for existing consumers).
  • backward compat = #[serde(default)] lets a pre-v0.20 config initialize automatically (AC-10).
  • subagent skip = direct in-session execution (worker protocol).

§1 Step table

| # | Step | Files (primary) | Commit (after step) | AC covered |
|---|---|---|---|---|
| 1 | LoggingCfg + Config integration | crates/kebab-config/src/lib.rs, crates/kebab-config/tests/*.rs | feat(config): add [logging] section (ingest_log_enabled + ingest_log_dir) | AC-1, AC-10 |
| 2 | IngestLogWriter module + LogEvent enum | crates/kebab-app/src/ingest_log.rs (new), crates/kebab-app/src/lib.rs (mod declaration) | feat(app): IngestLogWriter + LogEvent enum (per-ingest-run ndjson log) | AC-3 (struct) |
| 3 | PdfOcrProgress::Finished extend + wire cascade | crates/kebab-app/src/pdf_ocr_apply.rs, crates/kebab-app/src/ingest_progress.rs, docs/wire-schema/v1/ingest_progress.schema.json, integrations/claude-code/kebab/SKILL.md | feat(wire): PdfOcrProgress.Finished + ingest_progress.v1 additive 4 fields (image_byte_size/width/height + failure_reason) | AC-3 (ocr fields), AC-5 (failure_reason carry) |
| 4 | 5 emit hook integration | crates/kebab-app/src/lib.rs (Hook 1/3/5), crates/kebab-app/src/pdf_ocr_apply.rs (Hook 2 metric capture), crates/kebab-source-fs/src/connector.rs (Hook 4 skip emit) | feat(app): wire IngestLogWriter into 5 ingest emit hooks (Arc<Mutex> sync) | AC-2, AC-4, AC-5, AC-6, AC-7 |
| 5 | Integration test (ingest_log_smoke) | crates/kebab-app/tests/ingest_log_smoke.rs (new) | test(app): ingest_log_smoke integration test (AC-9) | AC-9 |
| 6 | Final sanity (workspace test + clippy + optional dogfood) | n/a (verifier only) | no commit | AC-8 |

5 commits across 6 steps. Step 6 is verifier-only (no commit) and checks for cumulative regressions.


§2 Per-step detail

§2.1 Step 1 — LoggingCfg + Config integration

Goal: spec §3.1 + §4.4 — LoggingCfg struct + Config field + backward compat.

§2.1.1 Files affected

| Path | Action | Approx LOC | Notes |
|---|---|---|---|
| crates/kebab-config/src/lib.rs | edit | +55 / -0 | Config struct (line 37+), new LoggingCfg struct + Default + default fns |
| crates/kebab-config/tests/integration.rs (or new tests/logging_roundtrip.rs) | edit / new | +35 | 1 TOML roundtrip test (default load + override load + pre-v0.20 backward compat) |

Line ranges in the existing file crates/kebab-config/src/lib.rs:

  • lines 37-81: Config struct (the pdf field currently sits at lines 62-66). Insert the new logging field after pdf, around line 67.
  • The new LoggingCfg struct + default_ingest_log_* fns go around line 416 (after PdfCfg::defaults) or in the cfg-grouping spot near the end of the file. The choice is left to the executor; keep the same visual layout as the existing cfg structs (NliCfg/PdfOcrCfg/PdfCfg).

§2.1.2 Action diff outline

// crates/kebab-config/src/lib.rs (line 37+, Config struct)
pub struct Config {
    // ... existing fields ...
    #[serde(default = "PdfCfg::defaults")]
    pub pdf: PdfCfg,
    /// v0.20.x sub-item: ingest log surface. With `#[serde(default)]`, a
    /// pre-v0.20 config (no `[logging]` section) initializes to the default.
    #[serde(default)]
    pub logging: LoggingCfg,
    #[serde(skip)]
    pub(crate) source_dir: Option<PathBuf>,
}

// New struct (cfg grouping spot)
#[derive(Clone, Debug, Serialize, Deserialize, PartialEq)]
pub struct LoggingCfg {
    /// Auto-write a structured ndjson log during ingest. default = true.
    /// When false, no log file is created (AC-6).
    #[serde(default = "default_ingest_log_enabled")]
    pub ingest_log_enabled: bool,

    /// per-ingest-run log file directory. default = `{state_dir}/logs`.
    /// `{state_dir}` placeholder = XDG state dir (e.g. `~/.local/state/kebab`).
    /// Disk usage of accumulated log files is user-managed (no rotation policy provided; spec §6 R-1).
    #[serde(default = "default_ingest_log_dir")]
    pub ingest_log_dir: PathBuf,
}

fn default_ingest_log_enabled() -> bool { true }
fn default_ingest_log_dir() -> PathBuf {
    PathBuf::from("{state_dir}/logs")
}

impl Default for LoggingCfg {
    fn default() -> Self {
        Self {
            ingest_log_enabled: default_ingest_log_enabled(),
            ingest_log_dir: default_ingest_log_dir(),
        }
    }
}

Tests to add (crates/kebab-config/tests/logging_roundtrip.rs):

// 1. Round-trip the logging section of the default Config (TOML → Config → TOML).
// 2. Verify that the override `[logging]\ningest_log_enabled = false\ningest_log_dir = "/tmp/x"`
//    is reflected exactly on deserialize.
// 3. A pre-v0.20 fixture (a config with no [logging] anywhere) initializes to the default LoggingCfg (AC-10).

§2.1.3 Acceptance

  • cargo test -p kebab-config -j 4: all existing tests pass, plus the 1 new test.
  • cargo build -p kebab-config -j 4 clean.
  • cargo clippy -p kebab-config -- -D warnings 0 warning.

§2.1.4 Commit

feat(config): add [logging] section (ingest_log_enabled + ingest_log_dir)

v0.20.x ingest log surface, config side. New `LoggingCfg` struct:
  * ingest_log_enabled (bool, default true)
  * ingest_log_dir (PathBuf, default "{state_dir}/logs")

With the #[serde(default)] tag, a pre-v0.20 config without a [logging]
section initializes to LoggingCfg::default() automatically (AC-10
backward compat).

The actual expansion of the `{state_dir}` placeholder is handled by the
expand_log_dir helper in step 2 (IngestLogWriter); kebab-config's
expand_path_with_base does not support `{state_dir}` (spec §6 R-3).

§2.2 Step 2 — IngestLogWriter module + LogEvent enum

Goal: spec §4.1. New module crates/kebab-app/src/ingest_log.rs: IngestLogWriter + LogEvent enum + IngestSummary struct + run_id generation + path expansion. Callsite wiring is deferred to step 4; this step delivers only the self-contained writer and its unit tests.

§2.2.1 Files affected

| Path | Action | Approx LOC | Notes |
|---|---|---|---|
| crates/kebab-app/src/ingest_log.rs | new | +280 | writer struct + LogEvent enum + IngestSummary + Drop impl + 5 unit tests |
| crates/kebab-app/src/lib.rs | edit | +3 | mod ingest_log; declaration (around line 63, after pub mod ingest_progress;). pub use ingest_log::{IngestLogWriter, LogEvent, IngestSummary}; |

§2.2.2 Action diff outline

crates/kebab-app/src/ingest_log.rs (new module body):

  • module doc (5 lines): states the writer's role, the run_id format, and the emit ordering.
  • imports: std::fs::File, std::io::{BufWriter,Write}, std::path::{Path,PathBuf}, std::time::SystemTime, serde::{Serialize,Deserialize}, time::OffsetDateTime, time::format_description::well_known::Rfc3339, uuid::Uuid.
  • pub struct IngestLogWriter (file: BufWriter<File>, path: PathBuf, run_id: String, started_at: SystemTime).
    • pub fn open(cfg: &kebab_config::LoggingCfg) -> anyhow::Result<Option<Self>>: returns Ok(None) when cfg.ingest_log_enabled == false; when true, creates log_dir, creates the file, and issues a run_id. An open failure is returned as Err (the caller swallows it and emits tracing::warn).
    • pub fn write_event(&mut self, event: &LogEvent<'_>) -> anyhow::Result<()>: serde_json::to_writer + writeln.
    • pub fn write_summary(&mut self, summary: &IngestSummary) -> anyhow::Result<()>: same pattern.
    • pub fn flush(&mut self) -> anyhow::Result<()>.
    • getters: run_id() / path() / started_at().
  • impl Drop: best-effort self.file.flush() (spec §6 R-4 panic unwind path).
  • fn generate_run_id() -> String: ISO 8601 compact prefix from OffsetDateTime::now_utc().format(time::macros::format_description!("[year][month][day]T[hour][minute][second]Z")), joined by - with the last 8 hex chars of Uuid::now_v7().simple().to_string(). No rand dependency added (spec §6 R-5).
  • fn expand_log_dir(path: &Path) -> PathBuf: string-replaces {state_dir} with kebab_config::Config::xdg_state_dir(). Tilde/env expansion is delegated to kebab_config::expand_path.
  • pub(crate) fn now_ts() -> String: Rfc3339-formatted UTC timestamp. Called by the step 4 hooks.
  • pub enum LogEvent<'a> with #[serde(tag="kind", rename_all="snake_case")]. 4 variants:
    • Ocr { ts, doc_path, page, image_byte_size: Option<u64>, image_width: Option<u32>, image_height: Option<u32>, ms, chars, success, reason: Option<&'a str>, ocr_engine }.
    • ParseError { ts, doc_path, reason, message }.
    • Skip { ts, doc_path, reason, detail: Option<&'a str> }.
    • Error { ts, code, message }.
  • pub struct IngestSummary: owned fields (ts: String, run_id: String, scanned/new/errors/ocr_pages/ocr_failures: u32, ocr_p50_ms/p90_ms/max_ms: Option<u64>, duration_ms: u64). Force kind: "summary" in the wire shape with either #[serde(tag = "kind", rename = "summary")] or a separate literal field kind: &'static str = "summary". Recommended: instead of a separate IngestSummary enum variant, use a tagged struct (serde's #[serde(rename = "summary")] + explicit kind field) so that the summary line of the wire output always starts with {"kind":"summary",…}.
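The placeholder expansion described for expand_log_dir can be sketched with the standard library alone. This is a minimal sketch under two assumptions that are not in the plan: the state directory is passed in as a parameter (the plan resolves it through kebab_config::Config::xdg_state_dir()), and only a leading {state_dir} is handled, which is narrower than a general string-replace.

```rust
use std::path::{Path, PathBuf};

/// Replace a leading `{state_dir}` placeholder with a concrete directory.
/// `state_dir` is an explicit parameter in this sketch (assumption); the plan
/// obtains it from `kebab_config::Config::xdg_state_dir()`.
fn expand_log_dir(path: &Path, state_dir: &Path) -> PathBuf {
    const PLACEHOLDER: &str = "{state_dir}";
    let raw = path.to_string_lossy();
    match raw.strip_prefix(PLACEHOLDER) {
        // `{state_dir}/logs` becomes `<state_dir>/logs`.
        Some(rest) => state_dir.join(rest.trim_start_matches('/')),
        // No placeholder: return the path unchanged. Tilde/env expansion is
        // out of scope here; the plan delegates it to kebab-config.
        None => path.to_path_buf(),
    }
}
```

Passing the state directory in keeps the helper pure, which makes the expand_log_dir_substitutes_state_dir_placeholder unit test independent of the machine's XDG environment.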

unit tests (5 fns, in #[cfg(test)] mod tests of ingest_log.rs):

  1. generate_run_id_has_iso_prefix_and_8_hex_suffix: matches the regex ^\d{8}T\d{6}Z-[0-9a-f]{8}$.
  2. expand_log_dir_substitutes_state_dir_placeholder: "{state_dir}/logs" → xdg_state_dir + "/logs".
  3. writer_disabled_returns_none: LoggingCfg { ingest_log_enabled: false, .. } → IngestLogWriter::open() returns Ok(None).
  4. writer_writes_one_event_per_line_with_kind_discriminator: write_event ×3 to a temp file → 3 lines; each line starts with { and contains the substring "kind":.
  5. drop_flushes_pending_buffer: call write_event, drop without an explicit flush, then verify via read_to_string that line count ≥ 1.
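Tests 4 and 5 above rely on two behaviors: one JSON object per line, and a best-effort flush on drop. The sketch below shows both with a hypothetical NdjsonWriter stand-in. It takes pre-serialized strings instead of calling serde_json::to_writer, so it is an illustration of the pattern rather than the planned IngestLogWriter.

```rust
use std::fs::File;
use std::io::{self, BufWriter, Write};
use std::path::Path;

/// Hypothetical stand-in for the planned `IngestLogWriter`.
struct NdjsonWriter {
    file: BufWriter<File>,
}

impl NdjsonWriter {
    fn create(path: &Path) -> io::Result<Self> {
        Ok(Self { file: BufWriter::new(File::create(path)?) })
    }

    /// Write one pre-serialized JSON object followed by a newline.
    /// The plan serializes with `serde_json::to_writer`; this sketch does not.
    fn write_line(&mut self, json: &str) -> io::Result<()> {
        self.file.write_all(json.as_bytes())?;
        self.file.write_all(b"\n")
    }
}

impl Drop for NdjsonWriter {
    fn drop(&mut self) {
        // Best effort: `drop` cannot return an error, so a failed flush is ignored.
        let _ = self.file.flush();
    }
}
```

Dropping the writer at the end of a scope is enough for the buffered lines to reach the file in the normal and panic-unwind cases; a process abort skips drop entirely, as noted under R-4.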

OQ-2 (p50 / p90 computation): the workspace has no quantiles crate, so a simple sorted Vec is used. The IngestSummary in this step only provides the numeric fields; the actual p50/p90 computation happens in step 4, where the emit hook keeps an ms accumulator (Vec) and, in the final stage, sorts it and indexes by percentile.
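The sort-and-index percentile described in OQ-2 can be sketched as follows. The OcrLatency struct and the summarize_ocr_ms name are hypothetical; the index formula len*q/100 (truncating) and samples.last() for the maximum are taken from the plan.

```rust
/// Latency summary over successful OCR durations, in milliseconds.
/// Struct and function names are illustrative, not from the plan.
#[derive(Debug, PartialEq)]
struct OcrLatency {
    p50_ms: Option<u64>,
    p90_ms: Option<u64>,
    max_ms: Option<u64>,
}

fn summarize_ocr_ms(mut samples: Vec<u64>) -> OcrLatency {
    if samples.is_empty() {
        // No successful OCR page: all three fields stay absent.
        return OcrLatency { p50_ms: None, p90_ms: None, max_ms: None };
    }
    samples.sort_unstable();
    let len = samples.len();
    // Truncating index; len * q / 100 < len holds for any q < 100.
    let at = |q: usize| samples[len * q / 100];
    OcrLatency {
        p50_ms: Some(at(50)),
        p90_ms: Some(at(90)),
        max_ms: samples.last().copied(),
    }
}
```

With ten samples 10, 20, ..., 100 the indices are 5 and 9, so p50 is 60 and p90 is 100. The truncating index leans toward the upper neighbor for small sample counts, which seems acceptable for a diagnostic log.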

OQ-3 (where to document the log cleanup policy): one line in the doc-comment of LoggingCfg::ingest_log_dir (disk usage of accumulated log files is user-managed) is sufficient; no README/SMOKE change. The step 1 commit body also states this in one line.

§2.2.3 Acceptance

  • cargo test -p kebab-app --lib ingest_log -j 4 → 5 passed.
  • cargo build -p kebab-app -j 4 clean.
  • cargo clippy -p kebab-app -- -D warnings 0 warning.

§2.2.4 Commit

feat(app): IngestLogWriter + LogEvent enum (per-ingest-run ndjson log)

v0.20.x ingest log surface, module side. New crates/kebab-app/src/
ingest_log.rs:
  * IngestLogWriter: open/write_event/write_summary/flush + Drop flush
  * LogEvent enum, 4 variants (ocr / parse_error / skip / error)
  * IngestSummary struct (kind="summary" literal + 11 stat fields)
  * generate_run_id (ISO 8601 prefix + last 8 hex of uuid v7)
  * expand_log_dir (hand-rolled expansion of the {state_dir} placeholder)

uuid v7 is a workspace dep (Cargo.toml line 132), which avoids a new
rand dependency (spec §6 R-5).

This step is a self-contained writer + 5 unit tests. Wiring the 5 emit
hooks into the ingest pipeline happens in step 4.

§2.3 Step 3 — PdfOcrProgress::Finished extend + wire cascade

Goal: spec §4.2 HIGH-1 + §3.3 ocr fields. Add 4 additive fields to PdfOcrProgress::Finished and IngestEvent::PdfOcrFinished, with an additive minor cascade on the wire schema.

§2.3.1 Files affected

| Path | Action | Approx LOC | Notes |
|---|---|---|---|
| crates/kebab-app/src/pdf_ocr_apply.rs | edit | +25 / -5 | add 4 fields to the PdfOcrProgress::Finished variant; update measurement + emit at the 3 emit_progress callsites (line 145, 173, 247) |
| crates/kebab-app/src/ingest_progress.rs | edit | +15 / -5 | add 4 fields to IngestEvent::PdfOcrFinished; keep existing tests such as ingest_event_serializes_with_discriminator |
| crates/kebab-app/src/lib.rs | edit | +10 / -3 | carry the 4 fields through the PdfOcrProgress::Finished { … } => IngestEvent::PdfOcrFinished { … } mapping at lines 1865-1882 |
| docs/wire-schema/v1/ingest_progress.schema.json | edit | +12 | add image_byte_size / image_width / image_height / failure_reason properties (all optional, required unchanged) |
| integrations/claude-code/kebab/SKILL.md | edit | +5 | sync the wire schema description (name the added optional fields) |

§2.3.2 Action diff outline

PdfOcrProgress::Finished (pdf_ocr_apply.rs line 283):

Finished {
    page: u32,
    ms: u64,
    chars: u32,
    skipped: bool,
    // NEW (4 field, optional):
    image_byte_size: Option<u64>,
    image_width: Option<u32>,
    image_height: Option<u32>,
    failure_reason: Option<String>,  // "timeout" | "ocr_error" | "network_error" | None
},

Update the 3 emit_progress callsites (pdf_ocr_apply.rs line 145 / 173 / 247):

  • line 145 (success path, OCR completed normally): image_byte_size: Some(<image bytes>), image_width: Some(<w>), image_height: Some(<h>), failure_reason: None. <image bytes/w/h> are measurements of the raster image (reuse the measurement spot from the Bug #11 follow-up, or adjacent variables).
  • line 173 (engine failure → skip): failure_reason: Some("ocr_error".into()) (or "timeout" if it can be classified). Emit image metrics when available, None otherwise.
  • line 247 (validation/threshold skip, OCR not run): failure_reason: None; emit image metrics when possible.

IngestEvent::PdfOcrFinished (ingest_progress.rs lines 96-102):

PdfOcrFinished {
    page: u32,
    ms: u64,
    chars: u32,
    ocr_engine: String,
    skipped: bool,
    // NEW (4 field, optional):
    image_byte_size: Option<u64>,
    image_width: Option<u32>,
    image_height: Option<u32>,
    failure_reason: Option<String>,
},

Mapping at crates/kebab-app/src/lib.rs lines 1865-1882:

crate::pdf_ocr_apply::PdfOcrProgress::Finished {
    page, ms, chars, skipped,
    image_byte_size, image_width, image_height, failure_reason,
} => {
    if let Some(sender) = progress {
        let _ = sender.send(
            crate::ingest_progress::IngestEvent::PdfOcrFinished {
                page, ms, chars,
                ocr_engine: engine.engine_name().to_string(),
                skipped,
                image_byte_size, image_width, image_height,
                failure_reason: failure_reason.clone(),
            },
        );
    }
    // Hook 2 of step 4 additionally writes to the log writer at this spot.
}

docs/wire-schema/v1/ingest_progress.schema.json: add to properties (line 8+):

"image_byte_size": { "type": "integer", "minimum": 0, "description": "pdf_ocr_finished (optional): raster image byte size." },
"image_width":     { "type": "integer", "minimum": 0, "description": "pdf_ocr_finished (optional): raster image width px." },
"image_height":    { "type": "integer", "minimum": 0, "description": "pdf_ocr_finished (optional): raster image height px." },
"failure_reason":  { "type": "string", "enum": ["timeout", "ocr_error", "network_error", "other"], "description": "pdf_ocr_finished (optional): present iff OCR failed." }

Important: the required array is unchanged (currently ["schema_version", "kind", "ts"]). All 4 fields are optional → additive minor = backward compatible.

Update integrations/claude-code/kebab/SKILL.md: add one line naming the 4 additional fields to the pdf_ocr_finished description of the wire schema (after the existing 1 paragraph).

§2.3.3 Acceptance

  • cargo test -p kebab-app pdf_ocr_apply -j 4: all pass.
  • cargo test -p kebab-app ingest_progress -j 4: all pass.
  • cargo test -p kebab-cli wire_search wire_ask -j 4: regression check (existing PdfOcrFinished consumers still deserialize when the 4 additional fields are Option::None).
  • cargo build -p kebab-app -p kebab-cli -j 4: clean.
  • wire schema validation: jq '.properties | keys' docs/wire-schema/v1/ingest_progress.schema.json lists the 4 new keys, and .required is unchanged.

§2.3.4 Commit

feat(wire): PdfOcrProgress.Finished + ingest_progress.v1 additive 4 fields

v0.20.x ingest log feature, wire side. additive minor cascade:

  * 4 fields on PdfOcrProgress::Finished + IngestEvent::PdfOcrFinished:
      - image_byte_size: Option<u64>
      - image_width:     Option<u32>
      - image_height:    Option<u32>
      - failure_reason:  Option<String>
  * docs/wire-schema/v1/ingest_progress.schema.json: 4 added properties
    (all optional, required unchanged = additive minor)
  * integrations/claude-code/kebab/SKILL.md: wire schema description synced

Existing ingest_progress.v1 consumers (CLI wire dump, integration test
fixture, kebab-cli wire_search/wire_ask) stay backward-compatible because
the 4 added fields may be Option::None. No version bump (additive minor
does not trigger the binary-version cascade per CLAUDE.md §Versioning
cascade).

§2.4 Step 4 — 5 emit hook integration (Arc<Mutex>)

Goal: spec §4.2. Call IngestLogWriter at the 5 hook locations. ownership = Option<Arc<Mutex<IngestLogWriter>>> (bound once in ingest_with_config_opts; the 5 hooks clone + lock + write).

§2.4.1 Files affected

| Path | Action | Approx LOC | Hook |
|---|---|---|---|
| crates/kebab-app/src/lib.rs | edit | +110 / -10 | Hook 1 (init + flush), Hook 3 (parse_error path), Hook 5 (fatal error), percentile computation + write_summary in the summary stage |
| crates/kebab-app/src/pdf_ocr_apply.rs | edit | +35 / -3 | Hook 2: image metric capture (existing raster decode spot) + carrying the 4 fields through emit_progress. No signature change: adding the Finished fields alone lets the caller carry the image metrics |
| crates/kebab-source-fs/src/connector.rs | edit | +25 | Hook 4: a callback per skip event in scan_with_skips (or an events: Vec<SkipEvent> accumulator on FsScanSkips); kebab-app enumerates + writes after the scan |

§2.4.2 Hook detail

Hook 1 — ingest_with_config_opts (lib.rs line 281):

Right after function entry, bind let log_writer: Option<Arc<Mutex<IngestLogWriter>>> from IngestLogWriter::open(&config.logging): Ok(Some(w)) → Some(Arc::new(Mutex::new(w))), Ok(None) → None, Err(e) → tracing::warn + None. At function exit (just before the Completed / Aborted paths), compute the summary, then write_summary + flush. The summary's ocr_p50_ms / p90_ms / max_ms come from a success-only OCR duration accumulator Vec<u64>: sort_unstable(), then take the len*50/100 and len*90/100 indices, and samples.last() for max.
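The three-way mapping of the open result can be sketched as below. Writer is an empty stand-in type, the error type is simplified to String instead of anyhow::Error, and eprintln! replaces tracing::warn so the sketch stays std-only; all three are assumptions for illustration.

```rust
use std::sync::{Arc, Mutex};

/// Empty stand-in for the planned `IngestLogWriter` (assumption).
struct Writer;

/// Map the three outcomes of `open` to an optional shared handle.
/// An open failure is reported and swallowed so that ingest continues.
fn share_writer(opened: Result<Option<Writer>, String>) -> Option<Arc<Mutex<Writer>>> {
    match opened {
        Ok(Some(w)) => Some(Arc::new(Mutex::new(w))),
        // Logging is disabled in the config: no writer, no warning.
        Ok(None) => None,
        Err(e) => {
            // The plan calls `tracing::warn`; `eprintln!` is a substitute here.
            eprintln!("ingest log unavailable for this run: {e}");
            None
        }
    }
}
```

Collapsing both the disabled and the failed case to None means every hook only has to handle one "no writer" branch.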

Hook 2 — apply_ocr_to_pdf_pages (pdf_ocr_apply.rs) → caller closure in lib.rs line 1855:

Step 3 already added the 4 fields to PdfOcrProgress::Finished, so this step adds one line to the Finished arm of the closure: capture log_writer.clone(), lock, and write_event(&LogEvent::Ocr { ts: now_ts(), doc_path, page, image_*, ms, chars, success: !skipped && failure_reason.is_none(), reason: failure_reason.as_deref(), ocr_engine: engine.engine_name() }). On the success path, ocr_ms_samples.lock().push(ms).

ownership note (MEDIUM-1): emit_progress is F: FnMut(PdfOcrProgress) (pdf_ocr_apply.rs line 88), so the closure can capture a clone of Arc<Mutex<_>>. The per-asset loop is single-threaded, so there is no deadlock risk.
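The capture pattern from the ownership note can be sketched with an FnMut callback that holds clones of two Arc<Mutex<_>> handles. The Finished struct here is a trimmed-down hypothetical, and the log is a Vec<String> rather than the real writer; only the success rule (!skipped && failure_reason.is_none()) and the success-only ms push come from the plan.

```rust
use std::sync::{Arc, Mutex};

/// Trimmed-down progress event (hypothetical); field names follow the plan.
struct Finished {
    page: u32,
    ms: u64,
    skipped: bool,
    failure_reason: Option<String>,
}

/// Per-page loop that reports through an `FnMut` callback, mirroring the
/// `F: FnMut(PdfOcrProgress)` bound named in the plan.
fn run_pages<F: FnMut(Finished)>(events: Vec<Finished>, mut emit: F) {
    for ev in events {
        emit(ev);
    }
}

/// Returns (log lines, success-only durations) collected by the callback.
fn collect(events: Vec<Finished>) -> (Vec<String>, Vec<u64>) {
    let log = Arc::new(Mutex::new(Vec::<String>::new()));
    let samples = Arc::new(Mutex::new(Vec::<u64>::new()));
    let (log_c, samples_c) = (Arc::clone(&log), Arc::clone(&samples));
    run_pages(events, move |ev| {
        let success = !ev.skipped && ev.failure_reason.is_none();
        // A poisoned lock only loses log lines; it never fails the run.
        if let Ok(mut l) = log_c.lock() {
            l.push(format!("page={} success={}", ev.page, success));
        }
        if success {
            if let Ok(mut s) = samples_c.lock() {
                s.push(ev.ms);
            }
        }
    });
    let lines = log.lock().unwrap().clone();
    let ms = samples.lock().unwrap().clone();
    (lines, ms)
}
```

Each lock is taken and released within a single statement block, so no guard is held across the callback boundary even if the loop later becomes concurrent.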

Hook 3 - parse_error (the parse Err arms of ingest_one_pdf_asset at lib.rs line 1770 and ingest_one_code_asset at line 2002):

For each Err(e) arm of kebab_parse_pdf::extract(...) (or the code parser), add one line: log_writer.lock().write_event(&LogEvent::ParseError { ts: now_ts(), doc_path: asset.path_str(), reason: classify_parse_error(&e), message: &format!("{e}") }). classify_parse_error maps kebab_core::Error::PdfFormat → "lopdf_error", Error::ImageFormat → "image_format", and falls back to "other"; it lives as a helper in pdf_ocr_apply.rs or ingest_log.rs.
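A classifier of this shape might look like the sketch below. ParseFailure is a hypothetical enum standing in for kebab_core::Error, whose real definition is not shown in the plan; only the three reason strings are taken from the text above.

```rust
/// Hypothetical error type standing in for `kebab_core::Error`.
/// Only the variants named in the plan are modeled.
#[derive(Debug)]
enum ParseFailure {
    PdfFormat(String),
    ImageFormat(String),
    Other(String),
}

/// Map an error to the stable `reason` string written to the log.
fn classify_parse_error(e: &ParseFailure) -> &'static str {
    match e {
        ParseFailure::PdfFormat(_) => "lopdf_error",
        ParseFailure::ImageFormat(_) => "image_format",
        // Fallback bucket for everything the log does not distinguish.
        ParseFailure::Other(_) => "other",
    }
}
```

Returning &'static str keeps the reason vocabulary closed, so log consumers can group by reason without parsing free-form messages.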

Hook 4 — skip event (kebab-source-fs/src/connector.rs):

The current FsSourceConnector::scan_with_skips (line 100) only does tracing::debug + a counter increment per skip. Two options: A (add an events: Vec<FsSkipEvent> field to FsScanSkips) vs B (inject Arc<Mutex<IngestLogWriter>> into the connector). A is adopted (B would create a kebab-source-fs → kebab-app cycle).

Add pub events: Vec<FsSkipEvent> to FsScanSkips (around line 207) and define a new struct FsSkipEvent { doc_path: String, reason: &'static str, detail: Option<String> }. At each of the 5 skip arms (line 113 builtin_blacklist / 122 gitignore / 131 kebabignore / 154 generated / 179 size_exceeded), add fs_skips.events.push(FsSkipEvent { ... }). Right after the scan (before entering the asset loop), kebab-app/lib.rs enumerates the events: for ev in &fs_skips.events { log_writer.lock().write_event(&LogEvent::Skip { ts: now_ts(), doc_path: &ev.doc_path, reason: ev.reason, detail: ev.detail.as_deref() }) }.
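Option A can be sketched as an accumulator that the scanner fills and the caller replays later. FsSkipEvent follows the field list above; the reduced FsScanSkips and the check_size helper are hypothetical and model only one of the five skip arms.

```rust
/// One skipped path, recorded by the scanner and replayed into the log later.
#[derive(Debug, Clone, PartialEq)]
struct FsSkipEvent {
    doc_path: String,
    reason: &'static str,
    detail: Option<String>,
}

/// Reduced accumulator (assumption); the real `FsScanSkips` also keeps counters.
#[derive(Debug, Default)]
struct FsScanSkips {
    events: Vec<FsSkipEvent>,
}

/// Hypothetical size gate standing in for the size_exceeded skip arm.
/// Returns true when the file should be ingested.
fn check_size(path: &str, size: u64, max: u64, skips: &mut FsScanSkips) -> bool {
    if size > max {
        skips.events.push(FsSkipEvent {
            doc_path: path.to_string(),
            reason: "size_exceeded",
            detail: Some(format!("{size} > {max} bytes")),
        });
        return false;
    }
    true
}
```

Because the scanner only appends plain data, kebab-source-fs needs no knowledge of the log writer, which is what keeps the dependency direction one-way.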

Hook 5 - fatal error (the error return path of ingest_with_config_opts in lib.rs):

Errors bubble through the ? operator, so there is no explicit catch spot. Recommended placement: wrap the whole body of ingest_with_config_opts in an inner closure (|| -> anyhow::Result<IngestReport> { ... })(), then in the outer scope match result { Err(e) => { log_writer.lock().write_event(&LogEvent::Error { ts: now_ts(), code: "ingest_fatal", message: &format!("{e:#}") }); flush; Err(e) }, Ok(r) => { write_summary + flush; Ok(r) } }. This pattern is mutually exclusive with the existing Completed / Aborted emits of ingest_progress: Aborted is the normal termination of a cancel (not Err), so only the Err arm fires LogEvent::Error. This is consistent with spec §4.2's "no change to error_wire::classify itself": classify is called from kebab-cli wire.rs, while this hook does generic handling inside the facade.
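The inner-closure wrap can be sketched generically. Here the log is a Vec<String>, the error type is String instead of anyhow::Error, and parse_count is a hypothetical fallible step; the point is only that a ? inside the closure returns to the wrapper, which then records exactly one terminal line.

```rust
/// Run `body`, then record exactly one terminal line: an error record on
/// failure, a summary record on success. The result passes through unchanged.
fn run_with_terminal_record<T>(
    log: &mut Vec<String>,
    body: impl FnOnce() -> Result<T, String>,
) -> Result<T, String> {
    // Any `?` inside `body` returns here instead of leaving the outer function.
    let result = body();
    match &result {
        Err(e) => log.push(format!("error code=ingest_fatal message={e}")),
        Ok(_) => log.push("summary".to_string()),
    }
    result
}

/// Hypothetical fallible step used to exercise the `?` path.
fn parse_count(s: &str) -> Result<u32, String> {
    s.parse::<u32>().map_err(|e| format!("bad count {s:?}: {e}"))
}
```

A call such as run_with_terminal_record(&mut log, || { let n = parse_count("3")?; Ok(n + 1) }) yields Ok(4) and appends the summary line; with an unparsable input the same call shape appends the error line and returns the Err untouched.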

§2.4.3 ownership wiring 요약

ingest_with_config_opts:
  log_writer: Option<Arc<Mutex<IngestLogWriter>>>
    ├─ Hook 1: init at entry + write_summary at exit
    ├─ apply_ocr_to_pdf_pages closure:
    │     captures log_writer.clone() → Hook 2 write_event(LogEvent::Ocr)
    │     captures ocr_ms_samples.clone() → success-only ms push
    ├─ parse Err arm of ingest_one_pdf_asset / _code_asset: Hook 3 write_event(LogEvent::ParseError)
    ├─ enumerate fs_skips.events right after scan: Hook 4 write_event(LogEvent::Skip)
    └─ error_wire::classify call spot: Hook 5 write_event(LogEvent::Error)

§2.4.4 Acceptance

  • cargo test --workspace -j 1 --no-fail-fast: all pass (the existing 1358 tests plus any new tests, with no regression).
  • cargo build --workspace -j 4: clean.
  • cargo clippy --workspace --all-targets -j 4 -- -D warnings: 0 warnings.
  • No new module-level test in this step; the integration test comes in step 5.

§2.4.5 Commit

feat(app): wire IngestLogWriter into 5 ingest emit hooks (Arc<Mutex> sync)

v0.20.x ingest log feature, ingest pipeline wiring. 5 emit hooks:

  Hook 1: ingest_with_config_opts entry/exit (writer init + summary write + flush)
  Hook 2: apply_ocr_to_pdf_pages closure (PdfOcrProgress::Finished → LogEvent::Ocr)
  Hook 3: ingest_one_*_asset parse Err arm (LogEvent::ParseError)
  Hook 4: enumerate fs_skips.events right after scan (LogEvent::Skip)
  Hook 5: error_wire::classify call spot (LogEvent::Error)

To carry skip events for Hook 4, add an events: Vec<FsSkipEvent> field
to FsScanSkips in kebab-source-fs (kebab-source-fs never calls back into
kebab-app, which avoids a cycle).

Ownership: one Option<Arc<Mutex<IngestLogWriter>>> binding; the 5 hooks
clone+lock+write. ocr_ms_samples (Vec<u64>, success-only) is shared via
Arc<Mutex>, and the summary stage sorts it and computes p50/p90/max.
The per-asset loop is single-threaded, so there is no deadlock or
contention risk.

A writer failure does not fail the ingest itself (tracing::warn + continue).

§2.5 Step 5 — Integration test ingest_log_smoke

Goal: spec §5 AC-9 — 5-step body integration test.

§2.5.1 Files affected

| Path | Action | Approx LOC | Notes |
|---|---|---|---|
| crates/kebab-app/tests/ingest_log_smoke.rs | new | +160 | 1 fn ingest_log_smoke + 1 supporting fn (minimal corpus generator) |
| crates/kebab-app/Cargo.toml | edit (optional) | +0 / +0 | No change if tempfile is already a dev-dep. (Verify that existing tests under crates/kebab-app/tests/ use it; almost certainly they do.) |

§2.5.2 Action diff outline

crates/kebab-app/tests/ingest_log_smoke.rs (new), 2 #[test]:

fn ingest_log_smoke (AC-9, 6-step body):

  1. TempDir::new() + create workspace tmp/kb + log_dir tmp/logs.
  2. minimal corpus: kb/hello.md (plain text) + kb/scanned.pdf (copy of fixture tests/fixtures/scanned-1page.pdf; the fallback fixture decision is §5 OQ-A).
  3. Config::test_default(&workspace), then cfg.logging = LoggingCfg { ingest_log_enabled: true, ingest_log_dir: log_dir }.
  4. ingest_with_config_opts(cfg, SourceScope::Workspace, false, IngestOpts::default()).expect("ingest").
  5. read_dir(&log_dir): assert exactly 1 ingest-*.ndjson file. Read the body with read_to_string.
  6. Parse each line of body.lines() with serde_json::from_str and assert that the kind field ∈ {"ocr","parse_error","skip","error","summary"} (matches! macro). Assert that the last line has kind == "summary", scanned > 0, and ocr_pages > 0.
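The structural check in step 6 can be approximated without a JSON parser, as in the sketch below. It is a deliberately naive substitute for serde_json::from_str: it assumes the writer emits kind as the first key with no whitespace, and it does not validate the rest of the line. The function names are hypothetical.

```rust
/// Kinds the log may contain, per the plan.
const KINDS: [&str; 5] = ["ocr", "parse_error", "skip", "error", "summary"];

/// Extract the value of a leading `"kind":"..."` pair without a JSON parser.
/// Assumes `kind` is the first key and no whitespace is emitted (assumption).
fn leading_kind(line: &str) -> Option<&str> {
    let rest = line.strip_prefix("{\"kind\":\"")?;
    let end = rest.find('"')?;
    Some(&rest[..end])
}

/// True when every line has a known kind and the last line is the summary.
fn log_is_well_formed(body: &str) -> bool {
    // Collecting into Option fails fast on the first line without a kind.
    let kinds: Option<Vec<&str>> = body.lines().map(leading_kind).collect();
    match kinds {
        Some(k) => k.iter().all(|x| KINDS.contains(x)) && k.last() == Some(&"summary"),
        None => false,
    }
}
```

An empty body is rejected because there is no last line to compare against, which matches the expectation that an enabled run always ends with a summary record.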

fn ingest_log_disabled_emits_no_file (AC-6, 4-step body):

  1. TempDir + workspace + hello.md only.
  2. cfg.logging = LoggingCfg { ingest_log_enabled: false, .. }.
  3. Run ingest_with_config_opts.
  4. Assert 0 ingest-*.ndjson files in log_dir (log_dir itself may have been created, but it contains no file).

imports: tempfile::TempDir, kebab_app::{ingest_with_config_opts, IngestOpts, SourceScope}, kebab_config::{Config, LoggingCfg}, serde_json::Value.

Fixture fallback: if tests/fixtures/scanned-1page.pdf does not exist (likely, since it is outside this PR's scope), use an existing PDF fixture (e.g. tests/fixtures/*.pdf) that is a 1-page raster-only PDF if there is one; otherwise narrow the test scope to a plain text PDF + skipped-OCR case (verify only the summary kind instead of ocr_pages > 0).

→ The executor decides after checking the fixture location. This plan assumes scanned-1page.pdf.

§2.5.3 Acceptance

  • cargo test -p kebab-app --test ingest_log_smoke -j 4 2>&1 | tail -3 → 2 passed; 0 failed (both tests in the file).
  • cargo test -p kebab-app --test ingest_log_smoke ingest_log_disabled_emits_no_file -j 4 → 1 passed; 0 failed.

§2.5.4 Commit

test(app): ingest_log_smoke integration test (AC-9)

crates/kebab-app/tests/ingest_log_smoke.rs (new):

  * ingest_log_smoke (AC-9): tempdir + 1 md + 1 scanned PDF →
    ingest → assert log file exists + each line is valid JSON +
    each kind ∈ {ocr,parse_error,skip,error,summary} + last
    line kind=summary + scanned>0 + ocr_pages>0.

  * ingest_log_disabled_emits_no_file (AC-6): when
    ingest_log_enabled=false, verify that log_dir contains 0
    ingest-*.ndjson files.

fixture: tests/fixtures/scanned-1page.pdf (the executor reuses the
scanned PDF fixture added in an earlier fb-* PR; if it does not
exist, take the fallback path and prepend a separate fixture commit).

§2.6 Step 6 — Final sanity (no commit)

Goal: cumulative workspace test + clippy + (optional) dogfood.

§2.6.1 Verifier

  • full workspace test: CARGO_TARGET_DIR=/build/out/cargo-target/target cargo test --workspace --no-fail-fast -j 1 2>&1 | tail -20 → test result: ok.
  • clippy: cargo clippy --workspace --all-targets -j 4 -- -D warnings 2>&1 | tail -10 → exit 0.
  • format: cargo fmt --all --check → exit 0.
  • (optional) dogfood smoke: target/release/kebab ingest --config /tmp/kebab-smoke/config.toml --json 2>/dev/null | tail -3 → success + ls /tmp/kebab-smoke/logs/ingest-*.ndjson | wc -l ≥ 1.

§2.6.2 Commit

This step produces no commit. If a regression is detected, return to the relevant step among steps 1-5 and fix it → git commit --amend or git commit --fixup (CLAUDE.md §Git Hygiene: "create NEW commits rather than amending", so fixup is recommended).


§3 Verifier checklist (cumulative)

Each AC from spec §5 is mapped to a step and a verifier command. The executor of this plan runs the cumulative verifiers at the end of every step:

| AC | Spec text (summary) | Verifier | Step |
|---|---|---|---|
| AC-1 | [logging] default emit | [logging] block is added automatically on TOML serialization (Config::default() → toml::to_string) | Step 1 |
| AC-2 | ingest-{run_id}.ndjson file created | ls {log_dir}/ingest-*.ndjson ≥ 1 (smoke test) | Step 5 (verified inside the smoke test) |
| AC-3 | each line is valid JSON + kind enum | jq -c 'select(.kind \| IN("ocr","parse_error","skip","error","summary"))' < log.ndjson \| wc -l = line count | Step 5 |
| AC-4 | OCR per-page + summary record | grep -c '"kind":"ocr"' < log.ndjson ≥ 1 + last line kind=summary | Step 5 |
| AC-5 | every failure type is recorded (size_exceeded / parse_error / ocr timeout) | grep, when the smoke test fixture triggers one size_exceeded or ocr-fail | Step 5 (optional fixture extension) |
| AC-6 | ingest_log_enabled = false → 0 files | ingest_log_disabled_emits_no_file test | Step 5 |
| AC-7 | ingest_log_dir override → emit to custom path | covered by the smoke test's tempdir (the file is created at a non-default path) | Step 5 |
| AC-8 | workspace test + clippy | cargo test --workspace -j 1 + cargo clippy --workspace --all-targets -- -D warnings | Step 6 |
| AC-9 | integration test | cargo test -p kebab-app --test ingest_log_smoke -j 4 | Step 5 |
| AC-10 | pre-v0.20 config (no [logging]) loads with defaults | the new step 1 test: fixture toml has no [logging] → after Config::load, logging == LoggingCfg::default() | Step 1 |

Cumulative invariants:

  • after step 1: AC-1, AC-10.
  • after step 2: AC-1, AC-10 (writer struct unit tests only).
  • after step 3: same + wire schema additive change verified (no consumer regression).
  • after step 4: same + no workspace test regression.
  • after step 5: AC-1, AC-2, AC-3, AC-4, AC-6, AC-7, AC-9, AC-10. AC-5 depends on fixture coverage.
  • after step 6: AC-8 + everything cumulative.

§4 Risks resolution

Plan resolutions for spec §6 R-1 through R-5 and OQ-1 through OQ-3:

  • R-1 (log rotation cleanup): one line in the doc-comment of LoggingCfg::ingest_log_dir in step 1 (disk usage of accumulated log files is user-managed). No README/SMOKE/ARCHITECTURE change (the user-facing surface is the config field itself, and a typical user is served by the default).
  • R-2 (concurrent ingest run_id collision): generate_run_id in step 2 = ISO 8601 second-precision prefix + last 8 hex of uuid v7. uuid v7 has ms precision + 74 random bits; even with only 8 hex (32 bits), the collision probability for two IDs within the same ms is below 1e-9. Concurrent ingest is not an intended use case (single-user local-first KB), so this mitigation is sufficient.
  • R-3 ({state_dir} placeholder expand): expand_log_dir in step 2 is a hand-rolled string-replace. The existing kebab_config::expand_path handles only tilde/env and does not support {state_dir}. Follow-up: generalizing expand_path_with_base to also handle {state_dir} is out of scope for this PR (LOW-2 deferred).
  • R-4 (no flush on panic/abort): Drop for IngestLogWriter in step 2 does let _ = self.file.flush(); on panic unwind, BufWriter::drop also attempts a flush (kernel write call). On abort (libc::abort, SIGKILL) drop does not run; this case cannot be mitigated (OS-level limitation).
  • R-5 (avoid a new rand dependency): generate_run_id in step 2 uses the last 8 hex of uuid::Uuid::now_v7().simple().to_string(). uuid v7 is a workspace dep; no rand added.
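The 1e-9 bound in R-2 can be checked with a short birthday-bound calculation. The sketch assumes the last 8 hex digits of the UUID are uniformly random, which is how the plan treats them; the function name is hypothetical.

```rust
/// Probability that at least two of `n` run IDs sharing the same timestamp
/// prefix also share the same 32-bit (8 hex digit) suffix.
/// Assumes the suffix bits are uniformly random (assumption).
fn suffix_collision_probability(n: u32) -> f64 {
    let space = 2f64.powi(32);
    // Birthday bound: 1 - prod_{i=0}^{n-1} (1 - i / 2^32).
    let mut all_distinct = 1.0_f64;
    for i in 0..n {
        all_distinct *= 1.0 - f64::from(i) / space;
    }
    1.0 - all_distinct
}
```

For two IDs the result is 1/2^32, roughly 2.3e-10, which is under the 1e-9 figure quoted in R-2. The probability grows quadratically with the number of concurrent runs, so the bound holds for a pair but not for arbitrarily many simultaneous ingests.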

OQ:

  • OQ-1 (source of image_byte_size + dimensions): the accepted spec adopts carrying them on PdfOcrProgress::Finished (Option A). Step 3 is the wire cascade of this patch.
  • OQ-2 (p50 / p90 computation): the summary stage in step 4 sorts the success-only Vec<u64> and indexes at len*50/100 (truncating). No quantiles crate added.
  • OQ-3 (location of the log cleanup doc): only the doc-comment of LoggingCfg::ingest_log_dir in step 1; no README/SMOKE change. If a user-facing statement becomes necessary, add 1 line to the README Configuration section in a follow-up commit.

Additional OQ (the spec line 22 vs 414 inconsistency from closure r2 LOW-3):

  • OQ-4: "no wire schema change" at spec line 22 and "additive minor (backward compat)" at line 414 are semantically equivalent (additive minor = no wire-schema major bump = the intent of "no change"). The step 3 commit body makes this explicit: additive minor does not trigger the binary-version cascade (matching CLAUDE.md §Versioning cascade: "an additive minor change to the wire (...) is backward-compatible and therefore does not fall under this trigger"). A 1-line fix to the spec body itself can go in a separate prepended commit, or the explicit statement in the step 3 commit body is sufficient (executor's discretion).

§5 Open questions for executor

In-step open questions the executor (Phase C round 0) must decide:

  • OQ-A (fixture availability): check whether crates/kebab-app/tests/fixtures/scanned-1page.pdf exists. If it does not: (a) reuse an existing fixture (e.g. the fb-04 PDF) and change only the fixture path, (b) narrow the test scope to a plain text PDF, or (c) prepend a commit that adds a new fixture. Recommended: (a) or (b). An extra fixture commit would exceed the 5-commit budget.
  • OQ-B (pinning down Hook 5): spec §4.2 Hook 5 says "the error return path of ingest_with_config_opts (per-asset catch + final Err arm)". The actual lib.rs has no explicit match err { ... } pattern; errors bubble through a ? operator chain. The executor locates the error_wire::classify call itself and adds one line just before it. The classify call lives either in crates/kebab-app/src/error_wire.rs or in its caller (kebab-cli wire.rs). This plan assumes classify is called inside the kebab-app facade; if classify is called only from kebab-cli, Hook 5 mismatches the spec's "writer lifecycle" (the writer lives inside kebab-app). In that case, handle it generically in the final Err arm of the kebab-app facade, along the lines of let _ = log_writer.lock().map(|mut w| w.write_event(LogEvent::Error { code: "ingest_fatal", message: &format!("{e}") })). The executor decides after a grep.
  • OQ-C (OCR ms accumulator share pattern): if the closure is FnMut, RefCell<Vec<u64>> is enough; if FnOnce/Fn, use Arc<Mutex<Vec<u64>>>. emit_progress appears to be FnMut (line 88, F: FnMut(PdfOcrProgress)), so RefCell would work, but this plan uses the same pattern as the writer lock (Arc<Mutex<_>>) for consistency.
  • OQ-D (missed skip event case): if FsScanSkips.events is missing a push in any of the 5 skip arms, AC-5 fails. The executor verifies that all 5 skip spots in connector.rs (builtin_blacklist / gitignore / kebabignore / generated / size_exceeded) have the push.

§6 References

  • Spec: docs/superpowers/specs/2026-05-28-v0.20-ingest-log-spec.md (491 line, ACCEPT 7/7 + 1 LOW)
  • Closure critic r2: .omc/reviews/2026-05-28-v0.20-ingest-log-spec-closure-r2-result.md
  • Brief: .omc/reviews/2026-05-28-v0.20-ingest-log-plan-drafter-brief.md
  • Parent task: tasks/p10/p10-1A-5-ingest-failure-log.md
  • Parent design: docs/superpowers/specs/2026-04-27-kebab-final-form-design.md §8 (wire schema), §9 (versioning cascade)
  • Bug #11 follow-up: OCR raster image metric capture (pdf_ocr_apply.rs line 145 vicinity)
  • Existing wire schema: docs/wire-schema/v1/ingest_progress.schema.json (57 line)
  • Existing IngestEvent: crates/kebab-app/src/ingest_progress.rs lines 61-103
  • Existing PdfOcrProgress: crates/kebab-app/src/pdf_ocr_apply.rs lines 276-294
  • Existing fs skip detection: crates/kebab-source-fs/src/connector.rs::scan_with_skips (around line 100, 5 skip arms)
  • xdg_state_dir: crates/kebab-config/src/lib.rs line 1112
  • uuid v7 workspace dep: Cargo.toml line 132 (uuid = { version = "1", features = ["v7", "serde"] })
  • time crate workspace dep: Cargo.toml line 131 (time = { version = "0.3", features = ["serde", "macros", "formatting", "parsing"] })

§7 Constraints (worker protocol + spec)

  1. no branch change: every commit is a direct descendant of feat/pdf-scanned-ocr HEAD (6a9551e). The PR targets main.
  2. subagent skip: the executor does not spawn nested subagents; edits are made directly in-session.
  3. no change to the frozen accepted spec: the 1-line LOW-3 fix to the spec body (reconciling line 22 ↔ 414) goes in a separate spec-edit commit (outside this PR if needed).
  4. wire schema = additive minor: the 4 fields added to ingest_progress.v1 are all optional and the required array is unchanged. No regression for existing consumers (kebab-cli wire_search / wire_ask / Claude Code skill).
  5. no regression: the existing 1358 workspace tests + 8 new tests (config roundtrip 1, ingest_log unit 5, integration 2). The cumulative cargo test --workspace -j 1 passes in full.
  6. commit count = 5: the commit boundary of the spec acceptance scope. Step 6 is verifier-only, no commit.
  7. plan length 500-700 lines: this file targets about 670 lines.
  8. no dogfood impact: if the dogfood smoke fails after this plan's commits are mass-merged, report to the user + revert. dogfood = the isolated TempDir KB pipeline in docs/SMOKE.md.
  9. no binary version bump: wire schema additive minor + no design contract change → the bump trigger of CLAUDE.md §Versioning cascade does not fire (assuming the current version is 0.19.x; the executor checks the workspace Cargo.toml version).
  10. no HANDOFF/ARCHITECTURE change: the only user-surface change (CLI flag, TUI key, user-exposed config field) is 1 config section (logging) → one line in the README Configuration section (the "when a user-visible surface changes" clause of feedback_readme_sync_rule strongly triggers here). Adding the README line in the step 1 or step 4 commit is optional; to stay within the 700-line budget this plan does not spell it out and leaves it to the executor.

§8 Plan-level estimate

  • drafter (current task): 30 min (read brief + spec + grep 8 source spots + write plan).
  • executor (next phase): 4-6 h (step 1 (30 min) + step 2 (90 min) + step 3 (60 min) + step 4 (120 min) + step 5 (90 min) + step 6 (30 min) + commit drafting + dogfood smoke).
  • review (final phase): 30-60 min (5 commit diff scan + AC verifier reproduce + 1 spot check of the dogfood log file).

Total 5-7 h end-to-end; this PR alone is enough to complete the dogfood user flow (Phase B4 → B4-execute → review).