review(p6-2): round 3 cosmetic, build() regression tests + lib doc trust note

- src/ocr.rs:
  • Add `#[derive(Debug)]` to `OllamaVisionOcr` (to satisfy the
    `expect_err` bound in the tests; reqwest::blocking::Client also
    implements Debug).
  • Three new unit tests (`build_rejects_empty_endpoint`,
    `build_rejects_empty_model_after_trim`,
    `build_clamps_max_pixels_outside_legal_range`) as a regression
    signal for the `fn build` guards added in round 2.
- src/lib.rs:
  • Add a brief OCR trust-policy note to the module-level doc comment
    ("LLM-driven default can hallucinate; OcrText.engine carries
    source identity"), so lib users can catch the intent without
    opening the ocr module docs.

cargo test -p kebab-parse-image — 31 pass + 1 ignored
  (11 unit + 12 P6-1 integration + 8 P6-2 integration).
cargo clippy -p kebab-parse-image --all-targets -- -D warnings — pass.
2026-05-02 05:51:00 +00:00
parent 2bede0030f
commit 1539367692
2 changed files with 45 additions and 1 deletions

src/lib.rs

@@ -8,7 +8,10 @@
 //! P6-2 adds the [`ocr`] module: an [`OcrEngine`] trait and an
 //! [`OllamaVisionOcr`] default adapter that talks to a vision-capable
 //! Ollama model. [`apply_ocr`] is the helper that mutates an
-//! [`ImageRefBlock`] in place.
+//! [`ImageRefBlock`] in place. Trust note — the LLM-driven default
+//! can hallucinate; `OcrText.engine` carries the source identity so
+//! consumers can branch trust by engine (Tesseract / Apple Vision
+//! adapters, when added, will write a different `engine` string).
 //!
 //! Per design §3.4 (Block::ImageRef + ImageRefBlock), §3.7a (OcrText /
 //! ModelCaption stubs), §9.1 (image extraction policy / OCR vs caption
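
Review note: the trust note in this hunk says consumers can branch trust on `OcrText.engine`. A minimal sketch of what that branching might look like follows. Everything in it is an assumption for illustration: the simplified `OcrText` (only `text` and `engine`), the `Trust` enum, the `trust_for` helper, and the engine strings `"ollama-vision"`, `"tesseract"` and `"apple-vision"`. The diff shows neither the real struct nor the strings the adapters write.

```rust
/// Simplified stand-in for the crate's `OcrText` (assumption: only the
/// two fields used here; the real type is not shown in this diff).
#[derive(Debug, Clone)]
struct OcrText {
    text: String,
    engine: String,
}

/// Hypothetical trust tiers a consumer might define for itself.
#[derive(Debug, PartialEq, Eq)]
enum Trust {
    /// Classic OCR engines: treat the text as a faithful transcription.
    Verbatim,
    /// LLM-driven OCR: the text may contain hallucinated content.
    NeedsReview,
}

/// Assumed engine strings. Anything not explicitly recognised is treated
/// conservatively, so a new LLM-backed adapter is never trusted by accident.
fn trust_for(ocr: &OcrText) -> Trust {
    match ocr.engine.as_str() {
        "tesseract" | "apple-vision" => Trust::Verbatim,
        _ => Trust::NeedsReview,
    }
}

fn main() {
    let llm = OcrText {
        text: "Invoice 42".into(),
        engine: "ollama-vision".into(),
    };
    let classic = OcrText {
        text: "Invoice 42".into(),
        engine: "tesseract".into(),
    };
    println!("{} via {} -> {:?}", llm.text, llm.engine, trust_for(&llm));
    println!("{} via {} -> {:?}", classic.text, classic.engine, trust_for(&classic));
}
```

Defaulting unknown engines to the cautious tier is a design choice of this sketch, not something the commit prescribes.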

src/ocr.rs

@@ -116,6 +116,7 @@ pub fn apply_ocr(
 /// Ollama-vision OCR adapter — POSTs the image (base64) to
 /// `<endpoint>/api/generate` with a transcription prompt and reads the
 /// non-streaming response.
+#[derive(Debug)]
 pub struct OllamaVisionOcr {
     client: reqwest::blocking::Client,
     endpoint: String,
@@ -458,4 +459,44 @@ mod tests {
         let p = engine.build_prompt(Some(&Lang("und".into())));
         assert!(!p.contains("hint:"));
     }
+
+    /// `from_parts` (and by extension `new`) must reject an empty
+    /// endpoint string. Pinned so the bail message stays grep-able and
+    /// the constructor cannot drift to "silently accept a bad config".
+    #[test]
+    fn build_rejects_empty_endpoint() {
+        let r = OllamaVisionOcr::from_parts("", "m", vec![], 1024);
+        let err = r.expect_err("empty endpoint must bail").to_string();
+        assert!(
+            err.contains("endpoint is empty"),
+            "bail message missing 'endpoint is empty': {err}"
+        );
+    }
+
+    /// Whitespace-only model id trims to empty and must be rejected —
+    /// both `new` and `from_parts` route through the shared `build`,
+    /// so testing `from_parts` covers both.
+    #[test]
+    fn build_rejects_empty_model_after_trim() {
+        let r = OllamaVisionOcr::from_parts("http://x", " ", vec![], 1024);
+        let err = r.expect_err("empty model must bail").to_string();
+        assert!(
+            err.contains("model is empty"),
+            "bail message missing 'model is empty': {err}"
+        );
+    }
+
+    /// Out-of-range `max_pixels` is silently clamped (not rejected) so
+    /// a bad config can't kill ingest. The accessor exposes the clamped
+    /// value so tests can verify the bound; the warning side-effect is
+    /// tested implicitly (no panic, no error).
+    #[test]
+    fn build_clamps_max_pixels_outside_legal_range() {
+        let too_small =
+            OllamaVisionOcr::from_parts("http://x", "m", vec![], 1).unwrap();
+        assert_eq!(too_small.max_pixels(), MIN_LONG_EDGE);
+        let too_big =
+            OllamaVisionOcr::from_parts("http://x", "m", vec![], u32::MAX).unwrap();
+        assert_eq!(too_big.max_pixels(), MAX_LONG_EDGE);
+    }
 }
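
Review note: the three tests above pin the guards in the shared `build`, whose body is not part of this diff. The sketch below shows one way guards with that observable behaviour could be written, using only the standard library. It is not the actual implementation: `OcrConfig`, the `String` error type, the numeric values of `MIN_LONG_EDGE` / `MAX_LONG_EDGE`, and the `eprintln!` warning are all assumptions. Only the two error messages and the clamp-instead-of-reject behaviour are taken from the tests.

```rust
// Assumed bounds; the real constants' values are not shown in this diff.
const MIN_LONG_EDGE: u32 = 64;
const MAX_LONG_EDGE: u32 = 4096;

/// Hypothetical stand-in for the validated part of `OllamaVisionOcr`.
/// `Debug` is derived because `Result::unwrap_err` / `expect_err`
/// require the `Ok` type to implement `Debug`.
#[derive(Debug)]
struct OcrConfig {
    endpoint: String,
    model: String,
    max_pixels: u32,
}

/// Rejects an empty endpoint and a model id that is empty after trimming,
/// and clamps (rather than rejects) an out-of-range `max_pixels`.
fn build(endpoint: &str, model: &str, max_pixels: u32) -> Result<OcrConfig, String> {
    if endpoint.is_empty() {
        return Err("endpoint is empty".to_string());
    }
    let model = model.trim();
    if model.is_empty() {
        return Err("model is empty".to_string());
    }
    let clamped = max_pixels.clamp(MIN_LONG_EDGE, MAX_LONG_EDGE);
    if clamped != max_pixels {
        // Warn but keep going, so a bad config value cannot abort ingest.
        eprintln!("warning: max_pixels {max_pixels} out of range, using {clamped}");
    }
    Ok(OcrConfig {
        endpoint: endpoint.to_string(),
        model: model.to_string(),
        max_pixels: clamped,
    })
}

fn main() {
    match build("http://x", " ", 1024) {
        Ok(cfg) => println!("built config for {} at {}", cfg.model, cfg.endpoint),
        Err(e) => println!("rejected: {e}"),
    }
    let cfg = build("http://x", "m", 1).expect("valid endpoint and model");
    println!("{} @ {} uses max_pixels = {}", cfg.model, cfg.endpoint, cfg.max_pixels);
}
```

Clamping keeps the constructor infallible for numeric misconfiguration while the string checks still fail fast, which matches the split the tests describe.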