review(p6-2): round 3 cosmetic - build() regression tests + lib doc trust note
- src/ocr.rs:
  • Add `#[derive(Debug)]` to `OllamaVisionOcr` (needed to satisfy the bound
    on `expect_err` in the tests; reqwest::blocking::Client also implements
    Debug).
  • Add three unit tests (`build_rejects_empty_endpoint`,
    `build_rejects_empty_model_after_trim`,
    `build_clamps_max_pixels_outside_legal_range`) as a regression signal
    for the `fn build` guards added in round 2.
- src/lib.rs:
  • Add a one-line OCR trust policy to the module-level doc comment
    ("LLM-driven default can hallucinate; OcrText.engine carries
    source identity"), so lib users can catch the intent without having to
    read the ocr module docs.
cargo test -p kebab-parse-image: 31 pass + 1 ignored
(11 unit + 12 P6-1 integration + 8 P6-2 integration).
cargo clippy -p kebab-parse-image --all-targets -- -D warnings: pass.
@@ -8,7 +8,10 @@
 //! P6-2 adds the [`ocr`] module: an [`OcrEngine`] trait and an
 //! [`OllamaVisionOcr`] default adapter that talks to a vision-capable
 //! Ollama model. [`apply_ocr`] is the helper that mutates an
-//! [`ImageRefBlock`] in place.
+//! [`ImageRefBlock`] in place. Trust note — the LLM-driven default
+//! can hallucinate; `OcrText.engine` carries the source identity so
+//! consumers can branch trust by engine (Tesseract / Apple Vision
+//! adapters, when added, will write a different `engine` string).
 //!
 //! Per design §3.4 (Block::ImageRef + ImageRefBlock), §3.7a (OcrText /
 //! ModelCaption stubs), §9.1 (image extraction policy / OCR vs caption
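The trust note above says consumers can branch on `OcrText.engine`. A minimal sketch of what such a branch might look like is given below. Everything except the `engine` field name is an assumption made for illustration: the `OcrText` struct here is a stand-in rather than the crate's real type, the `Trust` enum and `trust_for` function are hypothetical, and the engine-name prefixes are placeholders, since the diff does not show which strings the adapters actually write.

```rust
/// Hypothetical stand-in for the crate's `OcrText`. Only the `engine`
/// field name comes from the doc comment; the rest is assumed.
#[derive(Debug, Clone)]
pub struct OcrText {
    pub text: String,
    pub engine: String,
}

/// Hypothetical trust classification a consumer might define.
#[derive(Debug, PartialEq, Eq)]
pub enum Trust {
    /// Output of a conventional (non-LLM) OCR engine.
    Conventional,
    /// Output of an LLM-driven engine; may contain hallucinated text.
    NeedsReview,
}

/// Classify OCR output by its `engine` string.
/// The prefixes matched here are illustrative placeholders.
pub fn trust_for(ocr: &OcrText) -> Trust {
    let engine = ocr.engine.to_ascii_lowercase();
    if engine.starts_with("tesseract") || engine.starts_with("apple-vision") {
        Trust::Conventional
    } else {
        // Unknown engines fall into the cautious branch by default.
        Trust::NeedsReview
    }
}
```

Defaulting unknown engines to the cautious branch means a newly added adapter is treated as unverified until a consumer opts it in explicitly.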
@@ -116,6 +116,7 @@ pub fn apply_ocr(
 /// Ollama-vision OCR adapter — POSTs the image (base64) to
 /// `<endpoint>/api/generate` with a transcription prompt and reads the
 /// non-streaming response.
+#[derive(Debug)]
 pub struct OllamaVisionOcr {
     client: reqwest::blocking::Client,
     endpoint: String,
@@ -458,4 +459,44 @@ mod tests {
         let p = engine.build_prompt(Some(&Lang("und".into())));
         assert!(!p.contains("hint:"));
     }
+
+    /// `from_parts` (and by extension `new`) must reject an empty
+    /// endpoint string. Pinned so the bail message stays grep-able and
+    /// the constructor cannot drift to "silently accept a bad config".
+    #[test]
+    fn build_rejects_empty_endpoint() {
+        let r = OllamaVisionOcr::from_parts("", "m", vec![], 1024);
+        let err = r.expect_err("empty endpoint must bail").to_string();
+        assert!(
+            err.contains("endpoint is empty"),
+            "bail message missing 'endpoint is empty': {err}"
+        );
+    }
+
+    /// Whitespace-only model id trims to empty and must be rejected —
+    /// both `new` and `from_parts` route through the shared `build`,
+    /// so testing `from_parts` covers both.
+    #[test]
+    fn build_rejects_empty_model_after_trim() {
+        let r = OllamaVisionOcr::from_parts("http://x", " ", vec![], 1024);
+        let err = r.expect_err("empty model must bail").to_string();
+        assert!(
+            err.contains("model is empty"),
+            "bail message missing 'model is empty': {err}"
+        );
+    }
+
+    /// Out-of-range `max_pixels` is silently clamped (not rejected) so
+    /// a bad config can't kill ingest. The accessor exposes the clamped
+    /// value so tests can verify the bound; the warning side-effect is
+    /// tested implicitly (no panic, no error).
+    #[test]
+    fn build_clamps_max_pixels_outside_legal_range() {
+        let too_small =
+            OllamaVisionOcr::from_parts("http://x", "m", vec![], 1).unwrap();
+        assert_eq!(too_small.max_pixels(), MIN_LONG_EDGE);
+        let too_big =
+            OllamaVisionOcr::from_parts("http://x", "m", vec![], u32::MAX).unwrap();
+        assert_eq!(too_big.max_pixels(), MAX_LONG_EDGE);
+    }
 }
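The body of `build` is not part of this diff, so the guard these tests pin can only be sketched. The version below is a simplified, standalone approximation under stated assumptions: `OcrConfig` is a hypothetical stand-in for `OllamaVisionOcr` (no HTTP client, no extra argument list), the error type is a plain `String` instead of whatever the crate bails with, and the `MIN_LONG_EDGE` / `MAX_LONG_EDGE` values are placeholders. Only the two error substrings and the clamp-instead-of-reject behavior are taken from the tests.

```rust
// Placeholder bounds: the real MIN_LONG_EDGE / MAX_LONG_EDGE values are
// not shown in the diff.
pub const MIN_LONG_EDGE: u32 = 256;
pub const MAX_LONG_EDGE: u32 = 4096;

/// Hypothetical, stripped-down stand-in for `OllamaVisionOcr`.
#[derive(Debug)]
pub struct OcrConfig {
    endpoint: String,
    model: String,
    max_pixels: u32,
}

impl OcrConfig {
    /// Validate the inputs and clamp `max_pixels` into the legal range.
    pub fn build(endpoint: &str, model: &str, max_pixels: u32) -> Result<Self, String> {
        if endpoint.is_empty() {
            return Err("endpoint is empty".to_string());
        }
        // A whitespace-only model id trims to empty and is rejected.
        let model = model.trim();
        if model.is_empty() {
            return Err("model is empty".to_string());
        }
        // Out-of-range values are clamped, not rejected, so a bad
        // config cannot abort ingest. A warning is the only side effect.
        let clamped = max_pixels.clamp(MIN_LONG_EDGE, MAX_LONG_EDGE);
        if clamped != max_pixels {
            eprintln!("max_pixels {max_pixels} out of range, clamped to {clamped}");
        }
        Ok(Self {
            endpoint: endpoint.to_string(),
            model: model.to_string(),
            max_pixels: clamped,
        })
    }

    pub fn endpoint(&self) -> &str {
        &self.endpoint
    }

    pub fn model(&self) -> &str {
        &self.model
    }

    /// Expose the clamped value so tests can verify the bound.
    pub fn max_pixels(&self) -> u32 {
        self.max_pixels
    }
}
```

The `#[derive(Debug)]` on the struct matters for the same reason as in the commit: `Result::expect_err` and `Result::unwrap_err` require the `Ok` type to implement `Debug`.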