chore(ocr): T11/T12 — clippy clean + docs + v0.27.0 bump

T11: fix 12 clippy lints in paddle_onnx.rs/paddle_e2e.rs (doc overindent,
finish_non_exhaustive, map_or_else, RangeInclusive::contains, cast_lossless,
is_some_and, usize::from). Full-workspace clippy -D warnings = 0.
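
For reference, a minimal sketch of the kind of rewrite each of these lints asks for. Every name below (`Session`, `mismatch_cost`, `dir_from_env`, `is_probability`, `is_confident`, `widen`) is hypothetical and chosen for illustration; none is taken from paddle_onnx.rs or paddle_e2e.rs.

```rust
use std::fmt;
use std::path::PathBuf;

// Hypothetical type, used only to illustrate `finish_non_exhaustive`.
#[allow(dead_code)]
struct Session {
    name: String,
    handle: u64, // deliberately left out of the Debug output
}

impl fmt::Debug for Session {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Prints the listed fields, then ".." to signal omitted ones.
        f.debug_struct("Session")
            .field("name", &self.name)
            .finish_non_exhaustive()
    }
}

// bool -> usize without `if cond { 1 } else { 0 }`.
fn mismatch_cost(a: char, b: char) -> usize {
    usize::from(a != b)
}

// Result::map_or_else takes the Err handler first, then the Ok mapper.
fn dir_from_env(var: &str, fallback: &str) -> PathBuf {
    std::env::var(var).map_or_else(|_| PathBuf::from(fallback), PathBuf::from)
}

// Replaces a manual `x >= 0.0 && x <= 1.0` comparison.
fn is_probability(x: f32) -> bool {
    (0.0..=1.0).contains(&x)
}

// Replaces `score.map_or(false, |s| s >= threshold)`.
fn is_confident(score: Option<f32>, threshold: f32) -> bool {
    score.is_some_and(|s| s >= threshold)
}

// Lossless widening via From instead of `x as f32`.
fn widen(x: u8) -> f32 {
    f32::from(x)
}

fn main() {
    let s = Session { name: "paddle".to_string(), handle: 7 };
    println!("{s:?}");
    println!("{}", mismatch_cost('a', 'b'));
    println!("{}", dir_from_env("HYPOTHETICAL_UNSET_VAR_FOR_DEMO", "/tmp/fixtures").display());
    println!("{} {} {}", is_probability(0.96), is_confident(Some(0.95), 0.9), widen(3));
}
```

Behaviour is unchanged in each case; the rewrites only swap hand-written conditionals and `as` casts for the equivalent std helpers.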

Smoke (paddle-onnx, real binary): clean_paragraph OCR verbatim-correct, real
per-region confidence (0.99/0.96/0.95), FTS5 lexical hit on Korean(검색)+
English(embedding), parser_version folds |ocr:1:paddle-onnx:<ver>. Big page
<4s inference (5.6s ingest incl. one-time session load).

T12: README [image.ocr].engine + ARCHITECTURE OCR row + SMOKE paddle-onnx config
+ HANDOFF + HOTFIXES dated entry. Workspace version 0.26.2 → 0.27.0 (minor:
new engine value + config keys). .gitattributes: onnx as plain blobs (no git-lfs).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-04 08:36:10 +00:00
parent 8cc4e6d563
commit 375a0693e4
12 changed files with 114 additions and 55 deletions


@@ -33,7 +33,7 @@ fn cer(gt: &str, pred: &str) -> f64 {
for i in 1..=m {
let mut cur = vec![i; n + 1];
for j in 1..=n {
-let cost = if g[i - 1] == p[j - 1] { 0 } else { 1 };
+let cost = usize::from(g[i - 1] != p[j - 1]);
cur[j] = (prev[j] + 1).min(cur[j - 1] + 1).min(prev[j - 1] + cost);
}
prev = cur;
@@ -42,11 +42,10 @@ fn cer(gt: &str, pred: &str) -> f64 {
}
fn fixture_dir() -> PathBuf {
-std::env::var("KEBAB_TEST_OCR_FIXTURE_DIR")
-.map(PathBuf::from)
-.unwrap_or_else(|_| {
-PathBuf::from("/build/dogfood/corpus/images/synthetic-ocr-bench")
-})
+std::env::var("KEBAB_TEST_OCR_FIXTURE_DIR").map_or_else(
+|_| PathBuf::from("/build/dogfood/corpus/images/synthetic-ocr-bench"),
+PathBuf::from,
+)
}
/// T10: undecodable image bytes must surface as an error (the kebab-app caller