chore(ocr): T11/T12 — clippy clean + docs + v0.27.0 bump

T11: fix 12 clippy lints in paddle_onnx.rs/paddle_e2e.rs (doc overindent,
finish_non_exhaustive, map_or_else, RangeInclusive::contains, cast_lossless,
is_some_and, usize::from). Full-workspace clippy -D warnings = 0.
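
For reference, a minimal sketch of the kind of rewrite these lints ask for.
The `Region` type and the helper functions are hypothetical, made up for
illustration; they are not taken from paddle_onnx.rs or paddle_e2e.rs, and the
0.9 threshold is an assumed value.

```rust
use std::fmt;

// Hypothetical type for illustration only; not the real OCR region struct.
struct Region {
    text: String,
    confidence: Option<f32>,
    #[allow(dead_code)]
    scratch: Vec<u8>, // deliberately left out of the Debug output below
}

impl fmt::Debug for Region {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // finish_non_exhaustive() appends ".." to signal that fields were omitted.
        f.debug_struct("Region")
            .field("text", &self.text)
            .field("confidence", &self.confidence)
            .finish_non_exhaustive()
    }
}

// is_some_and + RangeInclusive::contains instead of
// `map_or(false, |c| c >= 0.9 && c <= 1.0)`.
fn is_confident(r: &Region) -> bool {
    r.confidence.is_some_and(|c| (0.9..=1.0).contains(&c))
}

// map_or_else instead of an `if let Some(..) { .. } else { .. }` expression.
fn label(r: &Region) -> String {
    r.confidence
        .map_or_else(|| "n/a".to_string(), |c| format!("{c:.2}"))
}

// u32::from / usize::from instead of lossless `as` casts.
fn widen(channel: u8, width: u16) -> (u32, usize) {
    (u32::from(channel), usize::from(width))
}
```

Each function keeps the behaviour of the pattern it replaces; only the form
changes, which is why these fixes are safe to batch in a chore commit.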

Smoke (paddle-onnx, real binary): clean_paragraph OCR verbatim-correct, real
per-region confidence (0.99/0.96/0.95), FTS5 lexical hit on Korean(검색)+
English(embedding), parser_version folds |ocr:1:paddle-onnx:<ver>. Big page
<4s inference (5.6s ingest incl. one-time session load).

T12: README [image.ocr].engine + ARCHITECTURE OCR row + SMOKE paddle-onnx config
+ HANDOFF + HOTFIXES dated entry. Workspace version 0.26.2 → 0.27.0 (minor:
new engine value + config keys). .gitattributes: onnx as plain blobs (no git-lfs).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-04 08:36:10 +00:00
parent 8cc4e6d563
commit 375a0693e4
12 changed files with 114 additions and 55 deletions

.gitattributes (vendored), 9 lines changed

@@ -1,3 +1,6 @@
-# PP-OCRv5 ONNX OCR models (paddle-onnx engine) — large binary, store via Git LFS
-# to keep clone / `cargo package` lean. Enable with `git lfs install` before commit.
-*.onnx filter=lfs diff=lfs merge=lfs -text
+# PP-OCRv5 ONNX OCR models (paddle-onnx engine). git-lfs is not installed on
+# this host, so they are committed as plain binary blobs (treated as binary —
+# no textual diff/merge). If/when git-lfs becomes available, migrate with
+# `git lfs migrate import --include='*.onnx'` and restore the filter line:
+# *.onnx filter=lfs diff=lfs merge=lfs -text
+*.onnx -text