feat(ocr): T2-T6 OnnxPaddleOcr core engine — det/rec ONNX + DBNet postproc + CTC
PP-OCRv5 ONNX OCR engine on the pinned ort rc.9 (no Python, no oar-ocr dep).
Implements the recognize() pipeline end-to-end (compiles + unit-tested):
- T2: OnnxPaddleOcr skeleton, OcrEngine impl, det/rec Session loaded once
(Mutex-wrapped → Send+Sync), engine_version = blake3(det+rec+dict) cached
once at construction, dict bounds-check (11945 lines vs 11947 rec classes).
- T2 preproc: det ImageNet mean/std NCHW + limit_side_len 960 → ×32 round
(golden 192x900→896x192 pinned); rec height-48 keep-aspect, (x-0.5)/0.5.
- T3 det postproc: threshold 0.3 → imageproc contours → min-area rect via
pure-Rust rotating calipers + convex hull → mean-prob box-score filter →
pure-Rust unclip(ratio 1.5). No clipper2/OpenCV.
- T4 crop+rectify: corner ordering + bilinear perspective warp to horizontal.
- T5 rec+CTC: greedy decode with the T0a-confirmed mapping
(idx0=blank, 1..=11945=dict[idx-1], 11946=space), rec-class bounds-check.
- T6 assembly: reading-order OcrText with per-region bbox + real confidence.
Unit tests (4 pass): det_target_dims golden, convex hull, min-area rect,
unclip expansion. Large *.onnx assets stay untracked pending T12 LFS decision.
Remaining: T7 config overrides, T8 factory (4 sites), T9 signature cascade,
T10 error matrix, T11 gates (clippy/e2e CER), T12 docs+bump+PR.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>