Files
kebab/docs/wire-schema/v1/ocr_failures.schema.json
altair823 d9ec7b8dc3 feat(cli): kebab inspect ocr-stats + ocr-failures (Enhancement 3 + wire schema additive minor)
Two new wire schemas land as additive minor: ocr_stats.v1 (corpus-wide
aggregate — total_events, success_rate, p50/p90/p99/max_ms, by_engine,
top-10 by_doc by failure count) and ocr_failures.v1 (per-doc or
corpus-wide recent failures, with --doc-id + --limit). Both ship via
new CLI subcommands `kebab inspect ocr-stats` / `inspect ocr-failures`.

App gains four facade methods: inspect_ocr_stats /
inspect_ocr_failures plus their *_with_config companions — required by
CLAUDE.md "the facade rule" so `--config <path>` is honored. The CLI
dispatch arms thread cfg explicitly into the _with_config form.

Runtime introspection emit (WIRE_SCHEMAS in schema.rs) gains two
entries; the meta JSON Schema (schema.schema.json) is untouched
because its wire.schemas is pattern-based, not enum-based.

ingest_log::percentiles extended to (p50, p90, p99, max). p99 surfaces
only via inspect ocr-stats; IngestSummary (round 1) stays 3-percentile.

SKILL.md synced with the two new schemas (AC-13).

Closure r2 G2 (facade *_with_config pair) + G3 (runtime emit, not
meta schema file) + closure r1 F4 (p99) resolved.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-28 06:13:08 +00:00

25 lines
692 B
JSON

{
"$schema": "http://json-schema.org/draft-07/schema#",
"title": "ocr_failures.v1",
"type": "object",
"properties": {
"schema_version": { "const": "ocr_failures.v1" },
"doc_id": { "type": ["string", "null"] },
"failure_count": { "type": "integer" },
"failures": {
"type": "array",
"items": {
"type": "object",
"properties": {
"ts": { "type": "string" },
"page": { "type": "integer" },
"ms": { "type": "integer" },
"reason": { "type": "string" },
"image_byte_size": { "type": ["integer", "null"] }
}
}
}
},
"required": ["schema_version", "failure_count", "failures"]
}