fix(rag): S3 NLI unavailable — hypothesis char budget + token-count fallback retry
S3 dogfood query 의 `nli_model_unavailable` consistent fail root cause = mDeBERTa-v3 tokenizer 의 `OnlyFirst` strategy + 949-token hypothesis. 기존 char-budget 단독 fix 의 KR-extreme density 미해결 → token-count fallback retry + RC1-residual trait dispatch 정합.
핵심 변경:
- kebab-nli::NliVerifier: `hypothesis_token_count(&str) -> Result<usize>` trait method 추가 (default `Ok(0)` backward-compat). `OnnxNliVerifier` 가 *trait impl block* 안에서 real mDeBERTa tokenize override — vtable 등록 보장 (round-3 critic RC1-residual closure).
- kebab-rag::pipeline: `MAX_NLI_HYPOTHESIS_CHARS_INITIAL = 1200` + `MAX_NLI_HYPOTHESIS_CHARS_MIN = 150` const + `pub(crate) fn truncate_chars` pure-fn + `pub fn truncate_hypothesis_for_nli_with_budget` retry helper (char budget 반감 retry, min floor 시 graceful unavailable). step 8.5 hook 의 callsite explicit `match` + `return self.refuse_nli_model_unavailable` 패턴 (`?` 금지 — round-2 plan critic CRITICAL #1 closure).
- SpyNliVerifier 신규 helper (closure score_fn + hypothesis_token_count_fn, 2-arg constructor).
- §5.1 의 2 ignored test (EN-long err + vtable dispatch RC1-residual pin) + §5.2 의 4 boundary test (truncate_chars) + §5.3 의 3 mock multi-hop test (long_en_grounded / long_kr_retries / unrelenting_fallback). +7 new tests (2 ignored default skip).
- tasks/HOTFIXES.md 신규 dated entry `## 2026-05-26 — S3 NLI unavailable ...` — Symptom / Root cause / Action / Amends 4-block.
- spec + plan (`docs/superpowers/{specs,plans}/2026-05-26-s3-nli-model-unavailable-diagnose-*.md`) — 4 round spec + 3 round plan OMC reviewer ACCEPT 산출물.
검증:
- cargo test -p kebab-nli -j 1 → 11/11 pass + 7 ignored default skip.
- cargo test -p kebab-rag -j 1 → 19+3+3+... 전체 pass + 3 new mock + 4 new boundary.
- cargo test --workspace --no-fail-fast -j 1 → **1313 pass (+7 new)**, 0 failed. 회귀 0 (HOTFIX #15 이미 fixed, no remaining flaky).
- cargo clippy --workspace --all-targets -j 1 -- -D warnings clean (type_complexity allow on Arc<dyn Fn> type aliases).
KR safe (token-count retry path) + graceful fallback (min floor 시 기존 unavailable wire 유지, regression 0). Wire 영향 없음 (additive trait method). Cargo bump 불필요.
Refs:
- spec: docs/superpowers/specs/2026-05-26-s3-nli-model-unavailable-diagnose-spec.md (4 round APPROVE — analyst → critic + verifier × 4 rounds)
- plan: docs/superpowers/plans/2026-05-26-s3-nli-model-unavailable-diagnose-plan.md (3 round ACCEPT — planner → critic-plan + verifier-plan × 3 rounds)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>