fix(rag): S3 NLI unavailable — hypothesis char budget + token-count fallback retry
Root cause of the consistent `nli_model_unavailable` failure on the S3 dogfood query: the mDeBERTa-v3 tokenizer's `OnlyFirst` strategy combined with a 949-token hypothesis. The earlier char-budget-only fix did not cover extreme Korean (KR) token density, so this change adds a token-count fallback retry and brings trait dispatch in line with the RC1-residual finding.

Key changes:
- kebab-nli::NliVerifier: add trait method `hypothesis_token_count(&str) -> Result<usize>` (default `Ok(0)` for backward compat). `OnnxNliVerifier` overrides it with real mDeBERTa tokenization *inside the trait impl block*, which guarantees vtable registration (closes round-3 critic RC1-residual).
- kebab-rag::pipeline: consts `MAX_NLI_HYPOTHESIS_CHARS_INITIAL = 1200` and `MAX_NLI_HYPOTHESIS_CHARS_MIN = 150`, pure fn `pub(crate) fn truncate_chars`, and retry helper `pub fn truncate_hypothesis_for_nli_with_budget` (halves the char budget on each retry; returns a graceful unavailable once the min floor is reached). The step 8.5 hook callsite uses an explicit `match` + `return self.refuse_nli_model_unavailable` pattern (`?` is forbidden; closes round-2 plan critic CRITICAL #1).
- New SpyNliVerifier helper (closures score_fn + hypothesis_token_count_fn, 2-arg constructor).
- 2 ignored tests from §5.1 (EN-long err + vtable dispatch RC1-residual pin) + 4 boundary tests from §5.2 (truncate_chars) + 3 mock multi-hop tests from §5.3 (long_en_grounded / long_kr_retries / unrelenting_fallback). +7 new tests (the 2 ignored ones are skipped by default).
- tasks/HOTFIXES.md: new dated entry `## 2026-05-26 — S3 NLI unavailable ...` with four blocks: Symptom / Root cause / Action / Amends.
- spec + plan (`docs/superpowers/{specs,plans}/2026-05-26-s3-nli-model-unavailable-diagnose-*.md`): artifacts accepted by the OMC reviewers after 4 spec rounds and 3 plan rounds.

Verification:
- cargo test -p kebab-nli -j 1 → 11/11 pass + 7 ignored (skipped by default).
- cargo test -p kebab-rag -j 1 → 19+3+3+... all pass + 3 new mock + 4 new boundary.
- cargo test --workspace --no-fail-fast -j 1 → **1313 pass (+7 new)**, 0 failed. No regressions (HOTFIX #15 already fixed, no remaining flaky tests).
- cargo clippy --workspace --all-targets -j 1 -- -D warnings clean (type_complexity allow on Arc<dyn Fn> type aliases).

KR input is safe (token-count retry path), with a graceful fallback (at the min floor the existing unavailable wire response is kept; zero regressions). No wire impact (additive trait method). No Cargo version bump needed.

Refs:
- spec: docs/superpowers/specs/2026-05-26-s3-nli-model-unavailable-diagnose-spec.md (4-round APPROVE: analyst → critic + verifier × 4 rounds)
- plan: docs/superpowers/plans/2026-05-26-s3-nli-model-unavailable-diagnose-plan.md (3-round ACCEPT: planner → critic-plan + verifier-plan × 3 rounds)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
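The retry behavior described in the message (start at a char budget, halve it while the token count is still too high, give up at the floor) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in kebab-rag: the two budget constants come from the message, but the signature of `truncate_chars`, the name `truncate_with_budget_sketch`, the `ASSUMED_MAX_HYPOTHESIS_TOKENS` limit, and the `Result<usize, String>` error type are all hypothetical stand-ins.

```rust
/// Char budgets quoted in the commit message.
const MAX_NLI_HYPOTHESIS_CHARS_INITIAL: usize = 1200;
const MAX_NLI_HYPOTHESIS_CHARS_MIN: usize = 150;
/// ASSUMPTION: hypothetical token limit; the commit message does not state one.
const ASSUMED_MAX_HYPOTHESIS_TOKENS: usize = 256;

/// Truncate to at most `max_chars` characters (not bytes), so multi-byte
/// Korean text is never cut in the middle of a code point.
/// ASSUMPTION: the real `truncate_chars` signature is not shown in the diff.
fn truncate_chars(s: &str, max_chars: usize) -> &str {
    match s.char_indices().nth(max_chars) {
        Some((byte_idx, _)) => &s[..byte_idx],
        None => s, // already within budget
    }
}

/// Hypothetical retry helper: returns `Some(truncated)` as soon as the
/// token counter reports a count within the limit, or `None` when the
/// floor budget is still too long or the counter fails.
fn truncate_with_budget_sketch<F>(hypothesis: &str, token_count: F) -> Option<String>
where
    F: Fn(&str) -> Result<usize, String>,
{
    let mut budget = MAX_NLI_HYPOTHESIS_CHARS_INITIAL;
    loop {
        let candidate = truncate_chars(hypothesis, budget);
        match token_count(candidate) {
            // Fits: use this candidate.
            Ok(n) if n <= ASSUMED_MAX_HYPOTHESIS_TOKENS => {
                return Some(candidate.to_string());
            }
            // Too long but budget can still shrink: halve it, clamped to the floor.
            Ok(_) if budget > MAX_NLI_HYPOTHESIS_CHARS_MIN => {
                budget = (budget / 2).max(MAX_NLI_HYPOTHESIS_CHARS_MIN);
            }
            // Too long at the floor, or the counter errored: caller falls back.
            _ => return None,
        }
    }
}
```

With a counter that always returns `Ok(0)` (the trait default mentioned in the message), the first candidate is accepted, so the sketch degrades to a plain char-budget truncation. In this sketch a `None` result corresponds to the caller keeping the existing unavailable response.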
@@ -455,3 +455,52 @@ impl NliVerifier for MockNliVerifier {
        }
    }
}

/// S3 follow-up (2026-05-26): closure type aliases for `SpyNliVerifier`
/// fields, to avoid clippy `type_complexity`.
#[allow(clippy::type_complexity)]
pub type ScoreFn = Arc<dyn Fn(&str, &str) -> anyhow::Result<NliScores> + Send + Sync>;
#[allow(clippy::type_complexity)]
pub type HypothesisTokenCountFn = Arc<dyn Fn(&str) -> anyhow::Result<usize> + Send + Sync>;

/// S3 follow-up (2026-05-26): closure-based NLI verifier. The caller can
/// define two closures, `(premise, hypothesis) -> Result<NliScores>` and
/// `(hypothesis) -> Result<usize>`, and the spy captures the inputs it
/// receives. Sibling of the existing `MockNliVerifier` (fixed mode). It
/// exists so that a single verifier can inject both the token-count
/// simulation for the `truncate_hypothesis_for_nli_with_budget` retry loop
/// and the final score behavior.
pub struct SpyNliVerifier {
    pub score_fn: ScoreFn,
    pub hypothesis_token_count_fn: HypothesisTokenCountFn,
    pub received_premises: Mutex<Vec<String>>,
    pub received_hypotheses: Mutex<Vec<String>>,
}

impl SpyNliVerifier {
    /// 2-arg constructor: both score and hypothesis_token_count are injected
    /// as closures. They are passed together at construction time because
    /// fields behind an `Arc<T>` cannot be mutated afterwards.
    pub fn new<F, G>(score_fn: F, token_count_fn: G) -> Arc<Self>
    where
        F: Fn(&str, &str) -> anyhow::Result<NliScores> + Send + Sync + 'static,
        G: Fn(&str) -> anyhow::Result<usize> + Send + Sync + 'static,
    {
        Arc::new(Self {
            score_fn: Arc::new(score_fn),
            hypothesis_token_count_fn: Arc::new(token_count_fn),
            received_premises: Mutex::new(Vec::new()),
            received_hypotheses: Mutex::new(Vec::new()),
        })
    }
}

impl NliVerifier for SpyNliVerifier {
    fn score(&self, premise: &str, hypothesis: &str) -> anyhow::Result<NliScores> {
        self.received_premises.lock().unwrap().push(premise.to_string());
        self.received_hypotheses.lock().unwrap().push(hypothesis.to_string());
        (self.score_fn)(premise, hypothesis)
    }

    fn hypothesis_token_count(&self, hypothesis: &str) -> anyhow::Result<usize> {
        (self.hypothesis_token_count_fn)(hypothesis)
    }
}
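The commit message stresses that `hypothesis_token_count` must be overridden inside the trait impl block so that calls through a trait object reach it. One plausible reading of that concern is sketched below: a method with the same name defined only in an inherent `impl` is invisible to `dyn` dispatch, which then falls back to the trait default. The names `NliVerifierSketch`, `InherentOnly`, `TraitOverride`, and `count_via_dyn` are hypothetical, whitespace splitting stands in for real tokenization, and `Result<usize, String>` replaces `anyhow::Result` to keep the sketch dependency-free.

```rust
/// Hypothetical stand-in for the real trait; only the default method matters here.
trait NliVerifierSketch {
    /// Default mirrors the backward-compatible `Ok(0)` from the commit message.
    fn hypothesis_token_count(&self, _hypothesis: &str) -> Result<usize, String> {
        Ok(0)
    }
}

/// Defines the counter only as an inherent method.
struct InherentOnly;

impl InherentOnly {
    // Inherent method: same name, but NOT part of the trait's vtable.
    fn hypothesis_token_count(&self, hypothesis: &str) -> Result<usize, String> {
        Ok(hypothesis.split_whitespace().count())
    }
}

// Empty trait impl: the trait object keeps the default `Ok(0)`.
impl NliVerifierSketch for InherentOnly {}

/// Overrides the counter inside the trait impl block.
struct TraitOverride;

impl NliVerifierSketch for TraitOverride {
    // Trait-impl override: this is what `dyn NliVerifierSketch` dispatches to.
    fn hypothesis_token_count(&self, hypothesis: &str) -> Result<usize, String> {
        Ok(hypothesis.split_whitespace().count())
    }
}

/// Calls the method through a trait object, as a pipeline holding
/// a `dyn` verifier would.
fn count_via_dyn(verifier: &dyn NliVerifierSketch, hypothesis: &str) -> Result<usize, String> {
    verifier.hypothesis_token_count(hypothesis)
}
```

Calling `InherentOnly.hypothesis_token_count(..)` directly resolves to the inherent method, so a unit test written against the concrete type would pass while the `dyn` path silently used the default. A test that goes through the trait object, as `count_via_dyn` does, pins the override.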