feat(nli): fb-41 PR-9b — OnnxNliVerifier ONNX inference + model download #177

altair823 · 2026-05-25T22:19:05Z

altair823 commented

2026-05-25 22:19:05 +00:00

Summary

Implements real ONNX inference on top of the trait surface from PR-9a. Download, caching, and the forward pass for the mDeBERTa-v3 XNLI model all work. There are still zero callers (PR-9c-2 adds the hook in ask_multi_hop).

Design: docs/superpowers/specs/2026-05-25-p9-fb-41-finalize-spec.md (§3 PR-9b)
Plan: docs/superpowers/plans/2026-05-25-p9-fb-41-finalize-plan.md (§3)

Changes

Activate Cargo deps (commit 1)

Adds ort, tokenizers, hf-hub, ndarray, and tracing to crates/kebab-nli/Cargo.toml (all via the workspace; PR-9a had only declared them in workspace.dependencies).

OnnxNliVerifier real implementation (commit 2)

Fields: model_id, cache_dir (XDG model_dir/nli/<sanitized>), session: OnceLock<ort::Session>, tokenizer: OnceLock<tokenizers::Tokenizer>.
new(&Config): deliberately limits eager errors to the cache_dir create_dir_all step. The actual model load is lazy.
ensure_loaded(): on the first call, hf-hub download (logging cache hit / miss) + Tokenizer::from_file + Session::commit_from_file + truncation params (OnlyFirst, max_length=512; truncates from the end of the premise so the hypothesis is preserved).
score(premise, hypothesis): defense-in-depth bail on an empty hypothesis → tokenizer.encode (pair) → ndarray Array2<i64> [1, seq_len] → Session::run → logits[1, 3] → NliScores::from_xnli_logits.
sanitize_model_id: / → _.
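The lazy-load shape described in the list above can be sketched with the standard library alone. This is a minimal illustration, not the crate's code: `FakeSession`, `LazyVerifier`, `load_session`, and the `model.onnx` file name are hypothetical stand-ins, and the real verifier holds an `ort::Session` and a tokenizer instead. Only the `/` → `_` rule and the `model_dir/nli/<sanitized>` layout come from the description.

```rust
use std::path::{Path, PathBuf};
use std::sync::OnceLock;

/// Hypothetical stand-in for the real inference session.
struct FakeSession {
    model_path: PathBuf,
}

/// Hypothetical verifier: cheap to construct, loads the model on first use.
struct LazyVerifier {
    cache_dir: PathBuf,
    session: OnceLock<FakeSession>,
}

/// Replace `/` with `_` so a model id becomes a single path component.
fn sanitize_model_id(model_id: &str) -> String {
    model_id.replace('/', "_")
}

/// Assumed loader; a real one would download and open the model file.
fn load_session(cache_dir: &Path) -> Result<FakeSession, String> {
    Ok(FakeSession {
        model_path: cache_dir.join("model.onnx"), // file name is an assumption
    })
}

impl LazyVerifier {
    /// Only computes the cache path; nothing is loaded here.
    fn new(model_root: &Path, model_id: &str) -> Self {
        Self {
            cache_dir: model_root.join("nli").join(sanitize_model_id(model_id)),
            session: OnceLock::new(),
        }
    }

    /// Loads on first call. The fallible load runs outside the cell because
    /// `OnceLock::get_or_try_init` is not stable; if two threads race, the
    /// loser's freshly loaded value is dropped and the winner's is returned.
    fn ensure_loaded(&self) -> Result<&FakeSession, String> {
        if let Some(session) = self.session.get() {
            return Ok(session);
        }
        let loaded = load_session(&self.cache_dir)?;
        Ok(self.session.get_or_init(|| loaded))
    }
}
```

With this shape, a failed load leaves the cell empty, so a later call can retry instead of caching the error.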

Test 2 expectation correction (commit 3)

The expectation entailment < 0.3 in spec §3 PR-9b was too strict. For the two caffeine facts ("Caffeine is a stimulant." vs "The chemical formula of caffeine is C8H10N4O2."), the measured NLI scores are neutral 0.53, entailment 0.43, contradiction 0.05: neither sentence entails the other, but they do not contradict each other either, so the pair is neutral. The spirit of the spec ("neutral/contradiction is the winning channel") is kept and only the assertion changes, to neutral > entailment && neutral > contradiction.
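To make the change concrete, here are the two assertions as predicates over the measured scores. The `Scores` struct and both function names are hypothetical; only the thresholds and the measured values (0.4251 / 0.5256 / 0.0493) come from this PR.

```rust
/// Hypothetical score triple; the crate's real type is `NliScores`.
#[derive(Debug, Clone, Copy)]
struct Scores {
    entailment: f32,
    neutral: f32,
    contradiction: f32,
}

/// The original spec expectation: an absolute cap on entailment.
fn old_expectation(s: Scores) -> bool {
    s.entailment < 0.3
}

/// The corrected expectation: neutral must be the winning channel.
fn new_expectation(s: Scores) -> bool {
    s.neutral > s.entailment && s.neutral > s.contradiction
}

/// Scores measured for the two caffeine facts in the manual smoke run.
fn measured_caffeine_pair() -> Scores {
    Scores {
        entailment: 0.4251,
        neutral: 0.5256,
        contradiction: 0.0493,
    }
}
```

On the measured pair the old predicate is false and the new one is true. The relative check does not depend on where the model's probability mass happens to sit.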

Tests (5 #[ignore] integration in `crates/kebab-nli/tests/inference.rs`)

Skipped by the default cargo test (to avoid CI load). Run manually with --ignored only.

Verification

Unit (`cargo test -p kebab-nli -j 1`)

The 4 lib.rs unit tests from PR-9a + PR-9b's onnx::tests::new_succeeds_on_default_config + score_empty_hypothesis_returns_err (renamed from PR-9a's score_returns_err_in_skeleton, preserving its meaning) → 6 passed / 0 failed.

Clippy

cargo clippy -p kebab-nli --all-targets -j 1 -- -D warnings clean.

Workspace

cargo test --workspace --no-fail-fast -j 1: 1 pre-existing failure, kebab-mcp::tools_call_ask_multi_hop::ask_tool_routes_multi_hop_true_to_decompose_first (the same failure was confirmed on the main baseline during PR-9a; HOTFIXES candidate; unrelated to PR-9b). Everything else passes.

Manual `--ignored` smoke (must be attached to PR-9b, sequential)

cargo test -p kebab-nli -j 1 --test inference -- --ignored --test-threads=1

Result: 5/5 PASS (9.00s):

| test | premise → hypothesis | scores | verdict |
|---|---|---|---|
| `en_self_entailment_high_score` | "Caffeine is a stimulant." → same sentence | entailment=0.9928, neutral=0.0067, contradiction=0.0005 | ✓ (>0.8) |
| `en_unrelated_low_entailment` | caffeine → C8H10N4O2 fact | entailment=0.4251, neutral=0.5256 wins, contradiction=0.0493 | ✓ (neutral wins) |
| `ko_entailment_high_score` | "사과는 빨갛다." → "사과는 색이 있다." ("Apples are red." → "Apples have a color.") | entailment=0.9923, neutral=0.0062, contradiction=0.0015 | ✓ (>0.5) |
| `long_premise_truncates_without_panic` | "foo bar baz " × 2000 (24000 chars) → "foo" | entailment=0.4541, neutral=0.2412, contradiction=0.3048 | ✓ (Ok, no panic) |
| `empty_hypothesis_returns_err` | "anything" → "" | (none) | ✓ (err msg "empty hypothesis") |
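The three scores in the rows for the first two tests each sum to 1.0000, which is what a softmax over the [1, 3] logits would produce. The sketch below assumes that `NliScores::from_xnli_logits` does something of this kind; its actual body is not shown in this PR. `softmax3` and `check_logits_shape` are hypothetical names, and which logit index maps to which label is model-specific and not taken from the source.

```rust
/// Numerically stable softmax over exactly three logits.
fn softmax3(logits: [f32; 3]) -> [f32; 3] {
    // Subtract the maximum first so `exp` cannot overflow.
    let max = logits.iter().copied().fold(f32::NEG_INFINITY, f32::max);
    let exps = logits.map(|x| (x - max).exp());
    let sum: f32 = exps.iter().sum();
    exps.map(|e| e / sum)
}

/// Reject any output that is not one row of three logits, so that a
/// swapped-in model with a different head fails loudly.
fn check_logits_shape(shape: &[usize]) -> Result<(), String> {
    if shape == [1, 3] {
        Ok(())
    } else {
        Err(format!("unexpected logits shape {shape:?}, expected [1, 3]"))
    }
}
```

Softmax preserves the ordering of the logits, so the "winning channel" can be read from either the logits or the probabilities.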

Key finding: the default multi-threaded cargo test conflicts with hf-hub's file lock; concurrent attempts to download the same model fail on the lock. --test-threads=1 is recommended. The mock-based unit tests in PR-9c-2 are fine multi-threaded (no model dependency).
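This PR sidesteps the conflict with --test-threads=1. One alternative, not used here, is to funnel every test in a binary through a single process-wide cell so that the download step runs at most once however many test threads ask for the model. The sketch below is std-only; `fake_download`, `shared_model_path`, and the `model.onnx` name are hypothetical, and a real version would call the hf-hub download inside the initializer.

```rust
use std::path::PathBuf;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Counts how often the (fake) download actually ran.
static DOWNLOADS: AtomicUsize = AtomicUsize::new(0);

/// Process-wide cell holding the resolved model path.
static MODEL_PATH: OnceLock<PathBuf> = OnceLock::new();

/// Hypothetical stand-in for the real download step.
fn fake_download() -> PathBuf {
    DOWNLOADS.fetch_add(1, Ordering::SeqCst);
    PathBuf::from("model.onnx")
}

/// `OnceLock::get_or_init` runs the initializer once; concurrent callers
/// block until it finishes and then share the same value.
fn shared_model_path() -> &'static PathBuf {
    MODEL_PATH.get_or_init(fake_download)
}
```

This only serializes threads inside one process. Separate test binaries running at the same time would still contend on the on-disk lock.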

Out of scope

Pipeline integration (the ask_multi_hop step 8.5 hook in PR-9c-2).
Config knob ([models.nli].model + [rag] nli_threshold): PR-9c-1.
RefusalReason variants + wire schema: PR-9c-1.
Dogfood retest: PR-9d.

Test Plan

cargo test -p kebab-nli -j 1: 6/6 unit pass.
cargo clippy -p kebab-nli --all-targets -j 1 -- -D warnings clean.
cargo test --workspace -j 1: 0 regressions (1 pre-existing failure, same as the baseline at PR-9a, unrelated to PR-9b).
cargo test -p kebab-nli --test inference -- --ignored --test-threads=1: 5/5 PASS + measured values table inline.

Assisted-by: Claude Code


altair823 added 3 commits 2026-05-25 22:19:06 +00:00

feat(nli): fb-41 PR-9b prep — activate ort/tokenizers/hf-hub/ndarray/tracing deps in kebab-nli 93436f9eca

Activates in kebab-nli/Cargo.toml the 5 crate dependencies that PR-9a had only declared in workspace.dependencies. PR-9b's real OnnxNliVerifier implementation can build on top of this commit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

feat(nli): fb-41 PR-9b - OnnxNliVerifier ONNX inference + model download b807fd5aa5

- OnnxNliVerifier fields: model_id, cache_dir (XDG model_dir/nli/<sanitized>), session/tokenizer OnceLock.
- new(): eager cache_dir setup only; the actual model download + Session::commit_from_file are done lazily by ensure_loaded() on the first score call.
- score(): ensure_loaded → tokenizer.encode(pair, OnlyFirst truncation max_length=512) → ndarray Array2<i64> → ort::Session::run → logits[1,3] → NliScores::from_xnli_logits.
- empty hypothesis edge: defense-in-depth bail (in addition to the caller-side skip in spec §2.3).
- sanitize_model_id helper: "/" → "_".
- 5 #[ignore] integration tests (EN self-entailment, EN unrelated, KR entailment, long premise truncation, empty hypothesis err), with the manual smoke results attached to the PR description.

Cargo.toml: enables the `download-binaries` feature on kebab-nli's ort dep (follow-up to the PR-9b prep commit). In a standalone `cargo test -p kebab-nli`, the per-crate feature union does not include fastembed, so ort/download-binaries is OFF and the ort-sys link fails; kebab-nli has to turn the feature on explicitly for a standalone build to link the ONNX runtime. In a full workspace build it is unioned with fastembed's identical opt-in, so there are no side effects.

Verification:
- cargo test -p kebab-nli -j 1: the 6 unit tests from PR-9a pass (`score_returns_err_in_skeleton` → `score_empty_hypothesis_returns_err`, updated from the stub to the real path; count unchanged).
- cargo clippy -p kebab-nli --all-targets -- -D warnings clean.
- cargo build --workspace -j 1: 0 regressions.
- Manual --ignored smoke results attached to the PR body.

Wire impact: none (crate-internal).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

chore(nli): correct the expectation of PR-9b inference test 2 ab3408cb49

The previous expectation `entailment < 0.3` was too strict. mDeBERTa-v3 multilingual NLI scores the two caffeine facts (premise: "Caffeine is a stimulant.", hypothesis: "The chemical formula of caffeine is C8H10N4O2.") at 0.53 *neutral* and 0.43 entailment (neither entails the other, but they do not contradict each other either, which is exactly neutral).

The *spirit* of spec §3 PR-9b's "low entailment, with neutral/contradiction as the winning channel" is that *neutral is the max*. The expectation is changed to `s.neutral > s.entailment && s.neutral > s.contradiction`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

claude-reviewer-01 requested changes 2026-05-25 22:20:36 +00:00

Dismissed

claude-reviewer-01 left a comment

Round 1: review of the real OnnxNliVerifier implementation.

Positives (deliberately not inline):

Module-level doc + per-section constants (DEFAULT_MODEL_ID / HF_MODEL_FILE / NLI_CACHE_SUBDIR / LOGITS_LEN / MAX_TOKENS) remove magic numbers; a model swap is a single-site change.
Use of expand_path matches the existing XDG path expansion pattern in kebab-config (the same two-step as kebab-embed-local).
The race comment in ensure_loaded (an alternative to the not-yet-stable OnceLock::get_or_try_init) and the justification for two separate OnceLocks are clear.
score has the defense-in-depth empty hypothesis bail + a defensive logits shape check (bail on != [1, 3], which avoids silently emitting wrong scores after a model swap).
Truncation OnlyFirst + direction Right (the premise is cut from its end, the hypothesis is preserved) correctly implements the core invariant of spec §2.2.3.
Rich error context (with_context(|| format!("..."))).
The test 2 expectation correction commit preserves the spec's spirit ("neutral wins") and states the trade-off in a comment.
Manual --ignored smoke 5/5 PASS (sequential), with the measured NliScores dump inline.
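The truncation invariant praised above (only the premise shrinks, from its right end, and the hypothesis survives intact) can be illustrated on plain token-id vectors. This is a simplified model, not the `tokenizers` implementation: `truncate_only_first` is a hypothetical name, the special-token count is passed in as an assumption, and the error case is a simplification of what the library reports when the first sequence is too short to absorb the cut.

```rust
/// Trim `premise` from the right until premise + hypothesis + special tokens
/// fit in `max_length`. The hypothesis is never modified.
fn truncate_only_first(
    premise: &mut Vec<u32>,
    hypothesis: &[u32],
    max_length: usize,
    special_tokens: usize, // assumed count of added special tokens for a pair
) -> Result<(), String> {
    let total = premise.len() + hypothesis.len() + special_tokens;
    if total <= max_length {
        return Ok(()); // already fits, nothing to cut
    }
    let to_remove = total - max_length;
    if premise.len() <= to_remove {
        // The premise alone cannot absorb the overflow.
        return Err(format!(
            "premise has {} tokens but {} must be removed",
            premise.len(),
            to_remove
        ));
    }
    // Direction Right: keep the head of the premise, drop its tail.
    premise.truncate(premise.len() - to_remove);
    Ok(())
}
```

For a 600-token premise, a 10-token hypothesis, 3 assumed special tokens, and max_length=512, the premise keeps its first 499 tokens and the pair fills the budget exactly.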

Two small nits inline. Both are minor (no functional regression, follow-up OK):

N1: the cache-hit check in fetch actually triggers the download → misleading log.
N2: new_succeeds_on_default_config creates a real XDG directory → test pollution.

Both are advisory, since they can be cleaned up together when PR-9c-1 introduces the [models.nli] config. However, the misleading fetch log causes confusion during diagnosis, so addressing it within round 1 is recommended.


crates/kebab-nli/src/onnx.rs

						
@@ -26,0 +155,4 @@
            ))
            .get(filename)
            .ok();
        if cache_path.is_some() {

claude-reviewer-01 commented

2026-05-25 22:20:36 +00:00

N1 (minor): the cache-hit check path actually triggers the download. ApiRepo::get in api.repo(...).get(filename).ok() on line 151 downloads on a cache miss and then returns the local path. So on a cache miss this line triggers the download, and the second repo.get(filename) on line 175 is an idempotent cache hit. Result: the log "NLI artifact cache hit" is printed right after the download has just happened, which is misleading when users diagnose problems.

Fix options:

hf_hub::Cache::new(cache_dir).repo(repo).get(filename): Cache::get only does an fs lookup, with no download. This gives an accurate hit/miss signal.
Or simply check that the file exists with std::fs::metadata(cache_dir.join(<blob>)) (the blob path in the HF cache layout is known).

The actual download happens once (the second call is idempotent), so there is no functional regression; only log accuracy is affected. Severity minor.
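The underlying rule is that a hit/miss probe must have no side effects. The std-only sketch below models both variants; it does not use hf-hub and does not mirror the real HF cache layout. `is_cached`, `get_or_download`, `fetch`, and `fetch_buggy` are hypothetical names, and the "download" is just a stub file write.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Side-effect-free probe: only looks at the filesystem.
fn is_cached(cache_dir: &Path, filename: &str) -> bool {
    cache_dir.join(filename).is_file()
}

/// Stand-in for a download-on-miss getter: creates the file if it is absent.
fn get_or_download(cache_dir: &Path, filename: &str) -> io::Result<PathBuf> {
    let path = cache_dir.join(filename);
    if !path.is_file() {
        fs::create_dir_all(cache_dir)?;
        fs::write(&path, b"stub model bytes")?;
    }
    Ok(path)
}

/// Correct order: probe first, then fetch. Returns (was_cache_hit, path).
fn fetch(cache_dir: &Path, filename: &str) -> io::Result<(bool, PathBuf)> {
    let hit = is_cached(cache_dir, filename);
    let path = get_or_download(cache_dir, filename)?;
    Ok((hit, path))
}

/// The N1 shape: the "probe" is itself the downloading getter, so it reports
/// a hit even when it has just fetched the file.
fn fetch_buggy(cache_dir: &Path, filename: &str) -> io::Result<(bool, PathBuf)> {
    let hit = get_or_download(cache_dir, filename).ok().is_some();
    let path = get_or_download(cache_dir, filename)?;
    Ok((hit, path))
}
```

On an empty cache directory `fetch` reports a miss and then a hit on the second call, while `fetch_buggy` reports a hit both times.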


crates/kebab-nli/src/onnx.rs Outdated

						
@@ -41,17 +295,29 @@ mod tests {
    fn new_succeeds_on_default_config() {
        let cfg = Config::defaults();
        let v = OnnxNliVerifier::new(&cfg).expect("new should succeed on default config");

claude-reviewer-01 commented

2026-05-25 22:20:36 +00:00

N2 (minor): new_succeeds_on_default_config calls create_dir_all on the real directory under the XDG model_dir from Config::defaults() (~/.local/share/kebab/models/nli/Xenova_...). The test does not clean up, so this is test pollution. Only the directory is created and no model is fetched, so the disk impact is negligible, but it violates the isolated-test principle.

Fix: use tempfile::tempdir() for a temporary directory and override Config's storage.data_dir / model_dir with the temporary path. Or redirect via an environment variable (if something like KEBAB_DATA_DIR exists).

Severity minor; this can be cleaned up together when PR-9c-1 introduces the [models.nli] config knob.
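A std-only sketch of the isolation pattern: a scratch directory that removes itself on drop, plus a `{data_dir}` substitution so the model directory lands under it. `ScratchDir` and `expand_model_dir` are hypothetical names. In the crate the same roles would be played by `tempfile::tempdir()` and kebab-config's `expand_path`, whose signatures are not shown in this PR.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::sync::atomic::{AtomicU64, Ordering};

/// Distinguishes scratch directories created within one process.
static NEXT_ID: AtomicU64 = AtomicU64::new(0);

/// Temporary directory that is deleted when the guard is dropped.
struct ScratchDir {
    path: PathBuf,
}

impl ScratchDir {
    fn new(prefix: &str) -> io::Result<Self> {
        let n = NEXT_ID.fetch_add(1, Ordering::Relaxed);
        let name = format!("{prefix}-{}-{n}", std::process::id());
        let path = std::env::temp_dir().join(name);
        fs::create_dir_all(&path)?;
        Ok(Self { path })
    }

    fn path(&self) -> &Path {
        &self.path
    }
}

impl Drop for ScratchDir {
    fn drop(&mut self) {
        // Best-effort cleanup; errors are ignored on purpose.
        let _ = fs::remove_dir_all(&self.path);
    }
}

/// Assumed substitution rule: replace the literal `{data_dir}` placeholder.
fn expand_model_dir(template: &str, data_dir: &Path) -> PathBuf {
    PathBuf::from(template.replace("{data_dir}", &data_dir.to_string_lossy()))
}
```

Keeping the `{data_dir}/models` template and overriding only the data directory means the substitution path is still exercised by the test.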


altair823 referenced this issue from a commit

2026-05-25 22:22:36 +00:00

chore(nli): address PR #177 round 1 review (N1 cache-hit probe + N2 test pollution)

altair823 added 1 commit 2026-05-25 22:22:36 +00:00

chore(nli): address PR #177 round 1 review (N1 cache-hit probe + N2 test pollution) 6ffbe0a5a3

- N1: the cache-hit check path in fetch actually triggered the download (ApiRepo::get downloads on a cache miss and then returns the path). The "NLI artifact cache hit" log was printed *right after the download*, which is misleading. Changed to hf_hub::Cache::new(cache_dir).repo(repo).get(filename).is_some(); Cache::get is an fs lookup only and does not touch the network. The number of actual downloads is unchanged (1); only log accuracy improves.
- N2: new_succeeds_on_default_config / score_empty_hypothesis_returns_err ran create_dir_all on the real XDG directory (`~/.local/share/kebab/models/nli/...`) → test pollution. Added a tempdir_config() helper: it overrides storage.data_dir with a TempDir and leaves model_dir as `{data_dir}/models`, so the expand_path substitution is still exercised.

cargo test -p kebab-nli -j 1 → 6 passed / 0 failed (unit) + 5 ignored (integration, manual).
cargo clippy -p kebab-nli --all-targets -j 1 -- -D warnings clean.
inference.rs is unmodified → the manual --ignored smoke result (5/5 PASS) remains valid.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

claude-reviewer-01 approved these changes 2026-05-25 22:22:51 +00:00

claude-reviewer-01 left a comment

Round 2: confirmed that both N1 and N2 from round 1 are addressed.

N1 closure: fs-only probe via hf_hub::Cache::new(cache_dir).repo(repo).get(filename).is_some(). This avoids triggering the download, and the "cache hit/miss" log is accurate. The number of actual downloads is unchanged (1, idempotent); only log accuracy improves.
N2 closure: the tempdir_config() helper overrides storage.data_dir with a TempDir and keeps model_dir as {data_dir}/models (the expand_path substitution check is preserved). Both unit tests are isolated.

cargo test -p kebab-nli -j 1 → 6/6 unit pass + 5 ignored. cargo clippy clean. inference.rs is unmodified → the manual --ignored 5/5 PASS result remains valid.

No further actionable items. The PR-9b scope (real ONNX inference + model download in OnnxNliVerifier) is complete. It is a natural baseline for PR-9c-1 to add RefusalReason + wire schema + config knob.

OK to merge.


altair823 merged commit 79ad6e376f into main

2026-05-25 22:24:03 +00:00

altair823 deleted branch feat/fb-41-pr-9b-onnx-nli-inference

2026-05-25 22:24:04 +00:00

altair823 referenced this issue from a commit

2026-05-25 22:24:04 +00:00

Merge pull request 'feat(nli): fb-41 PR-9b — OnnxNliVerifier ONNX inference + model download' (#177) from feat/fb-41-pr-9b-onnx-nli-inference into main


Reference: altair823-org/kebab#177