v0.18.0 Stable
Released 2026-05-26 05:37:47 +00:00
v0.18.0: fb-41 multi-hop RAG ship + NLI verification
Post-PR-8 dogfooding surfaced the S7 caffeine hallucination, whose root cause is the LLM-self-judge ceiling. This release closes it with a deterministic external verifier (mDeBERTa-v3 XNLI, ONNX), consistent with the standard approach in the research literature (Self-RAG, CRAG, Auto-GDA, MedTrust-RAG).
New surface
- CLI: `kebab ask --multi-hop <query>`: multi-hop reasoning (decompose → decide → synthesize loop).
- MCP: the `ask` tool's `multi_hop: true` argument.
- TUI: `F2` toggle in the Ask panel + multi-hop badge + inline hops summary.
New wire (additive minor; pre-v0.18 readers are unaffected)
- `answer.v1.hops`: multi-hop per-iteration trace (`{kind, iter, sub_queries, context_chunks_added, llm_call_ms, forced_stop}` per element).
- `answer.v1.verification`: NLI groundedness summary (`{nli_score, nli_threshold, nli_passed}`, present only when `cfg.rag.nli_threshold > 0`).
- `error.v1.code` enum extended with `multi_hop_decompose_failed`, `nli_verification_failed`, `nli_model_unavailable`.
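As a rough illustration of the `answer.v1.verification` shape, the sketch below models its three fields as a plain Rust struct and builds the summary only when the threshold is positive. The names `Verification` and `summarize` are hypothetical and are not taken from the kebab codebase; the pass rule (`score >= threshold`) is an assumption inferred from the refusal condition "entailment < `nli_threshold`".

```rust
/// Hypothetical mirror of the `answer.v1.verification` object.
#[derive(Debug, Clone, PartialEq)]
pub struct Verification {
    pub nli_score: f32,
    pub nli_threshold: f32,
    pub nli_passed: bool,
}

/// Builds the verification summary, or returns `None` when NLI is
/// disabled (threshold <= 0), so the field is absent from the answer.
pub fn summarize(nli_score: f32, nli_threshold: f32) -> Option<Verification> {
    if nli_threshold <= 0.0 {
        return None;
    }
    Some(Verification {
        nli_score,
        nli_threshold,
        // Assumption: a score exactly equal to the threshold passes.
        nli_passed: nli_score >= nli_threshold,
    })
}
```

With the S7 dogfood score of 0.0035 and a threshold of 0.5, `summarize` yields a summary whose `nli_passed` is `false`; with the default threshold of 0.0 it yields `None`.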
New config
- `[rag] nli_threshold` (default `0.0` = disabled; `0.5` recommended for production).
- `[models.nli] model` (default `Xenova/mDeBERTa-v3-base-xnli-multilingual-nli-2mil7`).
- `[models.nli] provider` (default `onnx`, the only implementation in v0.18; candle/remote are candidates for v0.19+).
- env override: `KEBAB_RAG_NLI_THRESHOLD`, `KEBAB_MODELS_NLI_MODEL`, `KEBAB_MODELS_NLI_PROVIDER`.
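The following is a minimal sketch of how the `KEBAB_RAG_NLI_THRESHOLD` override could be resolved against the config file value. It assumes the environment variable takes precedence over the file and that an unparsable value falls back to the file value; neither behavior is confirmed by these notes, and the function names are hypothetical.

```rust
use std::env;

/// Picks the effective threshold: env override first (assumed precedence),
/// then the config file value, then the documented default of 0.0 (disabled).
pub fn resolve_nli_threshold(file_value: Option<f32>, env_value: Option<&str>) -> f32 {
    env_value
        // Assumption: an override that does not parse as a float is ignored.
        .and_then(|raw| raw.trim().parse::<f32>().ok())
        .or(file_value)
        .unwrap_or(0.0)
}

/// Reads the real process environment and delegates to the pure resolver.
pub fn nli_threshold_from_env(file_value: Option<f32>) -> f32 {
    let raw = env::var("KEBAB_RAG_NLI_THRESHOLD").ok();
    resolve_nli_threshold(file_value, raw.as_deref())
}
```

Keeping the resolver free of direct environment access makes the precedence rule easy to unit-test without mutating process state.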
New RefusalReason
- `multi_hop_decompose_failed` (the LLM decompose JSON failed to parse).
- `nli_verification_failed` (entailment < `nli_threshold`).
- `nli_model_unavailable` (download or inference failure; fail-closed. User workaround: `[rag] nli_threshold = 0`).
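The refusal semantics above can be sketched as a small gate function. This is an illustrative sketch, not kebab's implementation: the variant names, the `code` method, and `nli_gate` are hypothetical, and only the three wire codes, the `<` comparison, and the fail-closed behavior come from these notes.

```rust
/// Hypothetical variant names; only the wire codes are from the release notes.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RefusalReason {
    MultiHopDecomposeFailed,
    NliVerificationFailed,
    NliModelUnavailable,
}

impl RefusalReason {
    /// Maps each reason to its `error.v1.code` string.
    pub fn code(self) -> &'static str {
        match self {
            RefusalReason::MultiHopDecomposeFailed => "multi_hop_decompose_failed",
            RefusalReason::NliVerificationFailed => "nli_verification_failed",
            RefusalReason::NliModelUnavailable => "nli_model_unavailable",
        }
    }
}

/// `score` is `Err` when the verifier could not run (download or inference failure).
pub fn nli_gate(score: Result<f32, String>, threshold: f32) -> Result<(), RefusalReason> {
    if threshold <= 0.0 {
        return Ok(()); // NLI disabled: the gate never refuses.
    }
    match score {
        // Fail-closed: an unavailable verifier refuses rather than passing silently.
        Err(_) => Err(RefusalReason::NliModelUnavailable),
        Ok(s) if s < threshold => Err(RefusalReason::NliVerificationFailed),
        Ok(_) => Ok(()),
    }
}
```

Setting the threshold to 0 short-circuits the gate before the verifier result is inspected, which is why it works as the workaround for `nli_model_unavailable`.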
Recommended environment
- LLM: `gemma3:4b` (CPU only, 16 GB RAM recommended).
- With NLI enabled: ~280 MB first-run download to `{data_dir}/models/nli/Xenova_mDeBERTa-v3-base-xnli-multilingual-nli-2mil7/`.
- RAM peak (NLI + Ollama running together, with `gemma3:4b`): ~7-8 GB (measured during dogfooding; the ONNX session adds ~600 MB). Safe on a 16 GB machine.
- 8B+ Q4 models (`gemma4:e4b` 8B / `gemma2:9b`, etc.): estimated peak ~10 GB, which is borderline on 16 GB and carries an OOM risk.
Known limitations (v0.18.1+ priority)
- Single-pass `ask` does not apply NLI (LlmSelfJudge is retained). Only multi-hop has the step 8.5 gate.
- Atomic claim splitting is not applied: the entire answer goes through 1 NLI call (the average entailment is used for paraphrased / multi-claim answers).
- GPU acceleration is not supported (CPU ONNX runtime).
- S3-class queries (input-dependent `nli_model_unavailable`) fail consistently; v0.18.1 follow-up (see the S3 subsection of the fb-41 PR-9 closure entry in HOTFIXES).
Dogfooding
`docs/dogfood/v0.18.0/SUMMARY.md` + 4 sample wire JSON files (S7 / S1 / S3 / S10 multi-hop ask results). Key result: the S7 caffeine hallucination root cause is confirmed resolved. PR-8 baseline: `grounded=true + Adam gradient hallucination (silent)` → PR-9: `refusal_reason=nli_verification_failed, nli_score=0.0035 (graceful)`.
Sub-PR sequence
- PR #176 PR-9a: `kebab-nli` crate skeleton + workspace deps (ort 2.0-rc.9 / tokenizers 0.21 onig / hf-hub 0.4 / ndarray 0.16).
- PR #177 PR-9b: `OnnxNliVerifier` ONNX inference + lazy hf-hub download.
- PR #178 PR-9c-1: core types + wire scaffolding (`RefusalReason::Nli*` + `Answer.verification` + `RagPipeline.verifier` + config knobs).
- PR #179 PR-9c-2: pipeline integration ★ NLI actually enabled (step 8.5 hook + `App::open_with_config` NLI verifier construction + 5 mock multi-hop tests + SKILL.md).
- PR #180 PR-9d: dogfood retest + HOTFIXES closure + `docs/dogfood/v0.18.0/` preserved.
- PR #181 cleanup: workspace-wide `clippy::pedantic` baseline + post-PR9 refactor (H1 config wiring + 9 new tests).
- PR #182 cut: version bump + cascading docs.
Cascade rule
After the v0.18.0 release there are no breaking changes to the `kebab.sqlite` schema or the wire schema (additive minor only). No user action is required; existing ingests and configs all remain compatible.
Downloads
- CLI: