Commit Graph

23 Commits

Author SHA1 Message Date
00ffe9c792 feat(rag): fb-41 PR-9c-2 — pipeline integration + mock test + SKILL.md (★ NLI 실 활성화)
PR-9c-1 의 wire surface 위에 behavior 활성화 — `ask_multi_hop` 의 step 8.5 hook 가 `[rag] nli_threshold > 0` 일 때 NLI 검증 실 수행. **첫 user-visible behavior change** in PR-9.

- crates/kebab-rag/src/pipeline.rs:
  - ask_multi_hop step 8.5 NLI hook (empty answer 가드 + truncate_for_nli + verifier.score + verification field + refusal 분기).
  - refuse_nli_verification helper (verification: Some(...) + RefusalReason::NliVerificationFailed).
  - refuse_nli_model_unavailable helper (verification: None + RefusalReason::NliModelUnavailable).
  - truncate_for_nli helper (module-level pub fn, MAX_NLI_PREMISE_CHARS = 4 * 400 = 1600 chars 의 chars-based budget, _hypothesis 미사용 placeholder — v0.18.1 token-budget 갱신 candidate).
  - PR-9c-1 의 #[allow(dead_code)] 두 곳 제거 (verifier field + with_verifier builder; doc 의 transitional sentence 도 정리). round-1 PR-9c-1 review N1 carry-forward closure.
- crates/kebab-app/src/app.rs:
  - App::open_with_config 의 NliVerifier construction — config.rag.nli_threshold > 0 → OnnxNliVerifier::new + Arc::new wrap + 후속 RagPipeline 초기화 시 with_verifier 호출. 실패 시 ? 전파 (시그니처 Result<Self> 그대로 — caller cascading 0).
  - kebab-app/Cargo.toml 에 kebab-nli path 의존 추가.
- crates/kebab-rag/tests/multi_hop.rs + tests/common/mod.rs:
  - MockNliVerifier (pass / fail / err 생성자 + score call_count instrumented).
  - multi_hop_nli_pass_keeps_grounded — entailment 0.9 → grounded=true, verification.nli_passed=true.
  - multi_hop_nli_fail_refuses — entailment 0.1 → refusal=NliVerificationFailed.
  - multi_hop_nli_disabled_skip_verify — threshold 0.0 → verify skip, verification=None.
  - multi_hop_nli_model_unavailable_refuses — verifier Err → refusal=NliModelUnavailable.
  - multi_hop_truncate_for_nli_preserves_hypothesis — long premise truncation + hypothesis 보전.
- integrations/claude-code/kebab/SKILL.md: mcp__kebab__ask 절에 NLI 안내 한 단락 (verification.nli_passed 의미 + threshold tuning + nli_verification_failed/nli_model_unavailable refusal handling).

검증: cargo test --workspace -j 1 — 5 신규 multi-hop pass + 회귀 0 (pre-existing kebab-mcp::tools_call_ask_multi_hop 동일 flaky). cargo clippy --workspace --all-targets -j 1 -- -D warnings clean.
Wire 영향: PR-9c-1 의 schema 변경에 *behavior wiring* — answer.v1.verification field 가 multi-hop happy path + refuse path 양쪽에서 채움.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-26 00:55:02 +00:00
8a2f7affa6 feat(mcp): fb-41 PR-5 — MCP ask multi_hop arg + SKILL.md 안내
fb-41 multi-hop RAG 의 **PR-5** (PR-4 머지 직후). PR-4 의 CLI `--multi-hop`
flag 와 sister surface — agent (Claude Code 등 MCP host) 가 `mcp__kebab__ask`
호출 시 `multi_hop: true` 옵션 사용 가능.

설계: docs/superpowers/specs/2026-05-25-p9-fb-41-multi-hop-rag-design.md
계획: docs/superpowers/plans/2026-05-25-p9-fb-41-multi-hop-rag.md (PR-5 단락)

## MCP surface

- `crates/kebab-mcp/src/tools/ask.rs`:
  - `AskInput.multi_hop: Option<bool>` 추가. JsonSchema derive 가 tools/list
    에 자동 반영 — agent capability discovery 가 새 필드 인식.
  - `handle()` 가 `AskOpts.multi_hop = input.multi_hop.unwrap_or(false)` —
    기존 caller (필드 누락 / null) 는 single-pass 그대로.
- `crates/kebab-mcp/src/lib.rs` (tools/list):
  - `ask` tool description 에 multi-hop 한 줄 (decompose → retrieve →
    synthesize, 2-5× LLM cost, per-hop trace on Answer.hops).

## SKILL.md 안내

- `integrations/claude-code/kebab/SKILL.md` 의 `mcp__kebab__ask` 절:
  - Input shape JSON 예제에 `multi_hop: false` 추가.
  - Returns 절에 `hops` (multi-hop only) 추가.
  - 신규 bullet (p9-fb-41) — opt-in 조건 / 비용 trade-off / 사용 케이스
    (compound questions / prereq chains / cross-doc reasoning) /
    `Answer.hops` 의 per-hop trace shape / `multi_hop_decompose_failed`
    refusal 처리.

## Tests (`tests/tools_call_ask_multi_hop.rs` 신규, 2 Ollama-free pins)

- `ask_tool_routes_multi_hop_true_to_decompose_first`: dispatch
  divergence 핀. invalid LLM endpoint (`http://127.0.0.1:1`,
  request_timeout_secs=2) 로 force unreachable. multi_hop=true 는
  decompose 먼저 호출 → `error.v1` (code=model_unreachable) /
  isError=true. multi_hop=false (single-pass) 는 empty KB 에서 retrieve
  먼저 → no LLM call → `answer.v1` grounded=false / isError=false. 두
  shape 의 분기가 dispatch 가 실제로 다른 path 로 라우팅됨의 증거.
- `ask_input_schema_advertises_multi_hop_field`: AskInput 의 JsonSchema
  가 `multi_hop` property 노출 — MCP host capability discovery
  (tools/list 의 input schema) 회귀 핀.

기존 `tools_call_ask.rs` 의 AskInput literal 도 `multi_hop: None`
추가 (struct field 추가에 따른 minimal cascade).

## 변경 없음

- `prompt_template_version` (`rag-multi-hop-v1`) — 그대로.
- TUI surface — PR-6 의 책임.
- error.v1 매핑 — PR-4 의 enum reservation 그대로 (no error_wire
  promotion).

## 검증

- `cargo test -p kebab-mcp -j 1` — 신규 tools_call_ask_multi_hop 2 +
  기존 ask / search / bulk_search / fetch / ingest / schema / doctor /
  tools_list / initialize 등 모두 통과 (회귀 없음).
- `cargo clippy -p kebab-mcp --all-targets -j 1 -- -D warnings` clean.
- 단일 crate 직렬 build (16 GB RAM 제약).

## 다음 PR

- PR-6: TUI Ask 패널 multi-hop toggle (F2 / Ctrl-T) + hop trace render +
  cheatsheet 갱신.
- v0.18.0 cut (PR-6 머지 후).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-25 09:06:28 +00:00
3114c31841 chore(search): PR #165 회차 1 리뷰 반영
- HOTFIXES test 카운트 표기 정정: `9 신규 / 갱신 unit test` 의 산수
  ambiguity → `9 unit test (8 갱신 + 1 신규) + 2 신규 통합 test = 11
  total` 로 명시.
- SKILL.md (Claude Code integration) 의 search 절에 column scoping +
  heading_path raw-mode escape hatch 안내 한 bullet 추가. 회차 1
  의 follow-up suggestion 반영 — heading 검색 의도 agent 가 새
  escape hatch `'heading_path : <token>'` 를 발견 가능.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-25 05:44:21 +00:00
8a68289499 docs(v0.17.0/A6): HANDOFF + HOTFIXES + README + SMOKE + SKILL — 한국어 trigram closure
- HOTFIXES: 새 2026-05-24 절 — v0.17.0 closure 영향 (한국어
  lexical 3-gram, 영어 substring 변경, BM25 분포, 디스크 용량,
  heading_path JSON 노이즈 관찰). 기존 2026-05-22 한국어 lexical
  항목의 Status / Next step 을 closure 표현으로 갱신.
- HANDOFF: 머지 후 발견 deviation 절에 2026-05-24 entry +
  기존 2026-05-22 항목을 closure cross-link 로 정리. P10
  백로그 한국어 tokenizer 항목  v0.17.0 + "다음 task 후보"
  follow-up 라인의 상태 갱신.
- README: 검색 명령 행에 trigram 동작 + hint + 디스크 용량 한 줄.
- SMOKE: 새 "한국어 trigram 검색 (v0.17.0)" 절 — 도그푸딩 query
  시퀀스 (충돌은 raw / 해시 충돌 multi-token / Rust 충돌은
  mixed / 충돌 2자 + stderr / --json hint 검증) + 영어 substring
  동작 변경 안내.
- SKILL.md: search 절에 hint 필드 안내 한 줄 — agent 가
  short query 케이스에서 같은 query 재시도 대신 사용자에게
  surface 하도록.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 11:54:44 +00:00
th-kim0823
441f1192ee docs(fb-42): wire schema + README + SMOKE + design + SKILL + INDEX
- Add bulk_search_item.v1 + bulk_search_response.v1 wire schemas
- Register both in WIRE_SCHEMAS const
- README: --bulk flag mention + MCP tool list 7→8 (bulk_search)
- SMOKE: bulk multi-query walkthrough (CLI + MCP equivalent)
- Design §2.2: Bulk multi-query (fb-42) subsection (additive minor)
- SKILL: mcp__kebab__bulk_search section + tool table row
- Task spec status open→completed, banner replaced
- INDEX: fb-42 row 머지 (rerank hint deferred)
- Fix: missed Capabilities {bulk_search} in cli wire.rs test (Task 7 leftover)
- Fix: missed tools.len() 7→8 in cli_mcp_smoke (Task 5 leftover)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 21:07:36 +09:00
th-kim0823
600c6182fc docs(fb-40): rag-v2 prompt + README + design + SKILL + INDEX
- README: [rag] prompt_template_version default rag-v2 + V2 강화 3 규칙
- design §7: rag-v2 본문 + V1 legacy note
- SKILL.md: mcp__kebab__ask 응답 행태 변화 안내
- task spec: status open → completed, design + plan 링크
- INDEX: fb-40  머지 (2026-05-10)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 19:37:28 +09:00
th-kim0823
c864bd007f docs(fb-38): wire schema + README + design + SKILL + INDEX 2026-05-10 18:21:55 +09:00
th-kim0823
a40593590b docs(fb-37): wire schema + README + SMOKE + INDEX + SKILL 2026-05-10 14:13:47 +09:00
th-kim0823
6e7446861b docs(fb-36): README + SMOKE + INDEX + skill notes
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 04:26:27 +09:00
th-kim0823
2a6b3dc7e6 docs(fb-35): README + SMOKE + INDEX + skill notes
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 00:21:35 +09:00
th-kim0823
e084b306e5 fix(fb-34): align next_cursor semantics with docs (PR #125 round 2)
Previous round-1 fix dropped the speculative cursor branch on
the truncated path, leaving a contradiction with the docs:
- snippet-only shrunk → cursor emitted (returned == k_effective)
- k-popped → cursor null (returned < k_effective)
But docs promised the opposite.

R2 resolution: emit cursor whenever more hits may be reachable
(either retriever filled the page OR budget popped hits — the
popped ones remain fetchable from offset+returned). Drop the
artificial "widen vs paginate" copy; truncated and next_cursor
are now independent signals — caller may do either or both.

Updates: app.rs::search_with_opts logic + SearchResponse doc +
schema description + SKILL.md two bullets + max_tokens=0 test
asserts cursor IS emitted on k-pop case.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 21:07:04 +09:00
th-kim0823
f485608108 fix(fb-34): address PR #125 round 1 review
- error_wire: StructuredError wrapper preserves ErrorV1 through
  anyhow → classify pipeline. Adds downcast short-circuit so
  cursor::decode's typed code = "stale_cursor" reaches the wire
  instead of being string-formatted to code = "generic".
- app: search_with_opts now wraps cursor::decode error in
  StructuredError instead of anyhow! string format.
- test: error_wire pins both negative (bare anyhow → not
  stale_cursor) AND positive (StructuredError → stale_cursor)
  invariants. CLI integration test runs end-to-end and asserts
  error.v1.code on stderr.
- app: next_cursor only emitted on full-page (k-pop) path; drop
  speculative emit on snippet-only truncation that would point at
  a different page than the agent expected.
- cursor: differentiate malformed-base64 / malformed-payload /
  revision-mismatch error messages; all keep code = stale_cursor.
- test: cursor_rejected fixture uses .expect() to fail loud on
  cursor non-emission instead of silent skip.
- test: max_tokens=0 → 1-hit floor + truncated=true.
- docs: SKILL.md + schema description distinguish snippet-shrink
  (widen) vs k-pop (paginate) truncated cases. HOTFIXES notes
  --no-cache semantic shift (cached path + clear vs uncached path).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 20:49:27 +09:00
th-kim0823
9f076003e2 docs(fb-34): README + SMOKE + INDEX + HOTFIXES + skill notes
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 20:20:58 +09:00
th-kim0823
e1c6b7055a docs(fb-33): README + SMOKE + INDEX + skill notes
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 15:22:09 +09:00
th-kim0823
1008bca342 docs(fb-32): README + SMOKE + INDEX + skill parsing tip
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 02:57:14 +09:00
th-kim0823
b20c1dd56a docs(skill): MCP-first guidance + CLI as fallback
Repo-shipped skill previously framed CLI as primary surface with MCP
as a separate "recommended" alternative. Recipes still used CLI calls
even though MCP server has been the recommended surface since v0.3.1.

Now: MCP tool catalog (6 tools as `mcp__kebab__<name>`) leads, recipes
call MCP tools directly, CLI documented as fallback for hosts without
MCP. Generic wording preserved (no user-specific trigger keywords).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-07 23:45:25 +09:00
th-kim0823
47bfd518c8 📝 docs: comprehensive MCP usage guide (fb-31)
신규 docs/mcp-usage.md (~280 line) — agent integration 의 종합 가이드:

- Quick start + `--config` thread 예시
- Host config 예시 (Claude Code / Cursor / OpenAI Agents / Copilot CLI)
- 6 tool catalog (search / ask / schema / doctor / ingest_file / ingest_stdin)
  각 tool 의 input shape, defaults, output 예시, "언제 사용", mutation
  주의사항.
- Troubleshooting — error.v1 의 7 code 별 조치 표 + grounded:false +
  doctor !ok + empty search + tool-not-found 시나리오.
- Multi-turn ask + session 관리 — session_id 명명, 새 session 시작
  시점, lifetime, single-shot vs session 비교.
- Performance / Security 절.

README.md 의 기존 MCP 절은 quick start 만 유지하고 docs/mcp-usage.md
링크. integrations/claude-code/kebab/SKILL.md 도 동일 cross-link.

agent 사용자 도그푸딩 후속 의견 — host-agnostic 가이드 + 명시적
troubleshooting 표 + multi-turn session 명명 컨벤션 부재 해소.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 18:53:59 +09:00
th-kim0823
dc24cb34b1 🚑 fix(fb-31): apply final review nits
- kebab-app: add #[doc(hidden)] to ingest_stdin_with_config (CLAUDE.md
  convention — all *_with_config functions should have this attribute;
  fb-31's first impl missed it on the second facade fn).
- SKILL.md: "Since v0.4.0" → "Since v0.3.1" (MCP shipped in fb-30
  release v0.3.1; the wrong version claim was introduced in fb-30 doc
  sync and carried forward into fb-31).
- tools_call_ingest_file: add idempotency test (second call with same
  content → unchanged=1, new=0). Spec called for two tests; first impl
  shipped only the happy path.

Version bump 0.3.1 → 0.3.2 deferred to separate `chore/bump-v0.3.2` PR
mirroring fb-27 + fb-30 precedent (commits 73f5d73 / 5495d96).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 18:38:19 +09:00
th-kim0823
345a4f363a 📝 docs: sync README / HANDOFF / CLAUDE / skill / design for fb-31
- README 명령 표 에 `kebab ingest-file` + `kebab ingest-stdin` 두 row + MCP tool list 4 → 6.
- HANDOFF post-도그푸딩 항목 한 줄.
- CLAUDE.md `_external/` 디렉토리 + naming convention 한 줄.
- integrations skill — Recipe D (agent fetched web doc) + MCP tool list 갱신.
- design §6.7 `_external/` subdirectory 절 신설.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 18:16:45 +09:00
th-kim0823
f758d51a01 📝 docs: sync README / HANDOFF / CLAUDE / skill / design for fb-30
- README 명령 표 에 `kebab mcp` 추가 + Claude Code MCP config 예시
- HANDOFF post-도그푸딩 항목 한 줄 (rmcp 1.6 + manual dispatch + error_wire promotion + ask/search spawn_blocking + capability flag flip 명시)
- CLAUDE.md facade 룰 의 UI crate 카테고리 에 `kebab-mcp` 추가
- integrations skill — MCP 사용 안내 (recommended over subprocess)
- design §10.2 MCP transport 절 신설

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 16:15:19 +09:00
th-kim0823
f7d6ea593e 📝 docs: fix capability flag names in skill + document not_indexed details deviation (fb-27)
- SKILL.md: `streaming_ingest` → `streaming_ask`, `multi_turn` → `rag_multi_turn` (capability name mismatch flagged in final review — agents following the example literal would read non-existent fields).
- HOTFIXES.md: add `not_indexed.details` to the interim wire shape deviations list — emit `{ expected, found }` only (spec literal `{ data_dir, expected, found }` not honored because NotIndexed signal carries one full path, not separate data_dir).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 12:58:17 +09:00
th-kim0823
bb7b1cec4b 📝 docs: sync README / HANDOFF / CLAUDE / skill / design for fb-27
- README 명령 표 에 `kebab schema` 추가
- HANDOFF post-도그푸딩 항목 한 줄
- CLAUDE.md wire schema 절 schema.v1 / error.v1 추가
- integrations skill — schema 활용 안내 (additive)
- design §10.1 capability matrix subsection 신설

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 12:37:40 +09:00
th-kim0823
e31871d03e feat(integrations): ship Claude Code skill for kebab
Add `integrations/claude-code/kebab/` — a Claude Code skill that calls
`kebab search --json` / `kebab ask --json` automatically when the user
asks an internal-context / runbook / wiki-lookup question. Install via
`cp -r integrations/claude-code/kebab ~/.claude/skills/` (or symlink for
in-place updates).

The shipped SKILL.md keeps its frontmatter `description` generic so the
trigger fires on any "internal / org-specific" cue — per-user keywords
(team / system / acronym) belong in a local override, not in the
repo-shipped frontmatter.

CLAUDE.md §Wire schema v1: add a paragraph documenting the
`integrations/<host>/` layout and the rule that a wire schema major bump
(v1→v2) updates each shipped integration in the same PR — same shape as
the version-cascade rule.

README.md §외부 AI 통합: replace the "~50줄 wrapper" prose with a direct
link to `integrations/claude-code/`, since the wrapper is now in-tree.
2026-05-06 18:00:53 +09:00