diff --git a/integrations/claude-code/kebab/SKILL.md b/integrations/claude-code/kebab/SKILL.md index edcd2aa..2d47638 100644 --- a/integrations/claude-code/kebab/SKILL.md +++ b/integrations/claude-code/kebab/SKILL.md @@ -61,6 +61,7 @@ Input: - When `truncated: true`, the budget loop modified the page (snippet shortening or k reduction). `next_cursor` is **independent** — non-null whenever more hits may be reachable. Caller may widen `max_tokens` (re-issue same query for fuller snippets / more hits per page) or follow `next_cursor` (advance through more hits) or both. Mismatched cursor (corpus_revision changed) returns `error.v1.code = stale_cursor` — re-issue the search to obtain a fresh one. - **`trace: true` (p9-fb-37)** — debug aid. Response carries an extra `trace` block: `lexical[]` + `vector[]` (pre-fusion candidates), `rrf_inputs[]` (RRF union before final cut), and `timing` (`lexical_ms`, `vector_ms`, `fusion_ms`, `total_ms`). Trace bypasses the search cache (always cold). Use sparingly — it bloats the wire response and is for diagnosing "why did this hit / not hit", not normal retrieval. - **`hint` (v0.17.0)** — optional advisory string on `search_response.v1`. Present only when the result is empty AND the trimmed query is shorter than the FTS5 trigram tokenizer's 3-char minimum. Surface it to the user instead of retrying the same short query. Korean lexical search benefits most from ≥3-char keywords (`충돌` zero-hit, `충돌은` substring-hit). Raw FTS5 mode (`'...'`) opts out — the user opted into FTS5 syntax. Vector / hybrid modes carry the field too but it's rarely triggered (semantic embeddings handle short queries). +- **Column scoping (post-v0.17.1 dogfood)** — default lexical / hybrid matching is scoped to the `text` column only. The `heading_path` column is indexed (path segments like `app`, `src` plus JSON punctuation are 3-gram'd) but excluded from the default MATCH — past JSON noise produced false positives where a query coincidentally shared a 3-gram with a file's heading path. To deliberately search heading paths, escape into raw FTS5 mode with an explicit column filter: `'heading_path : '` (e.g. `kebab search "'heading_path : agent'"`). Same applies to MCP `search` — quote the inner expression. Raw mode bypasses both column scoping and the 3-char `hint` short-circuit. ### `mcp__kebab__bulk_search` diff --git a/tasks/HOTFIXES.md b/tasks/HOTFIXES.md index 63498bd..2d24634 100644 --- a/tasks/HOTFIXES.md +++ b/tasks/HOTFIXES.md @@ -56,11 +56,12 @@ v0.17.0 의 한국어 trigram tokenizer 채택 entry (2026-05-24 위) 가 미수 **변경**: - `crates/kebab-search/src/lexical.rs::build_match_string` 가 non-raw 분기에서 combined expression 을 `text : ()` 로 wrap. FTS5 column filter syntax (`column:expr`) 가 OR/AND sub-expression 허용 — 한국어 trigram 빌더의 `(whole) OR (token_and)` 형태가 그대로 들어감. - Raw mode (`'...'`) 는 변경 없음 — 사용자가 명시 의도로 `'heading_path : agent'` 같은 explicit column filter opt-in 가능 (escape hatch). -- 9 신규 / 갱신 unit test: +- 9 unit test (8 갱신 + 1 신규) + 2 신규 통합 test (`crates/kebab-search/tests/lexical.rs`) = 11 total: - `build_match_string_*` 8 expected string 갱신 (column filter prefix 추가) - - `build_match_string_raw_mode_preserves_heading_filter` 신규 — raw mode 가 `heading_path : ...` 보존 - - `lexical_heading_only_token_does_not_hit_default_mode` 신규 (`crates/kebab-search/tests/lexical.rs`) — heading-only unique token 이 default mode 에서 0 hit - - `lexical_raw_mode_can_opt_into_heading_path_filter` 신규 — 같은 fixture 가 raw mode 로 hit 확인 + - `build_match_string_raw_mode_preserves_heading_filter` 신규 unit — raw mode 가 `heading_path : ...` 보존 + - `lexical_heading_only_token_does_not_hit_default_mode` 신규 통합 — heading-only unique token 이 default mode 에서 0 hit + - `lexical_raw_mode_can_opt_into_heading_path_filter` 신규 통합 — 같은 fixture 가 raw mode 로 hit 확인 +- `integrations/claude-code/kebab/SKILL.md` 의 search 절에 column scoping + heading_path raw-mode escape hatch 안내 한 bullet 추가 (회차 1 follow-up suggestion 반영, 본 PR 에 포함). **사용자 영향**: - 기본 lexical / hybrid 검색에서 heading 만 매칭되던 false positive 차단. 한국어 / 영어 substring 매칭의 recall 은 그대로 (text 본문에 있는 token 은 변함없이 hit). 본문 검색의 precision 가 올라감.