docs(v0.17.0/A6): HANDOFF + HOTFIXES + README + SMOKE + SKILL - Korean trigram closure
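- HOTFIXES: new 2026-05-24 section covering the impact of the v0.17.0 closure (Korean lexical 3-gram, changed English substring behavior, BM25 distribution, disk usage, observed heading_path JSON noise). The Status / Next step fields of the existing 2026-05-22 Korean lexical entry now use closure wording.
- HANDOFF: added a 2026-05-24 entry to the post-merge deviations section and cross-linked the existing 2026-05-22 entry to the closure. The P10 backlog Korean tokenizer item is marked ✅ v0.17.0, and the status of the "next task candidates" follow-up line is updated.
- README: one line on the search command row covering trigram behavior, hint, and disk usage.
- SMOKE: new "Korean trigram search (v0.17.0)" section with the dogfooding query sequence (`충돌은` raw / `해시 충돌` multi-token / `Rust 충돌은` mixed / `충돌` 2-char + stderr / `--json` hint check), plus a note on the changed English substring behavior.
- SKILL.md: one line in the search section describing the hint field, so that on short queries the agent surfaces the hint to the user instead of retrying the same query.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>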
@@ -60,6 +60,7 @@ Input:
- Cite back to the user as `doc_path § heading_path[-1]` so they can open the source.
- When `truncated: true`, the budget loop modified the page (snippet shortening or k reduction). `next_cursor` is **independent** — non-null whenever more hits may be reachable. Caller may widen `max_tokens` (re-issue same query for fuller snippets / more hits per page) or follow `next_cursor` (advance through more hits) or both. Mismatched cursor (corpus_revision changed) returns `error.v1.code = stale_cursor` — re-issue the search to obtain a fresh one.
- **`trace: true` (p9-fb-37)** — debug aid. Response carries an extra `trace` block: `lexical[]` + `vector[]` (pre-fusion candidates), `rrf_inputs[]` (RRF union before final cut), and `timing` (`lexical_ms`, `vector_ms`, `fusion_ms`, `total_ms`). Trace bypasses the search cache (always cold). Use sparingly — it bloats the wire response and is for diagnosing "why did this hit / not hit", not normal retrieval.
- **`hint` (v0.17.0)** — optional advisory string on `search_response.v1`. Present only when the result is empty AND the trimmed query is shorter than the FTS5 trigram tokenizer's 3-char minimum. Surface it to the user instead of retrying the same short query. Korean lexical search benefits most from ≥3-char keywords (`충돌` zero-hit, `충돌은` substring-hit). Raw FTS5 mode (`'...'`) opts out — the user opted into FTS5 syntax. Vector / hybrid modes carry the field too but it's rarely triggered (semantic embeddings handle short queries).
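The `hint` condition described above can be sketched roughly as follows. This is a minimal illustration, not the project's implementation: the function name `short_query_hint`, the `raw_fts5` flag, and the hint wording are all hypothetical, and the sketch assumes query length is counted in Unicode code points (so each precomposed Hangul syllable counts as one character).

```python
from typing import Optional

# Minimum query length for the FTS5 trigram tokenizer, per the bullet above.
TRIGRAM_MIN_CHARS = 3


def short_query_hint(query: str, hit_count: int, raw_fts5: bool = False) -> Optional[str]:
    """Return an advisory string only for empty results on sub-trigram queries.

    Hypothetical helper: the name, signature, and message text are assumptions.
    """
    # Raw FTS5 mode opts out: the user chose FTS5 syntax deliberately.
    if raw_fts5:
        return None
    # The hint is only attached to empty result sets.
    if hit_count > 0:
        return None
    trimmed = query.strip()
    # len() counts code points, so "충돌" is 2 and "충돌은" is 3.
    if len(trimmed) < TRIGRAM_MIN_CHARS:
        return (
            f"query {trimmed!r} is shorter than {TRIGRAM_MIN_CHARS} characters; "
            "try a longer keyword"
        )
    return None
```

An agent consuming the response would check for a non-empty `hint` and relay it to the user rather than re-issuing the same short query.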
### `mcp__kebab__bulk_search`