8dcedc4b11
feat(p10-r2): V007 trigram migration + design §5.5 + fts diff-check
...
Tasks A2 + A3, bundled together.
migrations/V007__fts_trigram.sql (new):
- DROP and recreate the chunks_fts shadow table (tokenize = trigram).
- Recreate the chunks_ai/ad/au triggers (identical to V002).
- Backfill INSERT from chunks: no user re-ingest needed, V007 does it automatically.
- V002 is kept as-is for historical cold-upgrade replay.
design §5.5 update:
- Replace only the tokenize value in the verbatim block with trigram.
- Add one prose paragraph at the top of §5.5 covering the rationale for
Korean support and the trade-offs (changed English lexical behavior,
BM25 distribution, ~2-10x disk usage, not contentless).
crates/kebab-store-sqlite/tests/fts.rs:
- Rename fts_v002_matches_design_section_5_5_verbatim →
fts_v007_matches_design_section_5_5_verbatim.
- Change the include_str! path in extract_migration_5_5_verbatim_block() to
V007__fts_trigram.sql. Comments and assertion messages now say V007.
- The V002 cold-upgrade tests (fts_v002_backfill_*) are kept as-is.
Verification: cargo test -p kebab-store-sqlite --test fts → 10/10 PASS
(including `fts_v007_matches_design_section_5_5_verbatim`).
Incorporates the Codex round 1/2 findings: the design §5.5 contentless
correction and an explicit rationale for adopting the trigram tokenizer.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com >
2026-05-23 00:52:40 +00:00
8781c6112b
task(A1): builder baseline + sqlite version + snapshot locations
...
Task A1 steps 1-3 done. Fills in the baseline note slot of plan A5.
Key findings:
- build_match_string() (lexical.rs:177-200): trim → strip_single_quotes
yields raw FTS verbatim / otherwise whitespace split + escape_fts5_token
(\"...\" + inner doubling) + space join (implicit AND).
- raw mode = a single-quote pair '...' wrapping the entire trimmed input
(lexical.rs:167).
- SQLite: rusqlite 0.32 + libsqlite3-sys 0.30.1 bundled (in-tree, SQLite
~3.46.x) → trigram is available.
- Snapshot: tests/lexical.rs::lexical_snapshot_run_1 + tests/hybrid.rs::
hybrid_snapshot_run_1 (regenerate with KEBAB_UPDATE_SNAPSHOTS=1).
The inline normalize_bm25_top_score does not depend on the numeric values.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com >
2026-05-23 00:47:24 +00:00
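The match-string flow recorded in the A1 baseline notes can be sketched roughly as below. This is a minimal reconstruction from the commit message only, not the code in lexical.rs: the function names come from the notes, but their signatures and the handling of edge cases (for example a lone `'`) are assumptions.

```rust
/// Wraps one token in double quotes and doubles any inner double quote,
/// which is how an FTS5 string literal escapes `"`.
fn escape_fts5_token(token: &str) -> String {
    format!("\"{}\"", token.replace('"', "\"\""))
}

/// Returns the inner text when a single-quote pair wraps the whole trimmed
/// input (raw mode). The length guard for a lone `'` is an assumption.
fn strip_single_quotes(trimmed: &str) -> Option<&str> {
    if trimmed.len() >= 2 {
        trimmed.strip_prefix('\'')?.strip_suffix('\'')
    } else {
        None
    }
}

/// trim, then either pass raw FTS text through verbatim or escape each
/// whitespace-separated token and join with spaces (implicit AND in FTS5).
fn build_match_string(query: &str) -> String {
    let trimmed = query.trim();
    if let Some(raw) = strip_single_quotes(trimmed) {
        return raw.to_string();
    }
    trimmed
        .split_whitespace()
        .map(escape_fts5_token)
        .collect::<Vec<_>>()
        .join(" ")
}
```

Under these assumptions, `build_match_string("foo bar")` yields two quoted tokens joined by a space, while `build_match_string("'foo OR bar'")` hands `foo OR bar` to FTS5 untouched.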
584247f1ea
spec+plan(v0.17.0): korean trigram tokenizer + dogfood fixes
...
Follow-up to P10 dogfooding round 2 (2026-05-22). Switches the SQLite FTS5
tokenizer from unicode61 to trigram to support Korean lexical search, plus
2 small bug fixes (C typedef-wrapped structs not surfaced,
code_lang_breakdown aggregation unit).
Incorporates Codex + Gemini review rounds 1/2/3:
- [r1] 2-character Korean queries return 0 hits, build_match_string() breaks
on multi-token input, contentless → shadow, parser_version cascade,
BM25/heading_path/disk
- [r2] same-workspace_path orphan purge (so the parser bump cascade actually
takes effect), trigram test examples verified on sqlite 3.45.1, builder
recommendation (whole phrase OR)
- [r3] SMOKE scenario corrections, TUI stale hint prevention,
search_response.v1 hint field, new purge helpers, unified single-quote raw
mode, fixture introduced
PR layout: PR-A (trigram + builder + user guidance), PR-B (C typedef + orphan
purge), PR-C (stats + wire). Cut the v0.17.0 release after all three merge.
design: docs/superpowers/specs/2026-05-22-korean-trigram-tokenizer-design.md
plan: docs/superpowers/plans/2026-05-22-korean-trigram-tokenizer.md
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com >
2026-05-23 00:43:31 +00:00
438870ee25
docs(p10-1d): activate C + C++ in frozen design §10
...
P10 Tier 1 chunker family complete (Rust + Python + TS + JS + Go + Java +
Kotlin + C + C++).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com >
2026-05-21 14:32:26 +00:00
a58d400abd
docs(p10-1d): implementation plan (11 tasks A-K, subagent-driven)
...
Tasks: workspace deps / C extractor / C++ extractor / C chunker + snapshot /
C++ chunker + snapshot / ingest dispatch + tier3_fallback_cv extension /
2 smoke tests / frozen design §10 / docs sync / workspace test gate /
version bump 0.15.0 → 0.16.0 + gitea PR.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com >
2026-05-21 13:15:22 +00:00
6bc7a83d3c
docs(p10-3): activate Tier 3 in frozen design §10.1
...
Add p10-3 activation log entry for Tier 3 paragraph fallback chunker
(code-text-paragraph-v1) with shell direct routing and fallback wrapper
for invalid YAML / AST failures.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com >
2026-05-21 11:39:49 +00:00
a8aa03042f
docs(p10-3): implementation plan (9 tasks A-I, subagent-driven)
...
Tasks: tier2_shared visibility upgrade / Tier 3 chunker + 4 unit tests /
shell direct routing / Tier 1/2 fallback wrapper / 2 smoke tests / frozen
design §10.1+§10 / docs sync (6 files) / workspace test gate / version
bump 0.14.0→0.15.0 + gitea PR.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com >
2026-05-21 11:16:55 +00:00
522ae7b8bc
docs(p10-2): activate Tier 2 in code-ingest design §10.1 + §3.5 mappings
...
§3.5: add code_lang_for_path mappings xml / groovy / go-mod.
§10.1: add activation log entry for p10-2 (3 Tier 2 chunkers active).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com >
2026-05-20 13:24:16 +00:00
5ce7f60932
docs(p10-2): implementation plan (11 tasks A-K, subagent-driven)
...
Branch feat/p10-2-tier2-resource. Tasks: serde_yaml dep / lang.rs basenames /
media.rs source-of-truth consolidation / 3 chunkers (k8s + dockerfile +
manifest) + tier2_shared helper / ingest dispatch / smoke tests / frozen
design §3.5+§10.1 / docs sync / version bump 0.13.0→0.14.0.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com >
2026-05-20 12:55:36 +00:00
2d7a566624
docs(p10-1c-jk): README/HANDOFF/ARCHITECTURE/SMOKE/INDEX + design §10.1; chore: bump version 0.12.0 → 0.13.0
...
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com >
2026-05-20 11:38:40 +00:00
1b19e33a4f
docs(p10-1c-jk): task spec + implementation plan
...
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com >
2026-05-20 10:27:13 +00:00
f95cd55484
docs(p10-1c-go): README/HANDOFF/ARCHITECTURE/SMOKE/INDEX + design §10.1; chore: bump version 0.11.1 → 0.12.0
...
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com >
2026-05-20 10:02:21 +00:00
8b89961ada
docs(p10-1c-go): task spec + implementation plan
...
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com >
2026-05-20 08:58:45 +00:00
44813df052
docs(p10-1b): README/HANDOFF/ARCHITECTURE/SMOKE/INDEX + HOTFIXES; chore: bump version 0.7.0 → 0.8.0
...
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com >
2026-05-20 01:48:06 +00:00
39b766ea59
docs(p10-1b): task spec + implementation plan
...
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com >
2026-05-20 00:26:58 +00:00
80c2d31fb3
docs(p10-1a-2): README/HANDOFF/ARCHITECTURE/SMOKE/INDEX + HOTFIXES; chore: bump version 0.6.0 → 0.7.0
...
- README: note Rust .rs ingest active (code-rust-ast-v1), update Mermaid parse node + chunker labels, update supported formats note in Quick start and ingest command table; add code citation fields (symbol, code_lang, repo) and filter flags note
- HANDOFF: flip P10 row to note 1A-1 ✅ + 1A-2 PR open; add one-liner cross-link to HOTFIXES 2026-05-19 entries
- ARCHITECTURE: add kebab-parse-code node + edge (app → pcode, pcode → ptypes) to Mermaid graph; add directory tree entry; add code parser locked-in decision row (tree-sitter lives parser-side, design §6.3)
- SMOKE: add P10-1A-2 Rust code ingest section (ingest.code config keys, verification steps, known behaviors); add checklist item
- tasks/INDEX.md: flip p10-1A-1 to ✅, update p10-1A-2 to 🟡 PR open
- tasks/p10/INDEX.md: same flips
- tasks/HOTFIXES.md: add two 2026-05-19 dated entries (AST_CHUNK_MAX_LINES constant vs config deviation + SourceType::Code deferred)
- tasks/p10/p10-1a-2-rust-ast-chunker.md: append two HOTFIXES cross-link lines in Risks/notes
- docs/superpowers/specs/2026-04-27-kebab-final-form-design.md §10.1: note p10-1A-2 surface activation
- Cargo.toml: version 0.6.0 → 0.7.0 (dogfooding-ready = minor bump trigger per CLAUDE.md)
- Cargo.lock: regenerated
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com >
2026-05-19 22:48:11 +00:00
7a6a24ad10
feat(p10-1a-2): add MediaType::Code(lang) variant
...
TDD: red → green cycle confirmed. New `Code(String)` variant serializes
as `{"code":"rust"}` via serde `rename_all = "lowercase"`. All exhaustive
`match` sites updated (`media_label`, `ingest_one_asset` catch-all →
explicit or-pattern). Design §3.5 enum listing synced. Also fix
`/target` symlink gitignore pattern so integration-test binary lookup
via workspace-relative path works with CARGO_TARGET_DIR redirect.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com >
2026-05-19 17:14:45 +00:00
9f3edb7e24
feat(p10-1a-2): add internal SourceSpan::Code variant + design §3.4 sync
...
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com >
2026-05-19 15:52:01 +00:00
a08ed32199
docs(p10-1a-2): task spec + implementation plan
...
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com >
2026-05-19 15:36:08 +00:00
th-kim0823
7961f8813d
fix(p10-1a-1): PR review round 1 — doc inconsistencies
...
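All 4 actionable items from review round 1 are addressed:
1. Removed the nonexistent `repo` field from the code variant example in frozen design §2.1 and changed it from the nested form to the actual wire (flat) form. The nested-form illustrative examples for the 5 variants are left as-is; only the code variant is split into its own block so that it matches the actual wire 1:1. Also deleted the 'code' row from the 6-variant nested-form group above (the exact contract lives in the separate block).
2. Removed the contradiction in the §2.2 SearchHit example between `repo: null, code_lang: null` and the 'omitted when null' comment: the keys are dropped entirely, and an inline comment explains 'absent on markdown hits, surfaced only on code hits'.
3. HANDOFF Phase row identifier `**10**` → `**P10**` (consistent with the other rows).
4. Removed the duplicate `[--media code]` from the README synopsis (`--media` already appears once above; code is one of its values, so it is explained in prose).
No code changes: all markdown documents.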
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com >
2026-05-15 18:24:15 +09:00
th-kim0823
7bbd2c0cbf
docs(p10-1a-1): wire schema + frozen design + README/HANDOFF/SMOKE + task index
...
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com >
2026-05-15 17:41:26 +09:00
th-kim0823
005a9011ea
plan(p10-1a-1): code ingest framework implementation plan + spec wire-shape fix
...
21-task plan: kebab-core domain types (Citation::Code variant, SearchHit repo/code_lang, IngestReport skip counters, Metadata extension), new kebab-parse-code crate (lang/repo/skip modules, gix dep), kebab-source-fs gitignore+blacklist integration, kebab-config [ingest.code] section, kebab-cli --repo/--code-lang flags, wire schema JSON update, frozen design doc update, README/HANDOFF/SMOKE update, task index. Each task follows a 5-step TDD cycle (test fail → impl → pass → commit). 1A-1 contains no code chunker; it is added in 1A-2.
The Citation::Code example in the spec did not match the flat wire form of the existing 5 variants (top-level fields rather than a nested `code: {...}`), so it is fixed here as well.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com >
2026-05-15 14:31:22 +09:00
th-kim0823
c6d61b0b37
spec(p10): split Phase 1A into 1A-1 (framework) and 1A-2 (Rust chunker)
...
The *framework surface* that 1A brings in (Citation `code` variant, SearchHit repo/code_lang, --media code / --code-lang / --repo filters, skip policy, finer-grained IngestReport, config section, kebab-parse-code crate skeleton) can be verified independently of the *language chunker itself*: after 1A-1 merges, a regression test checks that the wire output for the existing markdown corpus is byte-level identical, and 1A-2 then focuses on the Rust AST chunker. The binary version bump trigger is also deferred to 1A-2 (1A-1 is a wire additive minor with no user-facing surface change).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com >
2026-05-15 14:20:10 +09:00
th-kim0823
49487dc46b
spec(p10): code ingest design — Tier 1 AST + Tier 2 resource + Tier 3 fallback
...
Expands the corpus to dozens of git repos (under one parent dir). Tier 1 (Rust/Python/TS-JS/Go/Java/Kotlin/C/C++) uses a per-language tree-sitter AST chunker, Tier 2 (k8s manifest / Dockerfile / Cargo.toml and similar) uses a resource-aware chunker, and Tier 3 (shell / fallback) uses paragraph + line-window. Embedding stays on multilingual-e5-large for cross-corpus search. Work proceeds from Phase 1A (Rust) through 1D (C/C++), then Phase 2 (Tier 2), then Phase 3 (Tier 3). Ignore integration (.gitignore honor + .kebabignore added + minimal built-in safety net), generated header sniff, and a size cap limit the cost of the first dogfooding run. New Citation variant `code`, SearchHit repo/code_lang fields, --media code / --code-lang / --repo filters: all additive minor.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com >
2026-05-15 14:15:59 +09:00
th-kim0823
c62a8ff503
docs(fb-39b): design + HOTFIXES + new task spec + INDEX + README + SMOKE
...
Tasks 4 + 5: comprehensive doc update for embedding upgrade (multilingual-e5-large).
- design §5 + §9: update embedding_model / dimensions references (384 -> 1024)
- HOTFIXES: add fb-39b entry with user re-ingest procedure + backwards-compat notes
- tasks/p9-fb-39b-embedding-upgrade.md: new task spec (completed status)
- INDEX.md: add fb-39b row under RAG quality phase
- fb-39 task banner: append fb-39b link as lever implementation
- README: update config defaults + fastembed model size + embedding field docs
- SMOKE.md: append embedding upgrade verification section with e5-small -> e5-large sequence
Wire schema: no change (additive at config level, new table created by existing code).
Binary version: 0.6.0 -> 0.7.0 (cascade rule: embedding_model change = minor bump).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com >
2026-05-10 23:28:48 +09:00
th-kim0823
d5321701ea
plan(fb-39b): embedding upgrade implementation plan
...
5 tasks: kebab-embed-local resolve_model arm + check_dim test,
kebab-config defaults + TOML template flip, cross-crate fixture
sweep (likely no-op since most tests use provider=none), docs
(design + HOTFIXES + new task spec + INDEX), README + SMOKE
walkthrough.
Post-merge: 0.6 → 0.7 binary bump per CLAUDE.md cascade rule.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com >
2026-05-10 23:02:37 +09:00
th-kim0823
2c3461c465
spec(fb-39b): embedding model upgrade design
...
- multilingual-e5-small (384 dim) → multilingual-e5-large (1024 dim)
- Cascade: embedding_version bump → fb-23 incremental ingest
re-embeds all chunks
- Migration policy: dim mismatch detection at LanceVectorStore::open
→ error.v1 (code = embedding_dim_mismatch) + hint
"kebab reset --vector-only && kebab ingest"
- Config defaults flip (model + dimensions). User TOML pinning small
preserves backwards-compat
- bge-m3 deferred (not in the fastembed enum; the
UserDefinedEmbeddingModel ONNX path is separate work)
- Release trigger: 0.6 → 0.7 minor bump per CLAUDE.md cascade rule
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com >
2026-05-10 22:59:03 +09:00
th-kim0823
f00fb376fe
docs(fb-39): golden header + design §10.3 eval + spec status + INDEX
...
Strengthen fixtures/golden_queries.yaml header with precision_at_k_chunk
explanation + measurement guidance. Add §10.3 Eval metrics section to
frozen design documenting retrieval metrics (hit@k, MRR, recall@k_doc,
P@k_chunk) + groundedness metrics. Flip p9-fb-39 spec status from open
→ completed (eval foundation only, lever deferral noted). Update
tasks/INDEX.md fb-39 row mirror to fb-42 (merged, deferred note).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com >
2026-05-10 22:35:15 +09:00
th-kim0823
f303c76f52
plan(fb-39): eval foundation implementation plan
...
4 tasks: AggregateMetrics.precision_at_k_chunk field + serde
backwards-compat, compute aggregation in loop with 5 unit tests,
golden YAML header doc strengthening, design §11 + INDEX + status
flip.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com >
2026-05-10 22:19:44 +09:00
th-kim0823
cd5b1e3bfc
spec(fb-39): eval foundation design (P@k metric)
...
- Add precision_at_k_chunk: BTreeMap<u32, f32> (P@5, P@10) to
AggregateMetrics, binary relevance via expected_chunk_ids
- Denominator fixed at k (hits.len() < k also counts as a precision loss)
- Queries with empty expected_chunk_ids are skipped (same policy as hit_at_k)
- Applying levers (chunk policy / RRF / cross-encoder / embedding) is out of
scope for this spec: a separate task after fb-39b
- Golden set schema unchanged; only the header comments of the shipped
fixtures are strengthened
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com >
2026-05-10 22:05:09 +09:00
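The P@k rules in the fb-39 spec (binary relevance, denominator fixed at k, skip queries with no expected ids) can be illustrated with the sketch below. The function names and the choice to average per-query scores into the map are assumptions made for illustration; the commit only names the `precision_at_k_chunk: BTreeMap<u32, f32>` field.

```rust
use std::collections::{BTreeMap, HashSet};

/// Precision@k for one query with binary relevance. The denominator is
/// always k, so returning fewer than k hits lowers the score. Returns None
/// when the query has no expected chunk ids (the query is skipped).
fn precision_at_k(hits: &[&str], expected: &HashSet<&str>, k: u32) -> Option<f32> {
    if expected.is_empty() || k == 0 {
        return None;
    }
    let relevant = hits
        .iter()
        .take(k as usize)
        .filter(|id| expected.contains(*id))
        .count();
    Some(relevant as f32 / k as f32)
}

/// Assumed aggregation: the mean over all non-skipped queries, keyed by k.
fn aggregate_precision(
    queries: &[(Vec<&str>, HashSet<&str>)],
    ks: &[u32],
) -> BTreeMap<u32, f32> {
    let mut out = BTreeMap::new();
    for &k in ks {
        let scores: Vec<f32> = queries
            .iter()
            .filter_map(|(hits, expected)| precision_at_k(hits, expected, k))
            .collect();
        if !scores.is_empty() {
            out.insert(k, scores.iter().sum::<f32>() / scores.len() as f32);
        }
    }
    out
}
```

With expected ids `{a, b}` and hits `[a, x, b]`, P@5 under this sketch is 2/5 even though only three hits were returned, which is the "fewer hits also costs precision" rule from the spec.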
th-kim0823
441f1192ee
docs(fb-42): wire schema + README + SMOKE + design + SKILL + INDEX
...
- Add bulk_search_item.v1 + bulk_search_response.v1 wire schemas
- Register both in WIRE_SCHEMAS const
- README: --bulk flag mention + MCP tool list 7→8 (bulk_search)
- SMOKE: bulk multi-query walkthrough (CLI + MCP equivalent)
- Design §2.2: Bulk multi-query (fb-42) subsection (additive minor)
- SKILL: mcp__kebab__bulk_search section + tool table row
- Task spec status open→completed, banner replaced
- INDEX: fb-42 row merged (rerank hint deferred)
- Fix: missed Capabilities {bulk_search} in cli wire.rs test (Task 7 leftover)
- Fix: missed tools.len() 7→8 in cli_mcp_smoke (Task 5 leftover)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com >
2026-05-10 21:07:36 +09:00
th-kim0823
de9016fe16
plan(fb-42): bulk multi-query implementation plan
...
8 tasks: kebab-core types, kebab-app bulk_search_with_config facade
(cap 100 + per-query error policy), CLI --bulk flag + stdin ndjson +
output stream, CLI integration tests, MCP bulk_search tool +
registration + tools_list count bump, MCP integration tests,
capability flag, wire schemas + README + SMOKE + design + SKILL +
status flip.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com >
2026-05-10 20:10:39 +09:00
th-kim0823
35df15df99
spec(fb-42): bulk multi-query design (rerank hint deferred)
...
- CLI: kebab search --bulk + stdin ndjson → stdout per-query ndjson
- MCP: new kebab__bulk_search tool + JSON envelope (results + summary)
- Sequential for-loop, reusing the App instance (amortizes the cache)
- Per-query error policy: continue + per-item error.v1
- Limits: queries.len() <= 100
- New capability flag bulk_search
- Rerank hint is a separate task (after the fb-39 cross-encoder design)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com >
2026-05-10 20:05:27 +09:00
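The bulk policy described in the fb-42 spec (sequential loop, cap of 100 queries, a failed query does not abort the batch) might look like the sketch below. `bulk_search`, `QueryResult`, and the plain `String` error are hypothetical stand-ins; the real facade and its error.v1 payload are not shown in this log.

```rust
/// Cap from the spec: queries.len() <= 100.
const MAX_BULK_QUERIES: usize = 100;

/// Hypothetical per-item outcome: hit ids on success, a message on failure.
type QueryResult = Result<Vec<String>, String>;

/// Runs every query in order with the same `search` callable. A failing
/// query is recorded as its own item and the loop continues.
fn bulk_search<F>(queries: &[&str], mut search: F) -> Result<Vec<QueryResult>, String>
where
    F: FnMut(&str) -> QueryResult,
{
    if queries.len() > MAX_BULK_QUERIES {
        return Err(format!(
            "too many queries: {} > {}",
            queries.len(),
            MAX_BULK_QUERIES
        ));
    }
    let mut results = Vec::with_capacity(queries.len());
    for &q in queries {
        results.push(search(q));
    }
    Ok(results)
}
```

Only the over-limit case rejects the whole batch here; everything else is reported per item, so a caller can still use the successful results.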
th-kim0823
600c6182fc
docs(fb-40): rag-v2 prompt + README + design + SKILL + INDEX
...
- README: [rag] prompt_template_version default rag-v2 + the 3 stricter V2 rules
- design §7: rag-v2 body + V1 legacy note
- SKILL.md: note on the changed response behavior of mcp__kebab__ask
- task spec: status open → completed, design + plan links
- INDEX: fb-40 ✅ merged (2026-05-10)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com >
2026-05-10 19:37:28 +09:00
th-kim0823
6d6eb442be
plan(fb-40): fact-grounded answer implementation plan
...
6 tasks: SYSTEM_PROMPT_RAG_V2 + system_prompt_for helper, pipeline
dispatch wiring, config default flip rag-v1 → rag-v2, test fixture
cleanup, integration tests (rag-v1 / rag-v2 / unknown via
CapturingLm wrapper around MockLanguageModel), docs.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com >
2026-05-10 18:58:35 +09:00
th-kim0823
28d3250546
spec(fb-40): fact-grounded answer design
...
- rag-v1 → rag-v2 system prompt with 3 new rules (quote verbatim spans /
no use of training knowledge / no speculation)
- system_prompt_for(version) helper dispatch in pipeline
- config default prompt_template_version "rag-v1" → "rag-v2", V1 legacy
kept for backwards-compat
- Lever C (pre-LLM gate) already shipped (RefusalReason::ScoreGate),
out of scope here
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com >
2026-05-10 18:55:05 +09:00
th-kim0823
c864bd007f
docs(fb-38): wire schema + README + design + SKILL + INDEX
2026-05-10 18:21:55 +09:00
th-kim0823
56f20b7235
plan(fb-38): score semantics implementation plan
...
7 tasks: kebab-core ScoreKind enum + SearchHit field, lexical Bm25
labeling, vector Cosine, hybrid Rrf + search_with_trace pass-through,
cross-crate SearchHit literal cleanup, CLI integration test, docs
(wire schema + README + design + SKILL + INDEX).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com >
2026-05-10 17:45:57 +09:00
th-kim0823
0359bd9682
spec(fb-38): score semantics design
...
- Optional score_kind field (rrf | bm25 | cosine) on search_hit.v1
- LexicalRetriever → Bm25, VectorRetriever → Cosine, HybridRetriever → Rrf
- The mode-dispatch hits of fb-37 search_with_trace keep the underlying
retriever's score_kind unchanged
- README + design §4 + SKILL carry the full RRF formula + a "ranking signal,
NOT confidence" note; the trust threshold for agents is the nested
retrieval.{lexical,vector}_score
- additive minor wire: no schema bump
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com >
2026-05-10 17:40:47 +09:00
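For readers unfamiliar with RRF, the sketch below shows the standard reciprocal rank fusion formula, score(d) = Σ 1 / (k + rank(d)), summed over the lexical and vector lists. It is not taken from HybridRetriever: the function name, the 1-based rank, and passing `k` as a parameter are assumptions (k = 60 is a common default in the literature, not a value stated in this log).

```rust
use std::collections::HashMap;

/// Fuses two ranked id lists with reciprocal rank fusion. Ranks are 1-based.
/// The result is sorted by descending fused score, ties broken by id.
fn rrf_fuse<'a>(lexical: &[&'a str], vector: &[&'a str], k: f32) -> Vec<(String, f32)> {
    let mut scores: HashMap<&str, f32> = HashMap::new();
    for list in [lexical, vector] {
        for (i, id) in list.iter().enumerate() {
            let rank = i as f32 + 1.0;
            *scores.entry(*id).or_insert(0.0) += 1.0 / (k + rank);
        }
    }
    let mut fused: Vec<(String, f32)> = scores
        .into_iter()
        .map(|(id, score)| (id.to_string(), score))
        .collect();
    fused.sort_by(|a, b| b.1.total_cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    fused
}
```

Because the fused value depends only on ranks, it orders hits but says nothing about how well any single hit matches, which is why the spec labels it a ranking signal rather than a confidence.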
th-kim0823
fb31befef1
plan(fb-37): trace + stats implementation plan
...
10 tasks: kebab-core types, store breakdowns/index_bytes helpers,
extended CountSummary + Stats wire mirror, HybridRetriever
search_with_trace, App SearchResponse.trace threading, CLI --trace
flag, integration tests, MCP SearchInput.trace, TUI TracePopup,
docs (wire schema + README + SMOKE + INDEX + SKILL).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com >
2026-05-10 12:14:26 +09:00
th-kim0823
5f6b2fa259
spec(fb-37): trace + stats design
...
- search --trace boolean flag, additive optional `trace` field on search_response.v1
- HybridRetriever search_with_trace returns (hits, SearchTrace) — lex/vec/rrf_inputs + per-stage timing
- cache bypass when --trace (debug intent)
- schema.v1.stats extended with media_breakdown / lang_breakdown / index_bytes / stale_doc_count
- TUI search pane `t` keystroke opens TracePopup
- additive minor wire — no schema bump
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com >
2026-05-10 12:05:31 +09:00
th-kim0823
31c1e05951
plan(fb-36): search filter args implementation plan
...
9 tasks: SearchFilters extension, lexical SQL WHERE, vector
filter_chunks mirror, CLI 7 flags, integration tests, MCP
SearchInput extension, workspace test/clippy, docs, smoke+PR.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com >
2026-05-10 03:34:39 +09:00
th-kim0823
7210386699
spec(fb-36): search filter args — design
...
Expose 7 flags on `kebab search` (4 existing + 3 new):
- --tag (repeatable) / --lang / --path-glob / --trust-min (existing SearchFilters)
- --media (csv) / --ingested-after (RFC3339) / --doc-id (new)
filter layer = SQLite WHERE (lexical) + over-fetch+post-filter
(vector). Combined with AND. Wire schema unchanged (input only).
`SearchFilters` gains 3 additive fields (backwards-compat via
#[serde(default)]). MCP SearchInput gains 7 optional fields. invalid
RFC3339 → error.v1.code = config_invalid.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com >
2026-05-10 03:26:40 +09:00
th-kim0823
7dddc1d706
fix(fb-35): address PR #126 round 1 review
...
- fetch_span: panic-fix on line_start > total / empty doc
(return empty text + effective_end = line_start - 1 instead of
out-of-bounds slice)
- truncated: reserved for budget-driven truncation only; line
range clamp signaled via effective_end < line_end
- spec / SKILL.md / README: align rejection wording to "PDF /
audio" (matches code; Image OCR allowed for span)
- store: warning comment on list_chunk_ids_for_doc — chunk_id
hash sort does NOT preserve document position; real fix is a
chunks.ordinal column, tracked as follow-up
- surrounding_chunks: saturating_add to defend against u32::MAX
context arg on 32-bit targets
- tests: line_start > total returns empty + chunk context at
doc boundary clamps lower bound
Deferred nits (follow-up): table-separator strict CommonMark form;
MCP per-mode strict validation; CLI chunk_id truncation in plain
output. None block correctness.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com >
2026-05-10 00:45:29 +09:00
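The fetch_span fix in the first bullet can be pictured with the sketch below: an out-of-range start returns empty text with `effective_end = line_start - 1`, and a clamped end is visible to the caller as `effective_end < line_end`. The signature and tuple return are assumptions, as is the handling of `line_start == 0` and of `line_end < line_start`.

```rust
/// Returns (text, effective_end) for a 1-based inclusive line range.
fn fetch_span(doc: &str, line_start: usize, line_end: usize) -> (String, usize) {
    let lines: Vec<&str> = doc.lines().collect();
    let total = lines.len();
    // Start past the end, empty doc, or (assumed) invalid range: return
    // empty text instead of slicing out of bounds.
    if line_start == 0 || line_start > total || line_end < line_start {
        return (String::new(), line_start.saturating_sub(1));
    }
    // Clamp the end; the caller detects the clamp via effective_end < line_end.
    let effective_end = line_end.min(total);
    (lines[line_start - 1..effective_end].join("\n"), effective_end)
}
```

Calling `fetch_span("a\nb\nc", 2, 10)` under these assumptions returns the last two lines with an effective end of 3, so no separate flag is needed to report the clamp.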
th-kim0823
353aa5cc78
plan(fb-35): verbatim fetch implementation plan
...
11 tasks: domain types, wire schema, App::fetch chunk/doc/span
modes (3 separate tasks for incremental TDD), CLI subcommand,
CLI integration tests, MCP tool, workspace+clippy gate, docs,
smoke+PR.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com >
2026-05-09 23:31:29 +09:00
th-kim0823
4eda9c317d
spec(fb-35): verbatim fetch — design
...
New `kebab fetch chunk|doc|span` subcommand + MCP `kebab__fetch`
tool. wire = `fetch_result.v1` (kind discriminator).
source = CanonicalDocument / chunks.text normalized markdown (raw
bytes not exposed). chunk mode `--context N` = ordinal ±N. doc/span
mode = reuses the fb-34 budget (chars/4). PDF/audio span is rejected
with `error.v1.code = span_not_supported`.
New error codes: chunk_not_found / doc_not_found /
span_not_supported / invalid_input. The fb-34 StructuredError
wrapper is reused.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com >
2026-05-09 23:21:01 +09:00
th-kim0823
dbb7b54d5d
plan(fb-34): output budget controls implementation plan
...
11 tasks: SearchOpts (kebab-core), cursor module + base64 dep
(kebab-app), error_wire stale_cursor convention, App::search_with_opts
+ SearchResponse + budget loop, wire schema search_response.v1, CLI
flags + plain truncated hint, CLI integration tests, MCP wrapper +
inputs, workspace+clippy gate, docs (README/SMOKE/INDEX/HOTFIXES/
skill), smoke+PR.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com >
2026-05-09 17:43:26 +09:00
th-kim0823
a80f65c6f2
spec(fb-34): output budget controls — design
...
Adds --max-tokens / --snippet-chars / --cursor to `kebab search`.
chars/4 token approximation. truncate priority: snippet → k → stop
(at least 1 hit guaranteed). cursor = opaque base64(offset + corpus_revision);
on mismatch, error.v1.code = stale_cursor.
wire breaking: stdout array → search_response.v1 wrapper. Agents need
updating. The App::search signature is preserved as a thin wrapper (no TUI impact).
ask path is out of scope (rag.max_context_tokens already handles its budget).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com >
2026-05-09 17:36:51 +09:00
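The truncation order in the fb-34 spec (shorten snippets first, then drop hits, never below one hit) can be sketched as follows. Hits are reduced to their snippet strings for brevity, and `fit_to_budget`, its parameters, and the floor rounding in chars/4 are all assumptions rather than the actual budget loop.

```rust
/// chars/4 token approximation; rounding down is an assumption.
fn approx_tokens(text: &str) -> usize {
    text.chars().count() / 4
}

fn total_tokens(snippets: &[String]) -> usize {
    snippets.iter().map(|s| approx_tokens(s)).sum()
}

/// Applies the truncate priority: snippet → k → stop.
fn fit_to_budget(mut snippets: Vec<String>, max_tokens: usize, snippet_chars: usize) -> Vec<String> {
    // Stage 1: if over budget, cut every snippet down to `snippet_chars`.
    if total_tokens(&snippets) > max_tokens {
        for s in snippets.iter_mut() {
            *s = s.chars().take(snippet_chars).collect();
        }
    }
    // Stage 2: still over budget, so drop trailing (lowest-ranked) hits,
    // but always keep at least one.
    while snippets.len() > 1 && total_tokens(&snippets) > max_tokens {
        snippets.pop();
    }
    snippets
}
```

In this sketch a budget that is too small for even one shortened snippet still returns that single hit, matching the "at least 1 hit guaranteed" rule.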
th-kim0823
0ca9b1d5c3
plan(fb-33): streaming ask implementation plan
...
10 tasks: StreamEvent enum + AskOpts switch (kebab-core), pipeline
emits + cancel branch (kebab-rag), kebab-app re-exports, TUI
worker adapt, wire schema answer_event.v1, CLI --stream flag +
ndjson stderr driver + BrokenPipe cancel, integration tests
(Ollama-gated), workspace+clippy gate, docs, smoke+PR.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com >
2026-05-09 14:16:42 +09:00
th-kim0823
4949775c8b
spec(fb-33): streaming ask (ndjson delta) — design
...
A 3-variant StreamEvent enum (RetrievalDone / Token / Final) lets
RagPipeline emit the retrieval / per-token / final stages to a sink.
CLI `kebab ask --stream` streams ndjson events to stderr, and the final
stdout line stays the existing answer.v1 (the ingest_progress.v1 pattern).
Cancel = stdout closed → SendError → LLM stream break +
partial answer recorded as RefusalReason::LlmStreamAborted.
MCP streaming is to be considered separately for v0.5+ (scope out).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com >
2026-05-09 14:10:08 +09:00
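The sink-and-cancel behavior in the fb-33 spec can be sketched with a standard-library channel standing in for the sink. The variant names come from the commit message; the payload fields, the `stream_answer` function, and returning the partial answer as `Err` are assumptions made for illustration.

```rust
use std::sync::mpsc::{channel, Sender};

/// The three stages named in the spec; payloads are hypothetical.
#[derive(Debug, Clone, PartialEq)]
enum StreamEvent {
    RetrievalDone { hits: usize },
    Token(String),
    Final(String),
}

/// Emits events to the sink as generation progresses. If the receiver has
/// gone away, `send` fails with SendError and the loop stops early,
/// returning the partial answer as Err so the caller can record the abort.
fn stream_answer(tokens: &[&str], hits: usize, sink: &Sender<StreamEvent>) -> Result<String, String> {
    let mut answer = String::new();
    if sink.send(StreamEvent::RetrievalDone { hits }).is_err() {
        return Err(answer);
    }
    for t in tokens {
        answer.push_str(t);
        if sink.send(StreamEvent::Token(t.to_string())).is_err() {
            return Err(answer);
        }
    }
    // A failed Final send is ignored here: the answer is already complete.
    let _ = sink.send(StreamEvent::Final(answer.clone()));
    Ok(answer)
}
```

Dropping the receiving end before or during the call is enough to trigger the early-exit path, which mirrors the "consumer closed, so break the LLM stream" cancel described above.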