Commit Graph

52 Commits

Author SHA1 Message Date
6bc7a83d3c docs(p10-3): activate Tier 3 in frozen design §10.1
Add p10-3 activation log entry for Tier 3 paragraph fallback chunker
(code-text-paragraph-v1) with shell direct routing and fallback wrapper
for invalid YAML / AST failures.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 11:39:49 +00:00
522ae7b8bc docs(p10-2): activate Tier 2 in code-ingest design §10.1 + §3.5 mappings
§3.5: add code_lang_for_path mappings xml / groovy / go-mod.
§10.1: add deactivation log entry for p10-2 (3 Tier 2 chunkers active).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 13:24:16 +00:00
2d7a566624 docs(p10-1c-jk): README/HANDOFF/ARCHITECTURE/SMOKE/INDEX + design §10.1; chore: bump version 0.12.0 → 0.13.0
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 11:38:40 +00:00
f95cd55484 docs(p10-1c-go): README/HANDOFF/ARCHITECTURE/SMOKE/INDEX + design §10.1; chore: bump version 0.11.1 → 0.12.0
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 10:02:21 +00:00
44813df052 docs(p10-1b): README/HANDOFF/ARCHITECTURE/SMOKE/INDEX + HOTFIXES; chore: bump version 0.7.0 → 0.8.0
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 01:48:06 +00:00
80c2d31fb3 docs(p10-1a-2): README/HANDOFF/ARCHITECTURE/SMOKE/INDEX + HOTFIXES; chore: bump version 0.6.0 → 0.7.0
- README: note Rust .rs ingest active (code-rust-ast-v1), update Mermaid parse node + chunker labels, update supported formats note in Quick start and ingest command table; add code citation fields (symbol, code_lang, repo) and filter flags note
- HANDOFF: flip P10 row to note 1A-1  + 1A-2 PR open; add one-liner cross-link to HOTFIXES 2026-05-19 entries
- ARCHITECTURE: add kebab-parse-code node + edge (app → pcode, pcode → ptypes) to Mermaid graph; add directory tree entry; add code parser locked-in decision row (tree-sitter lives parser-side, design §6.3)
- SMOKE: add P10-1A-2 Rust code ingest section (ingest.code config keys, verification steps, known behaviors); add checklist item
- tasks/INDEX.md: flip p10-1A-1 to , update p10-1A-2 to 🟡 PR open
- tasks/p10/INDEX.md: same flips
- tasks/HOTFIXES.md: add two 2026-05-19 dated entries (AST_CHUNK_MAX_LINES constant vs config deviation + SourceType::Code deferred)
- tasks/p10/p10-1a-2-rust-ast-chunker.md: append two HOTFIXES cross-link lines in Risks/notes
- docs/superpowers/specs/2026-04-27-kebab-final-form-design.md §10.1: note p10-1A-2 surface activation
- Cargo.toml: version 0.6.0 → 0.7.0 (dogfooding-ready = minor bump trigger per CLAUDE.md)
- Cargo.lock: regenerated

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 22:48:11 +00:00
7a6a24ad10 feat(p10-1a-2): add MediaType::Code(lang) variant
TDD: red → green cycle confirmed. New `Code(String)` variant serializes
as `{"code":"rust"}` via serde `rename_all = "lowercase"`. All exhaustive
`match` sites updated (`media_label`, `ingest_one_asset` catch-all →
explicit or-pattern). Design §3.5 enum listing synced. Also fix
`/target` symlink gitignore pattern so integration-test binary lookup
via workspace-relative path works with CARGO_TARGET_DIR redirect.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 17:14:45 +00:00
9f3edb7e24 feat(p10-1a-2): add internal SourceSpan::Code variant + design §3.4 sync
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 15:52:01 +00:00
th-kim0823
7961f8813d fix(p10-1a-1): PR review round 1 — doc inconsistencies
회차 1 review 의 4 건 actionable 모두 반영:

1. frozen design §2.1 의 code variant 예시에서 존재하지 않는 `repo` 필드 제거 + nested form 에서 actual wire (flat) 형태로 정리. 5 variant 의 nested-form illustrative example 은 그대로 두고, code variant 만 별도 block 으로 분리해서 actual wire 와 1:1 매칭. 또 위쪽 6 variant nested-form group 에서도 'code' 행 삭제 (정확한 contract 는 별도 block 에 있음).
2. §2.2 SearchHit 예시의 `repo: null, code_lang: null` + 'omitted when null' 주석 모순 제거 — 키 자체를 빼고 inline 주석으로 'markdown hit 에는 absent, 코드 hit 에서만 surface' 설명.
3. HANDOFF Phase row 식별자 `**10**` → `**P10**` (다른 row 와 일관성).
4. README synopsis 의 중복 `[--media code]` 제거 (`--media` 는 이미 위쪽에 한 번 있음, code 는 값 중 하나라 prose 에서 설명).

코드 변경 없음 — 모두 markdown 문서.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-15 18:24:15 +09:00
th-kim0823
7bbd2c0cbf docs(p10-1a-1): wire schema + frozen design + README/HANDOFF/SMOKE + task index
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-15 17:41:26 +09:00
th-kim0823
005a9011ea plan(p10-1a-1): code ingest framework implementation plan + spec wire-shape fix
21 task plan: kebab-core 도메인 타입 (Citation::Code variant, SearchHit repo/code_lang, IngestReport skip counters, Metadata extension), 새 kebab-parse-code crate (lang/repo/skip 모듈, gix dep), kebab-source-fs gitignore+blacklist 통합, kebab-config [ingest.code] 절, kebab-cli --repo/--code-lang flag, wire schema JSON 갱신, frozen design doc 갱신, README/HANDOFF/SMOKE 갱신, task index. 각 task 가 5-step TDD cycle (test fail → impl → pass → commit). 코드 chunker 는 1A-1 에 없음 — 1A-2 에서 추가.

spec 의 Citation::Code 예시가 기존 5 variants 의 flat wire 형태와 안 맞아서 (`code: {...}` 중첩이 아니라 top-level field) 같이 fix.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-15 14:31:22 +09:00
th-kim0823
c6d61b0b37 spec(p10): split Phase 1A into 1A-1 (framework) and 1A-2 (Rust chunker)
1A 가 들고 들어가는 *프레임워크 surface* (Citation `code` variant, SearchHit repo/code_lang, --media code / --code-lang / --repo filter, skip 정책, IngestReport 세분화, config 절, kebab-parse-code crate skeleton) 가 *언어 chunker 자체* 와 독립 검증 가능 — 1A-1 머지 후 기존 markdown corpus 의 wire 출력이 byte-level identical 한지 regression test 로 검증한 다음 1A-2 에서 Rust AST chunker 자체에 집중. binary version bump 트리거도 1A-2 로 미룸 (1A-1 은 wire additive minor + 사용자 surface 변경 없음).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-15 14:20:10 +09:00
th-kim0823
49487dc46b spec(p10): code ingest design — Tier 1 AST + Tier 2 resource + Tier 3 fallback
수십 개 git repo (한 부모 dir 아래) 를 corpus 로 확장. Tier 1 (Rust/Python/TS-JS/Go/Java/Kotlin/C/C++) 은 tree-sitter AST per-language chunker, Tier 2 (k8s manifest / Dockerfile / Cargo.toml 류) 는 resource-aware chunker, Tier 3 (shell / fallback) 는 paragraph + line-window. embedding 은 multilingual-e5-large 유지 — cross-corpus 검색 위해. Phase 1A (Rust) 부터 1D (C/C++) + Phase 2 (Tier 2) + Phase 3 (Tier 3) 순으로 진행. ignore 통합 (.gitignore honor + .kebabignore 추가 + 최소 built-in safety net), generated header sniff, size cap 으로 첫 도그푸딩 비용 차단. 새 Citation variant `code`, SearchHit 의 repo/code_lang 필드, --media code / --code-lang / --repo filter — 모두 additive minor.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-15 14:15:59 +09:00
th-kim0823
c62a8ff503 docs(fb-39b): design + HOTFIXES + new task spec + INDEX + README + SMOKE
Tasks 4 + 5: comprehensive doc update for embedding upgrade (multilingual-e5-large).

- design §5 + §9: update embedding_model / dimensions references (384 -> 1024)
- HOTFIXES: add fb-39b entry with user re-ingest procedure + backwards-compat notes
- tasks/p9-fb-39b-embedding-upgrade.md: new task spec (completed status)
- INDEX.md: add fb-39b row under RAG quality phase
- fb-39 task banner: append fb-39b link as lever implementation
- README: update config defaults + fastembed model size + embedding field docs
- SMOKE.md: append embedding upgrade verification section with e5-small -> e5-large sequence

Wire schema: no change (additive at config level, new table created by existing code).
Binary version: 0.6.0 -> 0.7.0 (cascade rule: embedding_model change = minor bump).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 23:28:48 +09:00
th-kim0823
2c3461c465 spec(fb-39b): embedding model upgrade design
- multilingual-e5-small (384 dim) → multilingual-e5-large (1024 dim)
- Cascade: embedding_version bump → fb-23 incremental ingest
  re-embeds all chunks
- Migration policy: dim mismatch detection at LanceVectorStore::open
  → error.v1 (code = embedding_dim_mismatch) + hint
  "kebab reset --vector-only && kebab ingest"
- Config defaults flip (model + dimensions). User TOML pinning small
  preserves backwards-compat
- bge-m3 deferred (fastembed enum 미포함, UserDefinedEmbeddingModel
  ONNX path 별도)
- Release trigger: 0.6 → 0.7 minor bump per CLAUDE.md cascade rule

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 22:59:03 +09:00
th-kim0823
f00fb376fe docs(fb-39): golden header + design §10.3 eval + spec status + INDEX
Strengthen fixtures/golden_queries.yaml header with precision_at_k_chunk
explanation + measurement guidance. Add §10.3 Eval metrics section to
frozen design documenting retrieval metrics (hit@k, MRR, recall@k_doc,
P@k_chunk) + groundedness metrics. Flip p9-fb-39 spec status from open
→ completed (eval foundation only, lever deferral noted). Update
tasks/INDEX.md fb-39 row mirror to fb-42 (merged, deferred note).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 22:35:15 +09:00
th-kim0823
cd5b1e3bfc spec(fb-39): eval foundation design (P@k metric)
- AggregateMetrics 에 precision_at_k_chunk: BTreeMap<u32, f32>
  (P@5, P@10) 추가, binary relevance via expected_chunk_ids
- Denominator = k 고정 (hits.len() < k 도 precision 손실 간주)
- Empty expected_chunk_ids query 는 skip (hit_at_k 동일 정책)
- Lever 적용 (chunk policy / RRF / cross-encoder / embedding) 은
  본 spec 범위 외 — fb-39b 이후 별도 task
- Golden set schema 무변경, shipped fixtures 헤더 주석만 강화

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 22:05:09 +09:00
th-kim0823
441f1192ee docs(fb-42): wire schema + README + SMOKE + design + SKILL + INDEX
- Add bulk_search_item.v1 + bulk_search_response.v1 wire schemas
- Register both in WIRE_SCHEMAS const
- README: --bulk flag mention + MCP tool list 7→8 (bulk_search)
- SMOKE: bulk multi-query walkthrough (CLI + MCP equivalent)
- Design §2.2: Bulk multi-query (fb-42) subsection (additive minor)
- SKILL: mcp__kebab__bulk_search section + tool table row
- Task spec status open→completed, banner replaced
- INDEX: fb-42 row 머지 (rerank hint deferred)
- Fix: missed Capabilities {bulk_search} in cli wire.rs test (Task 7 leftover)
- Fix: missed tools.len() 7→8 in cli_mcp_smoke (Task 5 leftover)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 21:07:36 +09:00
th-kim0823
35df15df99 spec(fb-42): bulk multi-query design (rerank hint deferred)
- CLI: kebab search --bulk + stdin ndjson → stdout per-query ndjson
- MCP: 신규 kebab__bulk_search tool + JSON envelope (results + summary)
- Sequential for-loop, App instance 재사용 (cache amortize)
- Per-query error policy: continue + per-item error.v1
- Limits: queries.len() <= 100
- Capability flag bulk_search 신규
- Rerank hint 별도 task (fb-39 cross-encoder 설계 후)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 20:05:27 +09:00
th-kim0823
600c6182fc docs(fb-40): rag-v2 prompt + README + design + SKILL + INDEX
- README: [rag] prompt_template_version default rag-v2 + V2 강화 3 규칙
- design §7: rag-v2 본문 + V1 legacy note
- SKILL.md: mcp__kebab__ask 응답 행태 변화 안내
- task spec: status open → completed, design + plan 링크
- INDEX: fb-40  머지 (2026-05-10)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 19:37:28 +09:00
th-kim0823
28d3250546 spec(fb-40): fact-grounded answer design
- rag-v1 → rag-v2 system prompt with 3 신규 규칙 (verbatim span 인용 자도 /
  학습 지식 동원 금지 / 추측 금지)
- system_prompt_for(version) helper dispatch in pipeline
- config default prompt_template_version "rag-v1" → "rag-v2", V1 legacy
  kept for backwards-compat
- Lever C (pre-LLM gate) already shipped (RefusalReason::ScoreGate),
  out of scope here

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 18:55:05 +09:00
th-kim0823
c864bd007f docs(fb-38): wire schema + README + design + SKILL + INDEX 2026-05-10 18:21:55 +09:00
th-kim0823
0359bd9682 spec(fb-38): score semantics design
- search_hit.v1 에 optional score_kind 필드 (rrf | bm25 | cosine)
- LexicalRetriever → Bm25, VectorRetriever → Cosine, HybridRetriever → Rrf
- fb-37 search_with_trace 의 mode-dispatch hits 는 underlying retriever 의
  score_kind 그대로 보존
- README + design §4 + SKILL 에 RRF 수식 전체 + "ranking signal, NOT confidence"
  안내, agent 용 trust threshold 는 nested retrieval.{lexical,vector}_score
- additive minor wire — schema bump 없음

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 17:40:47 +09:00
th-kim0823
5f6b2fa259 spec(fb-37): trace + stats design
- search --trace boolean flag, additive optional `trace` field on search_response.v1
- HybridRetriever search_with_trace returns (hits, SearchTrace) — lex/vec/rrf_inputs + per-stage timing
- cache bypass when --trace (debug intent)
- schema.v1.stats extended with media_breakdown / lang_breakdown / index_bytes / stale_doc_count
- TUI search pane `t` keystroke opens TracePopup
- additive minor wire — no schema bump

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 12:05:31 +09:00
th-kim0823
7210386699 spec(fb-36): search filter args — design
`kebab search` 에 7 flag 노출 (기존 4 + 신규 3):
- --tag (반복) / --lang / --path-glob / --trust-min (기존 SearchFilters)
- --media (csv) / --ingested-after (RFC3339) / --doc-id (신규)

filter layer = SQLite WHERE (lexical) + over-fetch+post-filter
(vector). AND 결합. wire schema 무변경 (input only).

`SearchFilters` 3 필드 additive (#[serde(default)] 로 backwards-
compat). MCP SearchInput 7 optional 필드 추가. invalid RFC3339 →
error.v1.code = config_invalid.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 03:26:40 +09:00
th-kim0823
7dddc1d706 fix(fb-35): address PR #126 round 1 review
- fetch_span: panic-fix on line_start > total / empty doc
  (return empty text + effective_end = line_start - 1 instead of
  out-of-bounds slice)
- truncated: reserved for budget-driven truncation only; line
  range clamp signaled via effective_end < line_end
- spec / SKILL.md / README: align rejection wording to "PDF /
  audio" (matches code; Image OCR allowed for span)
- store: warning comment on list_chunk_ids_for_doc — chunk_id
  hash sort does NOT preserve document position; real fix is a
  chunks.ordinal column, tracked as follow-up
- surrounding_chunks: saturating_add to defend against u32::MAX
  context arg on 32-bit targets
- tests: line_start > total returns empty + chunk context at
  doc boundary clamps lower bound

Deferred nits (follow-up): table-separator strict CommonMark form;
MCP per-mode strict validation; CLI chunk_id truncation in plain
output. None block correctness.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 00:45:29 +09:00
th-kim0823
4eda9c317d spec(fb-35): verbatim fetch — design
`kebab fetch chunk|doc|span` 신규 subcommand + MCP `kebab__fetch`
tool. wire = `fetch_result.v1` (kind discriminator).

source = CanonicalDocument / chunks.text 정규화된 markdown (raw
bytes 미노출). chunk mode `--context N` = ordinal ±N. doc/span
mode = fb-34 budget 재사용 (chars/4). PDF/audio span 은
`error.v1.code = span_not_supported` 거절.

신규 error codes: chunk_not_found / doc_not_found /
span_not_supported / invalid_input. fb-34 StructuredError
wrapper 재사용.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 23:21:01 +09:00
th-kim0823
a80f65c6f2 spec(fb-34): output budget controls — design
`kebab search` 에 --max-tokens / --snippet-chars / --cursor 신규.
chars/4 token approximation. truncate priority: snippet → k → 멈춤
(최소 1 hit 보장). cursor = opaque base64(offset + corpus_revision)
— mismatch 시 error.v1.code = stale_cursor.

wire breaking: stdout array → search_response.v1 wrapper. agent 갱신
필요. App::search 시그니처는 thin wrapper 로 보존 (TUI 무영향).

ask path 는 scope out (rag.max_context_tokens 가 이미 budget 담당).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 17:36:51 +09:00
th-kim0823
4949775c8b spec(fb-33): streaming ask (ndjson delta) — design
3-variant StreamEvent enum (RetrievalDone / Token / Final) 을 통해
RagPipeline 이 retrieval / per-token / final 단계를 sink 로 발사.
CLI `kebab ask --stream` 이 ndjson event 를 stderr 로 흘리고 final
stdout line 은 기존 answer.v1 그대로 (ingest_progress.v1 패턴).
Cancel = stdout 닫힘 → SendError → LLM stream break +
RefusalReason::LlmStreamAborted 로 partial answer 기록.
MCP streaming 은 v0.5+ 별도 검토 (scope out).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 14:10:08 +09:00
th-kim0823
401a47fb43 spec(fb-32): stale doc indicator — design
검색 hit / RAG citation 에 indexed_at + stale 두 wire 필드 추가.
documents.updated_at 재활용 (V006 incremental ingest 가 자연 source-of-truth).
config [search] stale_threshold_days = 30 default. additive minor wire.
TUI Warning role / CLI plain [stale] tag / agent --json 동시 surface.
자동 재 ingest 는 out of scope.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 18:00:10 +09:00
th-kim0823
9d96504bd9 docs(spec): fb-26 + fb-28 + schema-sync design doc
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-07 19:11:47 +09:00
th-kim0823
345a4f363a 📝 docs: sync README / HANDOFF / CLAUDE / skill / design for fb-31
- README 명령 표 에 `kebab ingest-file` + `kebab ingest-stdin` 두 row + MCP tool list 4 → 6.
- HANDOFF post-도그푸딩 항목 한 줄.
- CLAUDE.md `_external/` 디렉토리 + naming convention 한 줄.
- integrations skill — Recipe D (agent fetched web doc) + MCP tool list 갱신.
- design §6.7 `_external/` subdirectory 절 신설.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 18:16:45 +09:00
th-kim0823
7772fbc00f 📝 docs(spec): p9-fb-31 single-file / stdin ingest 설계 문서
신규 명령 `kebab ingest-file` + `kebab ingest-stdin` + MCP tool
`ingest_file` + `ingest_stdin` 도입 brainstorm 산출물. agent fetch 한
web markdown / 단일 외부 file 을 KB 에 즉시 저장.

핵심 결정:
- 외부 file 저장: copy in (`<workspace.root>/_external/<hash12>.<ext>`).
  blake3 content hash 기반 deterministic 명명 → idempotent.
- CLI: 신규 subcommand 2개 (기존 `kebab ingest` 무영향).
- MCP: 4 → 6 tool. fb-30 v1 read-only 정책 변경 — 첫 mutation tool
  surface (의도된 진화).
- .kebabignore: explicit ingest 가 default bypass + stderr warn.
- stdin v1: markdown 전용 + flag (--title, --source-uri) → frontmatter
  자동 prepend. 이미 frontmatter 있으면 error (use ingest-file).
- `_external/` 디렉토리 첫 생성 시 .kebabignore 자동 append (walk
  re-ingestion 무한 루프 방지).
- source_uri 는 frontmatter → Document.metadata 자동 흐름. wire
  schema 변경 없음 (ingest_report.v1 / search_hit.v1 의 metadata
  free-form map 재사용).

릴리스: 0.3.1 → 0.3.2 patch — additive only.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 17:29:30 +09:00
th-kim0823
f758d51a01 📝 docs: sync README / HANDOFF / CLAUDE / skill / design for fb-30
- README 명령 표 에 `kebab mcp` 추가 + Claude Code MCP config 예시
- HANDOFF post-도그푸딩 항목 한 줄 (rmcp 1.6 + manual dispatch + error_wire promotion + ask/search spawn_blocking + capability flag flip 명시)
- CLAUDE.md facade 룰 의 UI crate 카테고리 에 `kebab-mcp` 추가
- integrations skill — MCP 사용 안내 (recommended over subprocess)
- design §10.2 MCP transport 절 신설

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 16:15:19 +09:00
th-kim0823
58ec4578b9 📝 docs(spec): p9-fb-30 MCP server (stdio) 설계 문서
`kebab mcp` 신규 subcommand + new crate `kebab-mcp` 도입을 위한
brainstorm 산출물. agent integration "MVP" 완성 (Claude Code / Cursor /
OpenAI Agents 등 host-agnostic 사용 가능).

핵심 결정:
- `kebab mcp` subcommand (kebab-cli 내, 신규 binary 아님)
- 4 read-only tool (`search` / `ask` / `schema` / `doctor`) — ingest /
  fetch / list_docs / inspect_chunk 는 fb-31 / fb-35 / 후속에서 추가
- Resources / Prompts 모두 skip (tools only)
- Rust MCP SDK 사용 — rmcp 채택 우선, plan 단계 verify
- stdio 단일 transport — fb-29 deferral 따라 HTTP-SSE P+
- error mapping: tool dispatch 실패만 isError=true + error.v1 content,
  refusal / no-hit / unhealthy 는 정상 응답 (semantic flag 으로 분기)
- classify 모듈 이전: kebab-cli::error_classify → kebab-app::error_wire
  (kebab-cli + kebab-mcp 둘 다 동일 모듈 사용, facade 룰 준수)
- capability flag `mcp_server` false → true

릴리스: 0.3.0 → 0.4.0 minor — 신규 surface + new crate + 디자인 §10.1
변경 + capability flip 모두 trigger.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 14:44:09 +09:00
th-kim0823
bb7b1cec4b 📝 docs: sync README / HANDOFF / CLAUDE / skill / design for fb-27
- README 명령 표 에 `kebab schema` 추가
- HANDOFF post-도그푸딩 항목 한 줄
- CLAUDE.md wire schema 절 schema.v1 / error.v1 추가
- integrations skill — schema 활용 안내 (additive)
- design §10.1 capability matrix subsection 신설

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 12:37:40 +09:00
th-kim0823
f01f8dfc9b 📝 docs(spec): p9-fb-27 introspection + error wire 설계 문서
`kebab schema` 신규 명령 + `error.v1` wire schema 도입을 위한 brainstorm
산출물. agent 통합 (fb-30 MCP, fb-29 daemon) 의 prerequisite — 한 번의
introspection 호출로 wire 버전 / capabilities / model versions / index
stats 를 노출하고, fatal error 가 `--json` 모드에서 stderr ndjson 으로
구조화된다.

핵심 결정:
- `kebab schema --json` 단일 명령 (정적 + 동적 통합)
- error.v1 emission 은 `--json` 모드에서만 — 비 `--json` 은 기존 stderr text 유지
- exit code 0/1/2/3 unchanged, error.v1.code 가 fine-grained 분기
- 7 code initial set (config_invalid / not_indexed / model_unreachable /
  model_not_pulled / timeout / io_error / generic) + future-additive 정책

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 10:58:24 +09:00
40ca4bf27e spec(p9-fb-25): config workspace.include 제거 + 지원 형식 가시성
도그푸딩 피드백 2026-05-05:
1. config 의 include + exclude 동시 존재가 case 4 (둘 다 매치 안 함)
   에서 의미 모호.
2. 어차피 처리 가능 형식 (md / png / jpg / pdf) 이 정해져 있으니
   사용자에게 명시 필요.

설계 핵심:

- `WorkspaceCfg.include: Vec<String>` 제거 (denylist-only). 옛 config
  의 `include = [...]` 은 silently 무시 + Config::load 가 단발
  deprecation warning emit.
- `IngestItem.warnings` 에 skip 사유 채움 (`unsupported media type:
  .docx` / `kb:// URI not yet supported`).
- `IngestReport.skipped_by_extension: BTreeMap<String, u32>` 신규
  (additive wire — release 트리거 안 됨 per CLAUDE.md). key =
  lowercase ext (`docx`, `txt`), no-ext = `<no-ext>` sentinel.
- CLI / TUI summary 에 breakdown 표시 (`90 skipped: 80 docx, 5 txt,
  5 epub`) — 모두, desc 정렬.
- README + `kebab init` config.toml 주석에 지원 형식 명시.

Spec status `planned`. 다음 단계: writing-plans skill 로 implementation
plan 작성.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 11:21:42 +00:00
ddec3d8a94 spec(p9-fb-23): incremental ingest — skip unchanged docs
도그푸딩 피드백: 변경/신규 doc 만 ingest, 변하지 않은 문서는 skip.

설계 핵심:

- Skip 조건 4 개 (full version cascade): blake3 checksum + parser_version
  + chunker_version + embedding_version 모두 일치 시 parse/chunk/embed/
  vector upsert 회피. 비용 dominator (fastembed) 가 변경된 / 새 doc 에만.
- SQLite V006 migration — `documents` 에 `last_chunker_version` +
  `last_embedding_version` column 추가. 기존 row NULL → 첫 ingest 강제
  재처리 (안전 default).
- `IngestItemKind::Unchanged` enum variant 신규 (기존 `Skipped` 와
  의미 분리 — `Skipped` 는 media-type 필터, `Unchanged` 는 모든 versions
  match).
- `IngestReport` + `AggregateCounts` 에 `unchanged: u32` 필드 추가.
  wire schema additive — v1 호환 유지.
- `--force-reingest` flag — skip 무시하고 강제 재처리.
- TUI status_line final 에 `unchanged=N` 노출 (p9-fb-24 status bar
  dynamic slot 자동 cascade).

Spec status `planned`. 다음 단계: writing-plans skill 로 implementation
plan 작성.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 17:27:06 +00:00
0c76677131 spec(p9-fb-24): TUI status/key bar + Library header + page scroll
도그푸딩 피드백 3 건 (Library 컬럼 헤더 부재, PgUp/PgDn 페이지 스크롤,
모든 모드에서 항상 떠 있는 상태바 + 키 안내바 + 버전 정보) 을 단일
spec 으로 묶음.

설계 핵심:

- bottom 영역을 2 row 로 분할: 윗줄 = 상태바 (`kebab v0.1.0 │ pane │
  doc_count │ 동적 상태`), 아랫줄 = 기존 footer_hints 그대로 이전.
- ingest progress 의 dedicated row 를 status bar 의 동적 영역으로 흡수
  (시각적 source 단일화).
- Library `List` 위에 `format_doc_header` 헤더 row 추가 (TITLE / TAGS
  / UPDATED / CHUNKS, display-width 정렬, Role::Heading).
- Ask + Inspect 양쪽에 PgUp/PgDn (fixed step 10). Ask 는 j/k 와 동일
  하게 follow_tail = false 로 freeze.

p9-fb-13 (footer 단행 row) + p9-fb-03 (ingest dedicated row) frozen
spec 들과 layout 충돌. frozen 텍스트는 그대로 두고 본 spec + 머지 후
HOTFIXES `2026-05-04 — p9-fb-24` 항목이 live source of truth.

Spec status `planned`. 다음 단계: writing-plans skill 로 implementation
plan 작성.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 16:11:17 +00:00
th-kim0823
bfdb122a80 docs(spec): component documentation design (2026-05-04)
Per-component README pages under docs/components/<group>/, grouped by
responsibility (12 groups). Each page carries 구조 + flow mermaid +
key-decision rationale consolidated from HOTFIXES + spec. Index page
hosts group-wiring diagram; ARCHITECTURE crate graph migrates from
ASCII to mermaid.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 14:47:28 +09:00
c97e8e00ef feat(kebab-core + kebab-store-sqlite): p9-fb-17 chat session storage (V005)
도그푸딩 item 13/14 (multi-turn 영속화) — TUI Ask 의 "이전 대화
이어가기" + 향후 CLI `--session foo` (p9-fb-18) backing store. session
header + per-turn 두 테이블, ON DELETE CASCADE 로 reset --data-only 가
한꺼번에 wipe.

## 핵심 변경

- **SQLite V005 migration** `chat_sessions` (session_id PK + created_at
  + updated_at + title + config_snapshot_json) + `chat_turns` (turn_id
  PK + session_id FK ON DELETE CASCADE + turn_index + question +
  answer + citations_json + created_at + UNIQUE(session_id, turn_index))
  + `idx_chat_turns_session(session_id, turn_index)`. 모두 `STRICT`.
- **`kebab_core::ChatSessionRepo`** trait (6 method): create_session /
  get_session / list_sessions(limit, ORDER BY updated_at DESC) /
  delete_session / append_turn / list_turns(ORDER BY turn_index ASC)
- **`kebab_core::{ChatSessionRow, ChatTurnRow}`** structs — Serialize
  + Deserialize 둘 다 (CLI / wire 출력 호환)
- **`kebab-store-sqlite::SqliteStore`** impl 신규 모듈 `chat_sessions.rs`.
  `append_turn` 이 insert + parent updated_at bump 같은 connection
  에서 처리.
- **frozen design §5** 에 §5.7a chat_sessions / chat_turns 절 신설
  (full schema + trait 메서드 6 개 명시).

## HOTFIXES (V004 → V005)

spec p9-fb-17 의 `V004__chat_sessions.sql` 가 p9-fb-19 의
`V004__kv.sql` (이미 머지) 와 refinery migration number 충돌. 무중단
정정: `V005__chat_sessions.sql` 로 시프트. schema / 동작 동일, 파일명
만 이동. HOTFIXES entry 추가.

## 테스트

- 9 신규 integration unit (create/get roundtrip, missing→None, PK
  collision error, append+list ordered, dup turn_index error,
  append bumps updated_at, delete CASCADE turns, list_sessions
  ORDER BY updated_at DESC, list_sessions LIMIT)
- workspace 전체 `cargo test --workspace --no-fail-fast -j 1` exit 0
- `cargo clippy --workspace --all-targets -- -D warnings` clean

## 문서

- frozen design §5.7a 신설
- HANDOFF: 2026-05-03 entry
- HOTFIXES: V004 → V005 rename rationale
- spec status planned → in_progress

## Out of scope

- session 검색 / 필터 UI (p9-fb-18 의 `kebab ask --session list`
  같은 admin command 가 후속)
- 다른 store backend (postgres 등) — trait 만 정의, impl 은 SQLite

unblocks p9-fb-18 (CLI session/repl).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 05:37:53 +00:00
0e408fb1b5 feat(kebab-app + kebab-store-sqlite): p9-fb-19 search LRU cache + corpus_revision
도그푸딩 item 15 — TUI / 같은 process 안에서 동일 query 반복 시 SQLite
FTS + Lance + RRF 재계산이 매번 발생하던 비용 해소. in-process LRU
캐시 + 모노토닉 corpus_revision 카운터로 ingest commit 발생 시 모든
entry 자동 stale.

## 핵심 변경

- **SQLite V004 migration**: `kv (key TEXT PRIMARY KEY, value TEXT)
  STRICT` + `corpus_revision = '0'` seed. 미래의 다른 scalar 도 같은
  테이블에 들어갈 수 있는 generic shape.
- **`SqliteStore::corpus_revision()` / `bump_corpus_revision()`** —
  `UPDATE ... CAST AS INTEGER + 1` atomic. INSERT-OR-IGNORE 도 함께
  실행 (V004 seed 가 무슨 이유로 누락된 케이스 paranoid).
- **`kebab-app::ingest_with_config_cancellable`** — `new + updated > 0`
  시 bump, no-op (skipped-only) reingest 는 cache 보존.
- **`App.search_cache: Option<Mutex<LruCache<SearchCacheKey, Vec<
  SearchHit>>>>`** — `config.search.cache_capacity` (default 256, 0
  비활성). `lru = "0.12"` workspace dep 추가.
- **`SearchCacheKey`** = `query_norm` (NFKC + trim + lowercase) +
  `mode` + `k` + `snippet_chars` + `embedding_version` (vector/hybrid
  만, lexical 은 빈 문자열) + `chunker_version` + `corpus_revision`
  snapshot.
- **`App::search`** rewrite — cache 활성 시 lookup → miss 면 기존
  `search_uncached` 호출 후 put. cache 비활성이거나 lock 실패면
  straight-line.
- **`App::search_uncached`** (rename of pre-fb-19 `search` body) +
  `search_uncached_with_config` facade — CLI `kebab search --no-cache`
  로 진입.
- **`Config.search.cache_capacity: usize`** field, `#[serde(default)]`
  로 기존 config 호환.
- **CLI `--no-cache`** flag — 디버깅용 (CLI 는 매 호출이 새 process
  라 사실상 no-op 이지만 spec 명시 + 향후 long-lived process 호환).
- **frozen design §9 versioning** 표에 `corpus_revision` row 추가
  (기존 `index_version` 라벨과 다른 차원: 라벨은 retrieval 형상,
  corpus_revision 은 ingest commit ack).

## 테스트

- `kebab-store-sqlite` 신규 3 unit (fresh=0, monotonic bump, persist
  across reopen)
- `kebab-app` 신규 4 integration (cached repeat 같은 hits, NFKC 정규화
  로 case/whitespace collapse, --no-cache parity, first ingest bumps
  corpus_revision)
- 워크스페이스 전체 `cargo test --workspace --no-fail-fast -j 1` exit 0
- `cargo clippy --workspace --all-targets -- -D warnings` clean

## 문서

- README `kebab search` 행: 캐시 동작 + `--no-cache` 안내 + corpus_
  revision 무효화 메커니즘
- docs/SMOKE.md `[search]` 절에 `cache_capacity` 라인 추가
- HANDOFF: 2026-05-03 entry
- spec status planned → in_progress

## Out of scope

- patch-and-merge incremental (RRF 정규화 전체 hit set 기준이라 어려움)
- SQLite 영속 cache (P+)
- 다른 process 간 cache 공유 (in-process 만 — corpus_revision 이
  cross-process 무효화는 O(1))

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 05:01:31 +00:00
6480b463bc review(회차1): RefusalReason::LlmStreamAborted variant 추가 + 빈 줄 정리
회차 1 actionable 2건 반영.

- §3.8 RefusalReason enum 에 LlmStreamAborted variant 추가 + doc
  comment (RAG retrieval 정상, model generation 단계에서만 중단).
  spec PR self-contained 원칙 — impl PR 이 spec 변경 없이 진행
  가능.
- Multi-turn behaviour 절 끝 빈 줄 2 → 1 + RefusalReason 정의
  cross-link 한 줄 추가.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 22:11:52 +00:00
ecb85651ea spec(p9-fb-15): RAG multi-turn 정책 + answer.v1 conversation_id/turn_index
도그푸딩 후 추가된 ask multi-turn (꼬리 물기) surface 를 frozen design
+ wire schema 에 명시. p9-fb-15 (RAG core) + p9-fb-16 (TUI UI) +
p9-fb-17 (V004 chat sessions) + p9-fb-18 (CLI session/repl) 의 spec
PR — impl PR 들이 이어진다.

변경:
- §2.3 Answer wire schema: conversation_id (String?) + turn_index
  (u32?) 두 optional 필드. 기존 single-shot 소비자 (외부 wrapper)
  영향 없음 — 두 필드 모두 optional.
- §3.8 RAG types:
  - Answer struct 에 conversation_id / turn_index field 추가.
  - Turn struct 신설 (history 가 prompt 에 들어갈 때 한 turn).
- §3.8 \"Multi-turn behaviour\" 신설 절:
  - kebab-rag::ask vs ask_with_history 두 entry.
  - prompt 빌드 priority: system+question (필수) → retrieved chunks
    (k 줄여 fit) → history (newest 우선, oldest drop).
  - retrieval query expansion (직전 answer 첫 200자 concat).
  - Aborted vs Completed semantics — ask 는 single-shot 이라 cancel
    시 partial token + grounded=false + LlmStreamAborted refusal
    (variant 추가는 p9-fb-15 impl 가 함께).
- docs/wire-schema/v1/answer.schema.json: 두 필드 추가 +
  created_at 에 format: date-time (sibling ingest_progress.v1 와
  일관).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 22:09:34 +00:00
9aa7459e87 review(회차1): nit 3건 반영
- §10 long-running 절 끝 빈 줄 3 → 1 (다른 절 사이 일관)
- wire schema + §2.4a 예제 JSON: kind_result → result (top-level
  kind 와의 모호성 제거; ingest_report.v1.items[].kind 와 짝)
- wire schema 의 ts 필드: format: \"date-time\" 추가 (RFC 3339
  자동 검증, wrapper 가 다른 format emit 시 즉시 잡힘)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 19:18:32 +00:00
5ef8598e5c spec(p9-fb-01..03): ingest progress events + cancellation in §2.4a / §10
도그푸딩 후 추가된 long-running 작업 진행 표시 + cancel 정책을 frozen
design 에 명시. p9-fb-01/02/03 (ingest progress callback / CLI display
/ TUI background) 의 spec PR — impl PR 들이 이어진다.

변경:
- docs/wire-schema/v1/ingest_progress.schema.json (신규):
  line-delimited streaming event schema. discriminated by `kind`
  (scan_started → scan_completed → asset_started → asset_finished* →
  embed_batch_* → completed | aborted). 마지막 줄은 기존
  ingest_report.v1 그대로 (외부 wrapper backward-compat).
- 2026-04-27-kebab-final-form-design.md §2.4a (신규):
  IngestProgressEvent 절. 이벤트 ordering / aborted 의 idempotency /
  CLI 의 stderr vs stdout 분리 / TUI · desktop 의 in-memory 소비.
- 2026-04-27-kebab-final-form-design.md §10:
  long-running 작업 (ingest, future eval run, RAG streaming, embed
  batch) 의 두 invariant — progress 의 단일 source / cooperative
  cancel + step boundary. trait (§7.2) 시그니처는 무영향 — facade
  hidden parameter 로 추가.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 19:14:37 +00:00
f9714aa5cb docs(rename): kb → kebab — README, tasks/, docs/, design doc, report
마지막 commit. 모든 .md 안의 `kb` 단어 일괄 갱신.

- 19 개 crate 이름 (`kb-core`, `kb-app`, …) → `kebab-*` (Rust 모듈
  path 표기 `kb_*` → `kebab_*` 포함).
- 미래 component (`kb-tui`, `kb-desktop`, `kb-asr-whisper`, `kb-ocr`,
  `kb-mcp`, `kb-vlm`, `kb-rerank`, `kb-vision-ocr`, `kb-index`,
  `kb-smoke`, `kb-architecture`) → `kebab-*` (P6+ 가 시작될 때
  같은 prefix 사용).
- CLI 명령 예제: `kb ingest` / `kb search` / `kb ask` / `kb init` /
  `kb doctor` / `kb inspect` / `kb list` / `kb eval` →
  `kebab <verb>`. fenced code block + 인라인 backtick 모두.
- XDG paths + env vars + binary 경로 (`target/release/kb` →
  `target/release/kebab`) 동기화.
- design doc / 최초 보고서 / SMOKE / HOTFIXES / phase epic / task
  spec 모든 reference 통일.
- task-decomposition.md 의 `git -c user.name=kb` 는 과거 git history
  기록용 author 정보라 그대로 유지 (실제 git history 의 author 는
  변경 불가).
- `tasks/phase-5-evaluation.md` 의 `status: planned` →
  `completed` 도 같이 (P5-1 + P5-2 PR 머지 후 미반영분).

## 검증

- `grep -rEn "\bkb-[a-z]|\bkb_[a-z]|\.config/kb\b|kb\.sqlite|\bKB_[A-Z]"
   --include="*.md"` 0 hits (task-decomposition.md 의 git author
  제외).
- 모든 file path reference 살아있음 (renamed file 들 모두 새 path
  로 update).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 04:01:55 +00:00
f1a448d6dc refactor(rename): kb → kebab — binary, env vars, XDG paths, file renames
두 번째 commit. 사용자 facing surface (CLI binary, env vars, XDG paths)
+ 코드 안 single-letter token (`KB_`, `kb.sqlite`, `/kb/`, tracing
target) 일괄 rename. 그리고 3 개 file rename:

- 디자인 doc `2026-04-27-kb-final-form-design.md` →
  `2026-04-27-kebab-final-form-design.md`
- 최초 보고서 `kb_local_rust_report.md` → `kebab_local_rust_report.md`
- workspace ignore `.kbignore` → `.kebabignore`

## 변경

- `crates/kebab-cli/Cargo.toml`: `[[bin]] name = "kb"` → `"kebab"`.
- `crates/kebab-cli/src/main.rs`: `#[command(name = "kb", …)]` →
  `name = "kebab"`.
- 모든 `KB_*` env var (코드 + doc + 테스트) → `KEBAB_*`. apply_env
  prefix 매칭 + 30+ 개 setting 키 모두.
- XDG paths: `~/.config/kb` / `~/.local/share/kb` / `~/.cache/kb` /
  `~/.local/state/kb` → `~/.config/kebab` 등. config defaults +
  expand_path tests + paths.rs 의 hardcode 모두.
- SQLite filename: `kb.sqlite` → `kebab.sqlite` (`SQLITE_FILE` const
  + 테스트 hardcode 모두).
- tracing target: `target: "kb-*"` → `"kebab-*"` (10+ 곳).
- snapshot fixture: `.kbignore` → `.kebabignore` (`fixtures/source-fs/
  tree-1.snapshot.json` 갱신).

## 검증

- `cargo test --workspace -j 1` clean (linker OOM 회피 위해 직렬).
- `cargo clippy --workspace --all-targets -- -D warnings` clean.

다음 commit 에서 docs sweep.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 04:01:35 +00:00
kb
9fa38543a8 refactor(spec): introduce kb-parse-types thin crate
PR #1 review left a design-debt note: ParsedBlock landing in kb-core would
(a) force every crate to recompile on parser-internal changes, and
(b) cause namespace pollution when P6/P7/P8 parsers add their own variants.

Resolution: a new thin crate kb-parse-types sits between kb-core and parsers.
Owns ParsedBlock + ParsedPayload + Warning + forward-refs for image/pdf/audio
parser intermediates. Depends on kb-core only (for SourceSpan / Inline).

Updates:
- design §3.7b: add new section defining kb-parse-types
- design §8: add kb-parse-types to module-boundary diagram + forbidden list
- design §3.4 Inline stays in kb-core; kb-parse-types references it (no duplication)
- p0-1 skeleton: workspace + Cargo deps + public surface block
- p1-3 parse-md-blocks: outputs Vec<kb_parse_types::ParsedBlock> directly
- p1-4 normalize: Allowed gains kb-parse-types, drops cross-coupling note
- INDEX + phase-0 epic: list kb-parse-types in P0 deliverables
2026-04-27 20:41:35 +00:00