kebab

Author	SHA1	Message	Date
th-kim0823	56f20b7235	plan(fb-38): score semantics implementation plan 7 tasks: kebab-core ScoreKind enum + SearchHit field, lexical Bm25 labeling, vector Cosine, hybrid Rrf + search_with_trace pass-through, cross-crate SearchHit literal cleanup, CLI integration test, docs (wire schema + README + design + SKILL + INDEX). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 17:45:57 +09:00
th-kim0823	0359bd9682	spec(fb-38): score semantics design - add optional score_kind field (rrf | bm25 | cosine) to search_hit.v1 - LexicalRetriever → Bm25, VectorRetriever → Cosine, HybridRetriever → Rrf - mode-dispatch hits from fb-37 search_with_trace preserve the underlying retriever's score_kind as-is - README + design §4 + SKILL carry the full RRF formula plus a "ranking signal, NOT confidence" note; for agent trust thresholds, use nested retrieval.{lexical,vector}_score - additive minor wire; no schema bump Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 17:40:47 +09:00
altair823	cf3acfc136	Merge pull request 'chore: bump version 0.4 → 0.5' (#130) from chore/bump-v0.5.0 into main Reviewed-on: #130 v0.5.0	2026-05-10 08:08:06 +00:00
th-kim0823	668e1174cc	chore: bump version 0.4 → 0.5 v0.5.0 batches fb-32 (stale doc indicator) + fb-33 (streaming ask) + fb-34 (output budget controls) + fb-35 (verbatim fetch) + fb-36 (search filter args) + fb-37 (trace + stats). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 17:04:51 +09:00
altair823	745a75a82b	Merge pull request 'feat(fb-37): trace + stats - search debug + KB health surface' (#129) from feat/fb-37-trace-and-stats into main Reviewed-on: #129	2026-05-10 07:59:56 +00:00
th-kim0823	6a33d08aea	fix(fb-37): address PR #129 round 1 review - doc TraceFusionInput.fusion_score semantics (single-mode vs hybrid) - comment why total_ms vs stage sum can drift (millis truncation) - TODO marker on TUI trace popup filter passthrough Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 16:26:34 +09:00
th-kim0823	a40593590b	docs(fb-37): wire schema + README + SMOKE + INDEX + SKILL	2026-05-10 14:13:47 +09:00
th-kim0823	5687cbc0e2	feat(tui): search pane t-key opens TracePopup (fb-37)	2026-05-10 13:39:11 +09:00
th-kim0823	653e432a30	feat(mcp): kebab__search trace input + output mirror (fb-37) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-10 13:32:30 +09:00
th-kim0823	f7e2072d66	test(cli): integration tests for --trace + schema breakdowns (fb-37) Also fixes App::search_with_opts trace branch to use NoopRetriever for SearchMode::Lexical, removing the embeddings requirement when the user only wants lexical-mode trace.	2026-05-10 13:21:33 +09:00
th-kim0823	72c227af23	feat(cli): kebab search --trace flag + wire trace + pretty print (fb-37) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-10 13:08:48 +09:00
th-kim0823	69037c313a	feat(app): SearchResponse.trace + opts.trace threading (fb-37) Adds the `trace: Option<SearchTrace>` field to `SearchResponse` and threads `SearchOpts.trace` through `App::search_with_opts`. When the caller sets `opts.trace = true` the path bypasses the LRU search cache and runs through `HybridRetriever::search_with_trace`, which dispatches all 3 SearchModes internally; this means `--trace` requires embeddings (same constraint as `--mode hybrid`). The non-trace path keeps its exact prior behavior with `trace: None` stamped on the response. Picked up Task 1 / Task 3 follow-ups in the same commit so the workspace compiles: SearchOpts struct-literals in kebab-cli/main.rs + kebab-mcp/tools/search.rs default the new `trace` field to false, and the schema-wrapper test in kebab-cli/wire.rs fills the new media_breakdown / lang_breakdown / index_bytes / stale_doc_count fields on Stats with `Default::default()`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 13:01:18 +09:00
th-kim0823	6a067e3ab1	feat(search): HybridRetriever::search_with_trace (fb-37)	2026-05-10 12:38:53 +09:00
th-kim0823	231d80e82d	feat(stats): media/lang/bytes/stale fields on schema.v1.stats (fb-37) Extends CountSummary with media_breakdown, lang_breakdown, stale_doc_count fields populated via stats_ext::breakdowns(). Adds count_summary_with_threshold for callers that need real stale counts. Mirrors all new fields onto the wire-bound Stats struct in kebab-app::schema with #[serde(default)] for backwards-compat. Also fixes search_budget_integration.rs for the trace field added to SearchOpts in Task 1. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-10 12:34:57 +09:00
th-kim0823	69c6e23432	feat(store): breakdowns + index_bytes helpers (fb-37) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-10 12:24:43 +09:00
th-kim0823	1e943f21dc	feat(core): SearchTrace + IndexBytes types + SearchOpts.trace (fb-37) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-10 12:17:04 +09:00
th-kim0823	fb31befef1	plan(fb-37): trace + stats implementation plan 10 tasks: kebab-core types, store breakdowns/index_bytes helpers, extended CountSummary + Stats wire mirror, HybridRetriever search_with_trace, App SearchResponse.trace threading, CLI --trace flag, integration tests, MCP SearchInput.trace, TUI TracePopup, docs (wire schema + README + SMOKE + INDEX + SKILL). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 12:14:26 +09:00
th-kim0823	5f6b2fa259	spec(fb-37): trace + stats design - search --trace boolean flag, additive optional `trace` field on search_response.v1 - HybridRetriever search_with_trace returns (hits, SearchTrace) — lex/vec/rrf_inputs + per-stage timing - cache bypass when --trace (debug intent) - schema.v1.stats extended with media_breakdown / lang_breakdown / index_bytes / stale_doc_count - TUI search pane `t` keystroke opens TracePopup - additive minor wire — no schema bump Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 12:05:31 +09:00
altair823	a0497d9c53	Merge pull request 'chore: sync Cargo.lock for kebab-mcp time dep (fb-36)' (#128) from chore/sync-cargo-lock-fb36 into main Reviewed-on: #128	2026-05-10 02:09:03 +00:00
th-kim0823	b221686133	chore: sync Cargo.lock for kebab-mcp time dep (fb-36) PR #127 added time = { workspace = true } to kebab-mcp/Cargo.toml but Cargo.lock entry was not regenerated before merge. cargo build on main locally regenerates the +time line under kebab-mcp. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 11:04:17 +09:00
altair823	a72c6f307c	Merge pull request 'feat(fb-36): search filter args (--media / --ingested-after / --doc-id + 4 existing)' (#127) from feat/fb-36-search-filters into main Reviewed-on: #127	2026-05-10 02:02:24 +00:00
th-kim0823	84287d0ef6	fix(fb-36): address PR #127 round 1 review - ingested_after: convert OffsetDateTime to UTC before formatting so non-Z offsets compare correctly against UTC TEXT storage (lexical.rs + filters.rs) - README: --tag is repeatable-only, not csv (only --media is csv) - test(cli): add multi-value --tag OR-within IN-list coverage - test(store): add UTC-offset regression test for ingested_after - mcp: use ERROR_V1_ID const instead of hardcoded "error.v1" Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 04:47:55 +09:00
th-kim0823	6e7446861b	docs(fb-36): README + SMOKE + INDEX + skill notes Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 04:26:27 +09:00
th-kim0823	b06f4654e7	feat(mcp): kebab__search filter inputs (fb-36) 7 new optional inputs on SearchInput: tags, lang, path_glob, trust_min, media, ingested_after, doc_id. Validation surfaces as error.v1 code = invalid_input via StructuredError. Dispatch builds SearchFilters from the inputs and forwards through the existing search_with_opts_with_config facade. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 04:11:27 +09:00
th-kim0823	4e0379c04f	test(cli): wire_search_filters — lexical-only integration tests (fb-36) Cover: --doc-id scoping, --ingested-after validation error, --media md alias, --tag repeatable + frontmatter parsing. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 04:06:21 +09:00
th-kim0823	6a18847892	feat(cli): kebab search filter flags (fb-36) 7 new flags: --tag (repeatable), --lang, --path-glob, --trust-min (value_enum), --media (csv with `md` alias), --ingested-after (RFC3339; config_invalid on parse fail), --doc-id. Dispatch translates clap values into SearchFilters and propagates structured errors through the existing StructuredError wrapper from fb-34. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 03:57:55 +09:00
th-kim0823	c6cc1e2bfe	feat(search/vector): media / ingested_after / doc_id filters (fb-36) filter_chunks helper in kebab-store-sqlite extended with the same 3 WHERE clauses as lexical. Vector still over-fetches k*2 then post-filters via SqliteStore::filter_chunks; small k can return < k hits when filters drop a lot — agent is expected to widen k or paginate. AND combinator with existing filters. - kebab-store-sqlite/src/filters.rs: media IN-list subquery, ingested_after lexicographic >= compare, doc_id equality; mirrors lexical SQL arms - 3 direct unit tests (filter_chunks_media_type/ingested_after/doc_id) that run without AVX/Lance - common/mod.rs: insert_doc / insert_doc_with_media / run_vector_search helpers on HybridEnv for integration-test use - hybrid.rs: 2 new #[ignore = "requires AVX..."] integration tests (vector_filter_by_media, vector_filter_by_doc_id) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 03:50:56 +09:00
th-kim0823	86475e5ba2	fix(search/lexical): use std::iter::repeat_n (clippy) Per code review on `2c80e2a`. manual-repeat-n lint triggers for Rust 1.94+ when repeat().take() can be expressed as repeat_n directly. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 03:43:51 +09:00
th-kim0823	2c80e2ad91	feat(search/lexical): media / ingested_after / doc_id filters (fb-36) SQL WHERE clause extension. media uses CASE WHEN json_type='text' to handle both unit (`"markdown"`) and tuple (`{"image":"png"}`) MediaType serde shapes. ingested_after relies on RFC3339 lexicographic ordering with UTC Z (per fb-32 ingest invariant). doc_id is a simple equality. AND combinator with existing tags / lang / trust filters. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 03:41:02 +09:00
th-kim0823	d3f38c76e9	feat(core): SearchFilters gains media / ingested_after / doc_id (fb-36) 3 additive optional fields. #[serde(default)] preserves backwards compat for older JSON without the new keys. MEDIA_KINDS const exposes canonical "markdown"/"pdf"/"image"/ "audio"/"other" labels for downstream alias normalization. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 03:36:45 +09:00
th-kim0823	31c1e05951	plan(fb-36): search filter args implementation plan 9 tasks: SearchFilters extension, lexical SQL WHERE, vector filter_chunks mirror, CLI 7 flags, integration tests, MCP SearchInput extension, workspace test/clippy, docs, smoke+PR. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 03:34:39 +09:00
th-kim0823	7210386699	spec(fb-36): search filter args - design Expose 7 flags on `kebab search` (4 existing + 3 new): - --tag (repeatable) / --lang / --path-glob / --trust-min (existing SearchFilters) - --media (csv) / --ingested-after (RFC3339) / --doc-id (new) filter layer = SQLite WHERE (lexical) + over-fetch+post-filter (vector). Combined with AND. wire schema unchanged (input only). `SearchFilters` gains 3 additive fields (backwards-compat via #[serde(default)]). MCP SearchInput gains 7 optional fields. invalid RFC3339 → error.v1.code = config_invalid. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 03:26:40 +09:00
altair823	a7115be699	Merge pull request 'feat(fb-35): verbatim fetch (chunk / doc / span)' (#126) from feat/fb-35-verbatim-fetch into main Reviewed-on: #126	2026-05-09 16:09:48 +00:00
th-kim0823	b86b763dfb	fix(fb-35): address PR #126 round 2 review - wire schema: relax effective_end.minimum 1 → 0 + expand description to cover line-clamp + out-of-range sentinel (panic-fix R1 emits Some(0) when line_start=1 and range is beyond doc end — schema must accept it) - tests: tighten first-chunk-target boundary test to assert ≤ 2 total neighbors (3-chunk doc, N=2). Strict "first chunk → context_before empty" not assertable until chunks.ordinal column lands (R1 #9 architectural caveat) - store: trim contradiction in list_chunk_ids_for_doc warning comment — drop "good enough for sequentially chunked markdown" phrase that conflicts with "hash sort dominates" paragraph above Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 00:55:29 +09:00
th-kim0823	7dddc1d706	fix(fb-35): address PR #126 round 1 review - fetch_span: panic-fix on line_start > total / empty doc (return empty text + effective_end = line_start - 1 instead of out-of-bounds slice) - truncated: reserved for budget-driven truncation only; line range clamp signaled via effective_end < line_end - spec / SKILL.md / README: align rejection wording to "PDF / audio" (matches code; Image OCR allowed for span) - store: warning comment on list_chunk_ids_for_doc — chunk_id hash sort does NOT preserve document position; real fix is a chunks.ordinal column, tracked as follow-up - surrounding_chunks: saturating_add to defend against u32::MAX context arg on 32-bit targets - tests: line_start > total returns empty + chunk context at doc boundary clamps lower bound Deferred nits (follow-up): table-separator strict CommonMark form; MCP per-mode strict validation; CLI chunk_id truncation in plain output. None block correctness. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 00:45:29 +09:00
th-kim0823	2a6b3dc7e6	docs(fb-35): README + SMOKE + INDEX + skill notes Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 00:21:35 +09:00
th-kim0823	8d8f1c0294	test(cli): bump expected MCP tool count 6 → 7 for fb-35 fetch cli_mcp_initialize_then_tools_list asserts the exact tools[] count returned by tools/list. fb-35 added kebab__fetch as the 7th tool — bump the assertion accordingly. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 00:20:59 +09:00
th-kim0823	77bf19566c	feat(mcp): kebab__fetch tool — chunk / doc / span (fb-35) Mirrors CLI surface: same input shape, same fetch_result.v1 output. invalid_input error for missing kind-specific fields. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 00:11:37 +09:00
th-kim0823	beb40249a3	test(cli): wire_fetch — chunk/doc + chunk_not_found integration (fb-35) 3 lexical-only integration tests: chunk JSON shape, doc truncated with --max-tokens, unknown chunk_id returns error.v1 with code = chunk_not_found. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 00:06:14 +09:00
th-kim0823	0fffd69071	feat(cli): kebab fetch chunk / doc / span (fb-35) JSON output is fetch_result.v1; plain output is human-friendly labeled sections (chunk: before / target / after; doc/span: full text + stderr truncated hint). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 00:01:56 +09:00
th-kim0823	1b9d89eb3a	feat(app): App::fetch span mode + PDF/audio rejection (fb-35) Line-based slice over fmt_canonical_to_markdown output. PDF / audio source_type → span_not_supported StructuredError. Out-of-range line_end clamps to total; effective_end reflects post-budget trim. invalid_input on zero / inverted bounds. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-09 23:54:22 +09:00
th-kim0823	7d1f855f7e	feat(app): App::fetch doc mode with budget (fb-35) Walks CanonicalDocument blocks, serializes to markdown, applies chars/4 budget when opts.max_tokens is set. doc_not_found preserved through StructuredError. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-09 23:48:40 +09:00
th-kim0823	610d29f053	feat(app): App::fetch chunk mode + markdown serializer (fb-35) Chunk mode + +-N context. doc / span modes return placeholder errors (filled by subsequent tasks). fmt_canonical_to_markdown helper introduced now since doc mode (Task 4) consumes it. Errors are typed StructuredError so classify preserves chunk_not_found / doc_not_found through the wire layer. Adds SqliteStore::list_chunk_ids_for_doc so the facade can derive +-N neighbors without leaking direct rusqlite usage into kebab-app. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-09 23:44:51 +09:00
th-kim0823	75eeae3933	feat(wire): fetch_result.v1 schema (fb-35) Discriminated by kind (chunk / doc / span). Per-kind required fields enforced by description prose at v1 stub stage. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-09 23:36:19 +09:00
th-kim0823	9653592c16	feat(core): FetchQuery / FetchOpts / FetchResult / FetchKind (fb-35) Domain types for `kebab fetch` 3 modes (chunk / doc / span). All types Serialize so wire layers hand them through serde_json directly. FetchKind is snake_case-renamed to match the wire discriminator literal in fetch_result.v1. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-09 23:35:21 +09:00
th-kim0823	353aa5cc78	plan(fb-35): verbatim fetch implementation plan 11 tasks: domain types, wire schema, App::fetch chunk/doc/span modes (3 separate tasks for incremental TDD), CLI subcommand, CLI integration tests, MCP tool, workspace+clippy gate, docs, smoke+PR. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-09 23:31:29 +09:00
th-kim0823	4eda9c317d	spec(fb-35): verbatim fetch - design New `kebab fetch chunk|doc|span` subcommand + MCP `kebab__fetch` tool. wire = `fetch_result.v1` (kind discriminator). source = normalized markdown from CanonicalDocument / chunks.text (raw bytes not exposed). chunk mode `--context N` = ordinal ±N. doc/span mode = reuses fb-34 budget (chars/4). PDF/audio span is rejected with `error.v1.code = span_not_supported`. New error codes: chunk_not_found / doc_not_found / span_not_supported / invalid_input. Reuses the fb-34 StructuredError wrapper. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-09 23:21:01 +09:00
altair823	9817a3de59	Merge pull request 'feat(fb-34): output budget controls' (#125) from feat/fb-34-output-budget-controls into main Reviewed-on: #125	2026-05-09 12:52:36 +00:00
th-kim0823	e084b306e5	fix(fb-34): align next_cursor semantics with docs (PR #125 round 2) Previous round-1 fix dropped the speculative cursor branch on the truncated path, leaving a contradiction with the docs: - snippet-only shrunk → cursor emitted (returned == k_effective) - k-popped → cursor null (returned < k_effective) But docs promised the opposite. R2 resolution: emit cursor whenever more hits may be reachable (either retriever filled the page OR budget popped hits — the popped ones remain fetchable from offset+returned). Drop the artificial "widen vs paginate" copy; truncated and next_cursor are now independent signals — caller may do either or both. Updates: app.rs::search_with_opts logic + SearchResponse doc + schema description + SKILL.md two bullets + max_tokens=0 test asserts cursor IS emitted on k-pop case. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-09 21:07:04 +09:00
th-kim0823	f485608108	fix(fb-34): address PR #125 round 1 review - error_wire: StructuredError wrapper preserves ErrorV1 through anyhow → classify pipeline. Adds downcast short-circuit so cursor::decode's typed code = "stale_cursor" reaches the wire instead of being string-formatted to code = "generic". - app: search_with_opts now wraps cursor::decode error in StructuredError instead of anyhow! string format. - test: error_wire pins both negative (bare anyhow → not stale_cursor) AND positive (StructuredError → stale_cursor) invariants. CLI integration test runs end-to-end and asserts error.v1.code on stderr. - app: next_cursor only emitted on full-page (k-pop) path; drop speculative emit on snippet-only truncation that would point at a different page than the agent expected. - cursor: differentiate malformed-base64 / malformed-payload / revision-mismatch error messages; all keep code = stale_cursor. - test: cursor_rejected fixture uses .expect() to fail loud on cursor non-emission instead of silent skip. - test: max_tokens=0 → 1-hit floor + truncated=true. - docs: SKILL.md + schema description distinguish snippet-shrink (widen) vs k-pop (paginate) truncated cases. HOTFIXES notes --no-cache semantic shift (cached path + clear vs uncached path). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-09 20:49:27 +09:00


579 Commits