Task 13: add wire regression tests proving markdown SearchHit omits
repo/code_lang when None, and all 5 original Citation variants serialize
byte-identically without spurious Code-variant keys.
Task 15: add --repo (repeatable) and --code-lang (repeatable,
comma-separated) flags to `kebab search`; propagate both into
SearchFilters instead of the previous vec![] stub. Add
#[allow(clippy::large_enum_variant)] — Cmd is short-lived, boxing buys
nothing.
Task 16: add code_lang_breakdown and repo_breakdown BTreeMap fields to
Stats (schema.v1); derive Default on Stats; populate both as empty in
collect_stats (1A-2 fills them when code chunks land). Add unit test
asserting both keys are present in the serialized object.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds five new u32 counters (skipped_gitignore, skipped_kebabignore,
skipped_builtin_blacklist, skipped_generated, skipped_size_exceeded)
and a SkipExamples struct (≤5 sample paths per category) to
IngestReport. All new fields are #[serde(default)] for backward-compat
deserialization. Downstream literal construction sites patched with
zeros/empty; snapshot re-baked.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Wire two new optional fields onto SearchHit (skip_serializing_if = None)
and two Vec<String> filter fields onto SearchFilters (serde default).
Add RetrievalDetail::Default impl (manual, uses SearchMode::Hybrid as
sentinel). Patch all downstream SearchHit / SearchFilters literal
constructors with repo: None / code_lang: None / vec![] as appropriate.
Also covers Citation::Code arm in kebab-eval metrics match.
Also fixes App::search_with_opts trace branch to use NoopRetriever
for SearchMode::Lexical, removing the embeddings requirement when
the user only wants lexical-mode trace.
Adds the `trace: Option<SearchTrace>` field to `SearchResponse` and
threads `SearchOpts.trace` through `App::search_with_opts`. When the
caller sets `opts.trace = true` the path bypasses the LRU search cache
and runs through `HybridRetriever::search_with_trace`, which dispatches
all 3 SearchModes internally; this means `--trace` requires embeddings
(same constraint as `--mode hybrid`). The non-trace path keeps its
exact prior behavior with `trace: None` stamped on the response.
Picked up Task 1 / Task 3 follow-ups in the same commit so the
workspace compiles: SearchOpts struct-literals in kebab-cli/main.rs +
kebab-mcp/tools/search.rs default the new `trace` field to false, and
the schema-wrapper test in kebab-cli/wire.rs fills the new
media_breakdown / lang_breakdown / index_bytes / stale_doc_count fields
on Stats with `Default::default()`.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- ingested_after: convert OffsetDateTime to UTC before formatting
so non-Z offsets compare correctly against UTC TEXT storage
(lexical.rs + filters.rs)
- README: --tag is repeatable-only, not csv (only --media is csv)
- test(cli): add multi-value --tag OR-within IN-list coverage
- test(store): add UTC-offset regression test for ingested_after
- mcp: use ERROR_V1_ID const instead of hardcoded "error.v1"
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
7 new flags: --tag (repeatable), --lang, --path-glob,
--trust-min (value_enum), --media (csv with `md` alias),
--ingested-after (RFC3339; config_invalid on parse fail),
--doc-id. Dispatch translates clap values into SearchFilters
and propagates structured errors through the existing
StructuredError wrapper from fb-34.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
cli_mcp_initialize_then_tools_list asserts the exact tools[]
count returned by tools/list. fb-35 added kebab__fetch as the
7th tool — bump the assertion accordingly.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
JSON output is fetch_result.v1; plain output is human-friendly
labeled sections (chunk: before / target / after; doc/span: full
text + stderr truncated hint).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- error_wire: StructuredError wrapper preserves ErrorV1 through
anyhow → classify pipeline. Adds downcast short-circuit so
cursor::decode's typed code = "stale_cursor" reaches the wire
instead of being string-formatted to code = "generic".
- app: search_with_opts now wraps cursor::decode error in
StructuredError instead of anyhow! string format.
- test: error_wire pins both negative (bare anyhow → not
stale_cursor) AND positive (StructuredError → stale_cursor)
invariants. CLI integration test runs end-to-end and asserts
error.v1.code on stderr.
- app: next_cursor only emitted on full-page (k-pop) path; drop
speculative emit on snippet-only truncation that would point at
a different page than the agent expected.
- cursor: differentiate malformed-base64 / malformed-payload /
revision-mismatch error messages; all keep code = stale_cursor.
- test: cursor_rejected fixture uses .expect() to fail loud on
cursor non-emission instead of silent skip.
- test: max_tokens=0 → 1-hit floor + truncated=true.
- docs: SKILL.md + schema description distinguish snippet-shrink
(widen) vs k-pop (paginate) truncated cases. HOTFIXES notes
--no-cache semantic shift (cached path + clear vs uncached path).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
JSON output wrapped in search_response.v1 (breaking — agent must
adapt). Plain output unchanged + [truncated; use --cursor X]
stderr hint when budget tripped.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- pipeline: refresh module docstring step 5 to reflect new cancel
semantics (RetrievalDone/Token/Final + LlmStreamAborted)
- wire schema: spell out refusal-path behavior in answer_event.v1
description (only retrieval_done emitted; no final)
- test: factual comment on relax_score_gate-using test corrected
- test: new Ollama-gated stream_score_gate_refusal_emits_only_retrieval_done
- test: new ask_emits_no_final_when_cancelled_mid_stream pinning
the no-Final invariant on cancel
- pipeline: large_enum_variant comment broadened to acknowledge
RetrievalDone.hits as the dominant per-emit cost
- HOTFIXES: log AskOpts.stream_sink internal API break per spec
contract policy
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three Ollama-gated integration tests covering:
- stderr lines parse as answer_event.v1 (retrieval_done first,
final last, all carry RFC3339 ts).
- stdout final line is answer.v1 (backwards compat).
- non-stream path (--json without --stream) unchanged.
- BrokenPipe stderr → child terminates cleanly via cancel
propagation through pipeline SendError.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Background-thread driver runs ask_with_config; main thread
drains the receiver, serializes each StreamEvent to ndjson on
stderr. BrokenPipe → drop receiver → pipeline SendError →
cancel + LlmStreamAborted refusal. Final stdout line is the
existing answer.v1 (ingest_progress.v1 backwards-compat
pattern).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Mirror of Task 9's search-output rendering: yellow [stale] on TTY,
plain text otherwise. JSON path inherits via serde on AnswerCitation.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Yellow when TTY, plain when not. JSON path inherits via serde
on the domain type; no CLI-side wire change needed there.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
set_draw_target switching broke cursor positioning: each hidden→stderr
restore caused indicatif to draw a fresh line instead of overwriting.
Root fix: call only set_position() in TTY AssetStarted (one draw per
file). Filename visible in non-TTY plain-line output.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
set_position() and set_message() each call update_and_draw()
independently, producing two scrollback lines per file in TTY mode.
Suppress the draw target before the two updates, restore to stderr,
then call tick() to emit exactly one frame.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
AssetStarted now advances position (idx-1) and sets message together.
AssetFinished no longer updates the bar — Completed handles final
cleanup via finish_and_clear. Result: one bar frame per file instead
of two, eliminating the scrollback duplicate-line artifact.
Add readonly/quiet fields to Cli, parse_bool_env for 1/true/yes/on support,
is_mutating guard that short-circuits with error.v1 on write-path commands,
and wire KEBAB_PROGRESS=plain through from_flags in the Ingest arm.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add `quiet: bool` to `Human` variant and expand `from_flags` to three
args (`json`, `quiet`, `plain_env`). Update `handle`/`handle_human`
accordingly; add four targeted unit tests (TDD).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
fb-31 added ingest_file + ingest_stdin MCP tools (Task 9) but the
spawn-based smoke test in cli_mcp_smoke.rs still asserted the fb-30
count of 4. Bump to 6 to match the live tools/list response.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Wires kebab_mcp::serve_stdio into kebab-cli. `--config <path>` honored
via the established Config::load pattern.
Updated serve_stdio signature to (Config, Option<PathBuf>) so the doctor
tool's path-aware behavior works correctly via KebabAppState.
Smoke test spawns the binary + sends initialize + initialized +
tools/list over stdin, asserts 4 tools returned. Confirms the MCP
server boots end-to-end via the real binary (rmcp 1.6 has no
in-memory test transport, so this is the only end-to-end assertion).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Task 8 commit f9a1548 added `schema_version: String` as required field on
ErrorV1 (so kebab-mcp's direct serialize-then-emit path produces correct
error.v1 wire). The wire.rs ErrorV1 literal in the
error_wrapper_tags_schema_version_and_emits_code test was missed —
breaks kebab-cli build. Add the field to the test fixture.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
fb-30 의 새 crate `kebab-mcp` 가 동일 classify 모듈 사용 — UI crate 끼리
import 는 facade rule 위반이므로 kebab-app 으로 promotion. fb-27 commit
c91228e 의 코드 그대로 이전 (struct + classify + classify_llm + 7 unit
test). reqwest dev-dep 도 함께 이동.
kebab-cli 는 `kebab_app::ErrorV1` / `kebab_app::classify` 로 import 경로
1줄 변경 + wire.rs 의 `&crate::error_classify::ErrorV1` 1줄 교체. 동작
무영향.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- schema.rs: extract `SCHEMA_V1_ID` const + re-export via kebab-app::lib.rs.
wire.rs::wire_schema 의 2 literal 도 import 해서 single source of truth.
- schema.rs::collect_models: parser_version 가 markdown 만 surface 함을
주석으로 명시 (PDF/image extractor 의 자체 version 은 SchemaV1.models 가
multi-medium map 으로 진화 시 surface).
- main.rs::print_schema_text: 헤더 줄 끝의 `\n` 제거 + `println!()` 추가 —
다른 section 들과 패턴 일관.
- error_classify.rs::llm_unreachable_classifies: timeout 50ms → 500ms (10x
headroom) + 접근 방식 + 한계 주석 추가.
- HOTFIXES: open_existing 의 RW flag + 주석-only enforcement 갭을
Known-limitation 에 명시.
Round 1 review summary: #104 (comment)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
cli_schema: exercises `kebab schema` (text + --json) on a fresh-but-init'd
KB. Pins schema_version, kebab_version non-empty, capabilities.json_mode
true, capabilities.mcp_server false (future placeholder).
cli_error_wire: spawns `kebab --json --config <malformed.toml> ingest`
and verifies stderr emits a single error.v1 ndjson line with
code == "config_invalid". Non-JSON mode regression-pinned to keep the
legacy `error:` prefix. Note: --config /nonexistent silently falls back
to defaults (by design); a file that exists but fails TOML parsing is
the reliable trigger for config_invalid.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Wraps the existing `Err(e)` arm with a `cli.json` branch:
- `--json`: stderr ndjson `error.v1` via wire_error_v1
- non-`--json`: legacy `error: <msg>` text path (unchanged)
exit_code() unchanged — RefusalSignal/NoHitSignal/DoctorUnhealthy
still drive 1/1/3.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Text mode: doctor-style key/value layout. JSON mode: schema.v1 wire
record. Honors `--config <path>` via the established
`kebab_app::schema_with_config(&cfg)` facade pattern (per the P3-5 /
P4-3 regression conventions).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>