- Cmd::Config { Migrate { --dry-run } }; with --json, emits config_migration.v1.
- wire_config_migration (ConfigMigrationReport carries its own schema_version).
- Register config_migration.v1 in schema.rs WIRE_SCHEMAS + add the JSON schema file.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Two new wire schemas land as additive minor: ocr_stats.v1 (corpus-wide
aggregate — total_events, success_rate, p50/p90/p99/max_ms, by_engine,
top-10 by_doc by failure count) and ocr_failures.v1 (per-doc or
corpus-wide recent failures, with --doc-id + --limit). Both ship via
new CLI subcommands `kebab inspect ocr-stats` / `inspect ocr-failures`.
App gains four facade methods: inspect_ocr_stats /
inspect_ocr_failures plus their *_with_config companions — required by
CLAUDE.md "the facade rule" so `--config <path>` is honored. The CLI
dispatch arms thread cfg explicitly into the _with_config form.
Runtime introspection emit (WIRE_SCHEMAS in schema.rs) gains two
entries; the meta JSON Schema (schema.schema.json) is untouched
because its wire.schemas is pattern-based, not enum-based.
ingest_log::percentiles extended to (p50, p90, p99, max). p99 surfaces
only via inspect ocr-stats; IngestSummary (round 1) stays 3-percentile.
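
A minimal sketch of a four-value percentile helper, for illustration only. The signature, the nearest-rank method, and the `Option` return for empty input are assumptions; the actual `ingest_log::percentiles` may differ on all three.

```rust
/// Nearest-rank percentile over an ascending-sorted, non-empty slice.
/// rank = ceil(pct / 100 * n), clamped to [1, n].
fn nearest_rank(sorted: &[u64], pct: usize) -> u64 {
    let n = sorted.len();
    let rank = (pct * n + 99) / 100; // integer ceil of pct * n / 100
    sorted[rank.clamp(1, n) - 1]
}

/// Returns (p50, p90, p99, max) in milliseconds, or None for empty input.
/// Hypothetical stand-in, not the real `ingest_log::percentiles`.
pub fn percentiles(durations_ms: &[u64]) -> Option<(u64, u64, u64, u64)> {
    if durations_ms.is_empty() {
        return None;
    }
    let mut sorted = durations_ms.to_vec();
    sorted.sort_unstable();
    let max = sorted[sorted.len() - 1];
    Some((
        nearest_rank(&sorted, 50),
        nearest_rank(&sorted, 90),
        nearest_rank(&sorted, 99),
        max,
    ))
}
```

Under this sketch a 3-percentile caller such as IngestSummary would simply ignore the p99 element of the tuple.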
SKILL.md synced with the two new schemas (AC-13).
Closure r2 G2 (facade *_with_config pair) + G3 (runtime emit, not
meta schema file) + closure r1 F4 (p99) resolved.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Before: schema.v1.models reported only a single parser_version / chunker_version value →
risk of missing the version cascade audit for a multi-medium corpus
(md + pdf + code Rust/Python + dockerfile + k8s + manifest).
After: additive minor. Add active_parsers + active_chunkers Vec<String> to the
Models struct. Backward compat: the existing single fields are preserved
(markdown default), and the new arrays are optional (#[serde(default)] + not
listed in the JSON schema's required).
Source:
- kebab_store_sqlite::fetch_distinct_parser_versions() returns
documents.parser_version via DISTINCT + ORDER BY.
- fetch_distinct_chunker_versions() follows the same pattern for chunks.chunker_version.
- collect_models recomputes on every schema call (no cache, so R-3 is resolved automatically).
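
An in-memory analogue of the DISTINCT + ORDER BY pattern described above, as a sketch only. The real functions run SQL against the store; the name `distinct_sorted_versions` and the version strings in the usage note are hypothetical.

```rust
use std::collections::BTreeSet;

/// Deduplicates version strings and returns them in ascending order,
/// mirroring what `SELECT DISTINCT ... ORDER BY` would yield.
/// Hypothetical helper, not part of kebab_store_sqlite.
pub fn distinct_sorted_versions<'a, I>(versions: I) -> Vec<String>
where
    I: IntoIterator<Item = &'a str>,
{
    // BTreeSet both deduplicates and keeps keys in sorted order.
    let unique: BTreeSet<&str> = versions.into_iter().collect();
    unique.into_iter().map(String::from).collect()
}
```

For example, feeding it `vec!["pdf-1", "md-2", "pdf-1"]` yields `["md-2", "pdf-1"]`, which is the shape active_parsers would carry.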
Wire schema is additive only, so no major bump is needed. v0.20.1 minor is sufficient.
Synced integrations/claude-code/kebab/SKILL.md.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
capabilities_snapshot() reported streaming_ask + single_file_ingest as hardcoded
false, but the actual implementations were production-grade in the v0.20 final dogfood:
- kebab ask --stream → answer_event.v1 ndjson, 191 events emitted correctly
- kebab ingest-file <path> / kebab ingest-stdin --title <T> → ingest_report.v1 emitted correctly
When agents such as the MCP host + Claude Code skill route on schema.capabilities,
this is a false negative → users wrongly conclude that a working feature is unavailable.
http_daemon stays false (not implemented; covered by a separate sub-item).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
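State explicitly that it is actually a doc count (addresses PR #161 worker review, MEDIUM)
The JSON schema description was corrected from 'code chunk count' to
'doc count' in the main PR-C change, but the rustdoc on the Rust struct field
still carried the same mislabel: Gemini round 2 looked only at the JSON schema
and missed the rustdoc. Both workers reported the same finding (MEDIUM).
No implementation change: the meaning has consistently been doc count from
the start. Only the wording is aligned.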
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Dogfooding (PR #142 1B + multi-root corpus: kebab-docs + httpx + zod + lodash)
revealed schema.v1.repo_breakdown is always {} despite the 1A-2 Task 9
having added the code_lang_breakdown sibling. The schema.rs:171 placeholder
`BTreeMap::new()` was left in place. Mirror Task 9's code_lang_breakdown
query for the repo field — same metadata_json JSON-path pattern.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add `SqliteStore::code_lang_breakdown()` that queries
`json_extract(metadata_json, '$.code_lang')`, groups by it, and
skips NULL rows — returning `BTreeMap<String, u32>`.
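
The grouping semantics can be sketched in memory as follows. This is an analogue of the SQL query, not the `SqliteStore` method itself: it takes already-extracted `code_lang` values, with `None` standing in for a NULL `json_extract` result.

```rust
use std::collections::BTreeMap;

/// Counts documents per code_lang, skipping rows where it is absent.
/// In-memory illustration only; the real method queries SQLite.
pub fn code_lang_breakdown<'a, I>(code_langs: I) -> BTreeMap<String, u32>
where
    I: IntoIterator<Item = Option<&'a str>>,
{
    let mut counts: BTreeMap<String, u32> = BTreeMap::new();
    // `flatten` drops the None entries, matching the "skip NULL rows" rule.
    for lang in code_langs.into_iter().flatten() {
        *counts.entry(lang.to_string()).or_insert(0) += 1;
    }
    counts
}
```

A `BTreeMap` keeps keys sorted, so the serialized breakdown has a stable key order across runs.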
Wire it into `collect_stats` in `kebab-app::schema`, replacing the
`BTreeMap::new()` placeholder inserted by 1A-1.
Test: `store::tests::code_lang_breakdown_counts_by_code_lang` asserts
rust=1 and that a null-code_lang doc does NOT appear in the map.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Task 13: add wire regression tests proving markdown SearchHit omits
repo/code_lang when None, and all 5 original Citation variants serialize
byte-identically without spurious Code-variant keys.
Task 15: add --repo (repeatable) and --code-lang (repeatable,
comma-separated) flags to `kebab search`; propagate both into
SearchFilters instead of the previous vec![] stub. Add
#[allow(clippy::large_enum_variant)] — Cmd is short-lived, boxing buys
nothing.
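
One way a repeatable, comma-separated flag could be flattened into a single list before it reaches SearchFilters, as a sketch. The helper name `expand_code_langs` is hypothetical, and trimming whitespace and dropping empty segments are assumptions about the desired behavior, not confirmed CLI semantics.

```rust
/// Flattens raw flag occurrences such as ["rust,python", "go"] into
/// ["rust", "python", "go"]. Hypothetical helper for illustration.
pub fn expand_code_langs(raw: &[&str]) -> Vec<String> {
    raw.iter()
        .flat_map(|value| value.split(','))
        .map(str::trim)
        .filter(|segment| !segment.is_empty()) // assumed: ignore empty segments
        .map(String::from)
        .collect()
}
```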
Task 16: add code_lang_breakdown and repo_breakdown BTreeMap fields to
Stats (schema.v1); derive Default on Stats; populate both as empty in
collect_stats (1A-2 fills them when code chunks land). Add unit test
asserting both keys are present in the serialized object.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Extends CountSummary with media_breakdown, lang_breakdown, stale_doc_count
fields populated via stats_ext::breakdowns(). Adds count_summary_with_threshold
for callers that need real stale counts. Mirrors all new fields onto the
wire-bound Stats struct in kebab-app::schema with #[serde(default)] for
backwards-compat. Also fixes search_budget_integration.rs for the trace field
added to SearchOpts in Task 1.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
JSON output wrapped in search_response.v1 (breaking — agent must
adapt). Plain output unchanged + [truncated; use --cursor X]
stderr hint when budget tripped.
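
A small sketch of how the plain-output hint might be produced. It assumes the response exposes an optional next cursor when the budget trips; the `next_cursor` parameter and the function name are hypothetical.

```rust
/// Builds the stderr hint shown in plain mode when results were truncated.
/// Returns None when there is no follow-up cursor (nothing was truncated).
pub fn truncation_hint(next_cursor: Option<&str>) -> Option<String> {
    next_cursor.map(|cursor| format!("[truncated; use --cursor {}]", cursor))
}
```

A caller would pass the result to `eprintln!` so that stdout stays identical to the previous plain output.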
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- schema.rs: extract `SCHEMA_V1_ID` const + re-export via kebab-app::lib.rs.
The 2 literals in wire.rs::wire_schema now import it too, for a single source of truth.
- schema.rs::collect_models: add a comment noting that parser_version surfaces
markdown only (the PDF/image extractors' own versions will surface once
SchemaV1.models evolves into a multi-medium map).
- main.rs::print_schema_text: remove the trailing `\n` from the header line + add
`println!()`, consistent with the pattern in the other sections.
- error_classify.rs::llm_unreachable_classifies: timeout 50ms → 500ms (10x
headroom) + add comments on the approach and its limitations.
- HOTFIXES: document the open_existing RW flag + comment-only enforcement gap
under Known-limitation.
Round 1 review summary: #104 (comment)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replace kebab-app's private `KEBAB_PARSE_MD_VERSION` literal with a
direct reference to `kebab_parse_md::PARSER_VERSION` so the parser
version cascade has a single source of truth (design §9 invariant).
Add maintenance comment on schema.rs WIRE_SCHEMAS const pointing to
docs/wire-schema/v1/ + kebab-cli wire helpers as the authoritative
sources to keep in sync.
Tighten open_existing doc comment to match the actual SQLITE_OPEN_READ_WRITE
flag (needed for WAL pragma application) — callers should still avoid
issuing mutations through this connection.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>