spec(p9-fb-01..03): ingest progress events + cancellation in §2.4a / §10

Document the progress-display and cancel policy for long-running
tasks, added after dogfooding, in the frozen design. This is the spec
PR for p9-fb-01/02/03 (ingest progress callback / CLI display / TUI
background); impl PRs will follow.

Changes:
- docs/wire-schema/v1/ingest_progress.schema.json (new):
  line-delimited streaming event schema, discriminated by `kind`
  (scan_started → scan_completed → asset_started → asset_finished* →
  embed_batch_* → completed | aborted). The last line is still the
  existing ingest_report.v1, unchanged (backward compat for external
  wrappers).
- 2026-04-27-kebab-final-form-design.md §2.4a (new):
  IngestProgressEvent section. Covers event ordering, idempotency of
  `aborted`, the stderr vs stdout split in the CLI, and in-memory
  consumption by the TUI and desktop.
- 2026-04-27-kebab-final-form-design.md §10:
  two invariants for long-running tasks (ingest, future eval run, RAG
  streaming, embed batch): a single source of progress, and
  cooperative cancel at step boundaries. Trait (§7.2) signatures are
  unaffected; this is added as a hidden facade parameter.
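
The two §10 invariants can be illustrated with a small sketch. It is
not the project's implementation: `run_ingest`, the `emit` callback and
the use of `threading.Event` as a cancel token are hypothetical names
chosen for illustration, and the per-asset work is a placeholder. It
only shows one callback acting as the single progress source, with the
cancel flag checked between assets (the step boundary).

```python
import threading
from datetime import datetime, timezone
from typing import Callable


def run_ingest(root: str, paths: list[str],
               emit: Callable[[dict], None],
               cancel: threading.Event) -> str:
    """Hypothetical ingest loop: one progress sink, cancel checked per asset."""
    counts = {"scanned": len(paths), "new": 0}

    def event(kind: str, **fields) -> None:
        # Every event goes through this one function (single progress source).
        emit({"schema_version": "ingest_progress.v1", "kind": kind,
              "ts": datetime.now(timezone.utc).isoformat(), **fields})

    event("scan_started", root=root)
    event("scan_completed", total=len(paths))
    for idx, path in enumerate(paths, start=1):
        # Step boundary: cancellation is only observed between assets,
        # never in the middle of processing one.
        if cancel.is_set():
            event("aborted", counts=dict(counts))
            return "aborted"
        event("asset_started", idx=idx, total=len(paths), path=path)
        counts["new"] += 1  # placeholder for the real per-asset work
        event("asset_finished", idx=idx, total=len(paths), kind_result="new")
    event("completed", counts=dict(counts))
    return "completed"


# Usage: a sink that requests cancellation once the second asset finishes.
cancel = threading.Event()
seen: list[dict] = []


def sink(ev: dict) -> None:
    seen.append(ev)
    if ev["kind"] == "asset_finished" and ev["idx"] == 2:
        cancel.set()


result = run_ingest(".", ["a.md", "b.md", "c.md"], sink, cancel)
kinds = [ev["kind"] for ev in seen]
```

In this sketch the third asset is never started: the run ends with a
single `aborted` event carrying the counters accumulated so far.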

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 19:14:37 +00:00
parent 8d8544546c
commit 5ef8598e5c
2 changed files with 83 additions and 0 deletions

@@ -0,0 +1,51 @@
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$id": "https://kb.local/wire/v1/ingest_progress.schema.json",
  "title": "IngestProgressEvent v1",
  "description": "Streaming progress event emitted by `kebab ingest --json`. One event per line (line-delimited JSON). Discriminated by `kind`. The terminal events are `completed` and `aborted` — every ingest run ends with exactly one of them. The final stdout line of a `--json` ingest is still the existing `ingest_report.v1` for backwards compatibility; progress events stream above it.",
  "type": "object",
  "required": ["schema_version", "kind", "ts"],
  "properties": {
    "schema_version": { "const": "ingest_progress.v1" },
    "kind": {
      "type": "string",
      "enum": [
        "scan_started",
        "scan_completed",
        "asset_started",
        "asset_finished",
        "embed_batch_started",
        "embed_batch_finished",
        "completed",
        "aborted"
      ]
    },
    "ts": { "type": "string", "description": "RFC 3339 timestamp at the moment the event was emitted." },
    "root": { "type": "string", "description": "scan_started: workspace root being walked." },
    "total": { "type": "integer", "minimum": 0, "description": "scan_completed / asset_started / asset_finished: total assets discovered." },
    "idx": { "type": "integer", "minimum": 1, "description": "asset_started / asset_finished: 1-based index of the current asset within the scan." },
    "path": { "type": "string", "description": "asset_started: workspace-relative path of the asset being processed." },
    "media": { "type": "string", "description": "asset_started: media kind label (e.g. `markdown`, `pdf`, `image`)." },
    "kind_result": {
      "type": "string",
      "enum": ["new", "updated", "skipped", "error"],
      "description": "asset_finished: per-asset outcome (mirrors `ingest_report.v1.items[].kind`)."
    },
    "chunks": { "type": "integer", "minimum": 0, "description": "asset_finished: chunk count produced for this asset." },
    "n_chunks": { "type": "integer", "minimum": 0, "description": "embed_batch_started / embed_batch_finished: chunks in this embedding batch." },
    "ms": { "type": "integer", "minimum": 0, "description": "embed_batch_finished: wall-clock duration of the batch." },
    "counts": {
      "type": "object",
      "description": "completed / aborted: aggregate counters at the moment the run ended (mirrors fields on `ingest_report.v1`).",
      "properties": {
        "scanned": { "type": "integer", "minimum": 0 },
        "new": { "type": "integer", "minimum": 0 },
        "updated": { "type": "integer", "minimum": 0 },
        "skipped": { "type": "integer", "minimum": 0 },
        "errors": { "type": "integer", "minimum": 0 },
        "chunks_indexed": { "type": "integer", "minimum": 0 },
        "embeddings_indexed": { "type": "integer", "minimum": 0 }
      }
    }
  }
}
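
A consumer of this stream might look roughly like the sketch below. It is an illustration under stated assumptions, not code from this commit: `split_stream` and `terminal_event` are hypothetical helpers, the sample lines are made up, and the sketch assumes the trailing `ingest_report.v1` line can be told apart because its `schema_version` is not `ingest_progress.v1` (the real report shape is not shown here). It checks the three required fields and the rule that a run ends with exactly one terminal event.

```python
import json
from typing import Iterable, Optional

PROGRESS_VERSION = "ingest_progress.v1"
TERMINAL_KINDS = {"completed", "aborted"}
REQUIRED = ("schema_version", "kind", "ts")


def split_stream(lines: Iterable[str]) -> tuple[list[dict], Optional[dict]]:
    """Separate progress events from the trailing report line.

    Assumption: any JSON line whose schema_version is not
    "ingest_progress.v1" is the final ingest_report.v1 payload.
    """
    events: list[dict] = []
    report: Optional[dict] = None
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue  # tolerate blank lines between events
        obj = json.loads(raw)
        if obj.get("schema_version") == PROGRESS_VERSION:
            missing = [k for k in REQUIRED if k not in obj]
            if missing:
                raise ValueError(f"progress event missing {missing}")
            events.append(obj)
        else:
            report = obj
    return events, report


def terminal_event(events: list[dict]) -> dict:
    """Return the single terminal event; raise if the stream is malformed."""
    terminals = [e for e in events if e["kind"] in TERMINAL_KINDS]
    if len(terminals) != 1:
        raise ValueError(f"expected one terminal event, got {len(terminals)}")
    if events[-1] is not terminals[0]:
        raise ValueError("terminal event must be the last progress event")
    return terminals[0]


# Made-up sample stream; the report line is a placeholder, not the real shape.
ts = "2026-05-02T19:14:37Z"
sample = [
    json.dumps({"schema_version": PROGRESS_VERSION, "kind": "scan_started", "ts": ts, "root": "."}),
    json.dumps({"schema_version": PROGRESS_VERSION, "kind": "scan_completed", "ts": ts, "total": 1}),
    json.dumps({"schema_version": PROGRESS_VERSION, "kind": "asset_started", "ts": ts,
                "idx": 1, "total": 1, "path": "notes/a.md", "media": "markdown"}),
    json.dumps({"schema_version": PROGRESS_VERSION, "kind": "asset_finished", "ts": ts,
                "idx": 1, "total": 1, "kind_result": "new", "chunks": 3}),
    json.dumps({"schema_version": PROGRESS_VERSION, "kind": "completed", "ts": ts,
                "counts": {"scanned": 1, "new": 1}}),
    json.dumps({"schema_version": "ingest_report.v1"}),
]

events, report = split_stream(sample)
last = terminal_event(events)
```

Keeping the report line out of the event list means an existing wrapper that only reads the last stdout line keeps working, while newer consumers can render the events above it.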