사용자 결정 (2026-05-02): \"README.md는 사용자가 가장 빠르게 이 앱을 사용할 수 있도록 하는 내용만 포함하자. mermaid 다이어그램으로 논리적인 아키텍처 다이어그램 하나 정도만 들어가면 충분할 것 같아\". 세 문서로 분리, audience 겹치지 않음: 1. **README.md (narrow)** — 사용자 first stop. Quick start / 명령 표 / Mermaid 1개 (논리 아키텍처) / Configuration pointer / 비-목표 / 라이선스. 진척도 / crate 그래프 / 디렉토리 트리 / 핵심 결정 표 모두 빠짐. 2. **HANDOFF.md (신규)** — phase-level 진척 dashboard. Phase status table, component count (33), \"다음 task 후보\" (P9-2/3/4/5, P8 보류), 머지 후 발견된 deviation 짧은 요약 (P3-5/P4-3 --config, P6-2 OCR, P6-3 caption, P7-2 chunk_id, P7-3 storage UNIQUE, P9-1 ratatui generic). 본문 detail 은 tasks/HOTFIXES.md. 3. **docs/ARCHITECTURE.md (신규)** — crate 의존성 그래프, 디렉토리 트리, 핵심 기술 결정 표, 외부 AI 통합 절. README 의 Mermaid 가 여기로 링크. CLAUDE.md 의 \"User-facing docs\" 절 갱신: - 세 문서 audience 분리 명시. - implementation PR 이 셋 다 sync 의무, spec PR 은 안 건드림. - 갱신 trigger 별 (CLI / TUI / Configuration / phase epic / crate 추가 / load-bearing deviation) 어느 문서를 손대는지 매핑. - Out of scope (HOTFIXES detail / version cascade / per-task spec rationale) 어디에도 안 적힘 명시. CLAUDE.md `## Project` 절도 새 문서 layout 반영. 18 crates → ~20 crates. Memory feedback 갱신 (`feedback_readme_sync_rule.md`) — 미래 conversation 에서 자동 적용. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
113 lines
9.0 KiB
Markdown
113 lines
9.0 KiB
Markdown
# CLAUDE.md
|
||
|
||
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
|
||
|
||
## Project
|
||
|
||
Single-user local-first knowledge base + RAG. Rust 2024 workspace, ~20 crates, single binary (`kebab`). All inference is local (Ollama + fastembed + whisper.cpp).
|
||
|
||
The repo's documentation is split by audience — don't duplicate across them:
|
||
|
||
- **[README.md](README.md)** — first stop for an end user. Quick start, command table, one Mermaid logical-architecture diagram, configuration pointers, license. Stays narrow.
|
||
- **[HANDOFF.md](HANDOFF.md)** — phase-level progress dashboard for someone picking the project up. Phase status table, component count, "next task candidates", short summary of post-merge deviations. The README never duplicates this.
|
||
- **[docs/ARCHITECTURE.md](docs/ARCHITECTURE.md)** — internal structure: crate dependency graph, directory tree, locked-in technical decisions. The README links here from the Mermaid diagram.
|
||
- **[docs/superpowers/specs/2026-04-27-kebab-final-form-design.md](docs/superpowers/specs/2026-04-27-kebab-final-form-design.md)** — frozen design contract.
|
||
- **[tasks/INDEX.md](tasks/INDEX.md)** — per-component task tree.
|
||
- **[tasks/HOTFIXES.md](tasks/HOTFIXES.md)** — dated post-merge deviation log; live source of truth where behavior and the frozen spec disagree.
|
||
|
||
## Build / test / lint
|
||
|
||
```bash
|
||
cargo test -p <crate> # preferred — workspace has 18 crates
|
||
cargo test -p <crate> <test_name> # single test (substring match)
|
||
cargo test --workspace --no-fail-fast -j 1 # full suite — see -j 1 below
|
||
cargo clippy --workspace --all-targets -- -D warnings # CI gate
|
||
cargo build --release # produces target/release/kebab
|
||
```
|
||
|
||
`-j 1` for the full workspace test isn't optional: 18 integration-test binaries each link `lance` + `datafusion` + `arrow` + `tantivy` and the parallel link step exhausts memory (linker gets SIGKILL'd, build silently fails partway). Per-crate runs are fine in parallel.
|
||
|
||
`target/` is 6–10 GB after a fresh build (DataFusion + Lance + fastembed + 18 × test-binary debug info). The dev/test profile is already trimmed (`debug = "line-tables-only"`, `split-debuginfo = "unpacked"` — see workspace `Cargo.toml`). Run `cargo clean` after phase merges if disk pressure shows up; backtraces still resolve to function + line.
|
||
|
||
## The facade rule
|
||
|
||
`kebab-app` is the only crate UI binaries (`kebab-cli`, future `kebab-tui`, `kebab-desktop`) may touch. Every user-facing entry has a `*_with_config(cfg, …)` companion that takes an explicit `Config`:
|
||
|
||
- `kebab-cli` calls the `*_with_config` form so `--config <path>` is honored.
|
||
- The bare `kebab_app::ingest(...)` / `search(...)` / `ask(...)` form re-loads `Config::load(None)` (XDG default) and silently bypasses any explicit path. Two regressions of exactly this shape are recorded in `tasks/HOTFIXES.md` (P3-5 + P4-3 follow-ups). When wiring a new CLI subcommand, always thread the `Config` through.
|
||
|
||
`*_with_config` is `#[doc(hidden)] pub fn` but it's the **official** config-explicit API, not a test seam.
|
||
|
||
## Spec contract
|
||
|
||
`docs/superpowers/specs/2026-04-27-kebab-final-form-design.md` (12 sections) is the single contract for the whole workspace. Every component task spec under `tasks/p<N>/` lists which `contract_sections` it implements.
|
||
|
||
- Changing the design doc requires updating every referencing task spec in the same PR.
|
||
- Task specs themselves stay **frozen** as the historical contract once the task is merged. Don't edit them retroactively to match what shipped.
|
||
- Live deviations from the original contract go in `tasks/HOTFIXES.md` as dated entries, plus a one-line cross-link in the original spec's `Risks / notes`. Treat HOTFIXES.md as the live source of truth when behavior and spec disagree.
|
||
|
||
`tasks/INDEX.md` is the dashboard for which phases / components are done; update its phase status when a phase epic completes.
|
||
|
||
## Allowed / forbidden deps
|
||
|
||
Each task spec lists `Allowed dependencies` and `Forbidden dependencies` per design §8. The most load-bearing ones:
|
||
|
||
- `kebab-core` MUST NOT depend on any other `kebab-*` crate. Domain types only.
|
||
- `kebab-eval`'s `metrics` and `compare` modules MUST NOT import retrieval / embedding / LLM crates directly. The runner is allowed to use `kebab-app`'s facade (P5-1 inheritance — see deviations in that task spec).
|
||
- UI crates (`kebab-cli`, future `kebab-tui`, `kebab-desktop`) MUST NOT import `kebab-store-*` / `kebab-llm-*` / `kebab-parse-*` directly — only `kebab-app`.
|
||
|
||
Read the relevant task spec's deps section before adding an import. New crates inherit the same boundary rules.
|
||
|
||
## Wire schema v1
|
||
|
||
All `--json` output carries a `schema_version` field (`ingest_report.v1`, `search_hit.v1`, `answer.v1`, `doctor.v1`, …). Schemas live in `docs/wire-schema/v1/`. The wire shape is the contract for external integrations (Claude Code skills, MCP, etc.); breaking it requires a `*.v2` major bump and parallel-running both for one phase.
|
||
|
||
## Versioning cascade
|
||
|
||
`parser_version` / `chunker_version` / `embedding_version` / `prompt_template_version` / `index_version` follow the cascade rule in design §9. Changing any of these invalidates downstream records (chunks, embeddings, eval runs, …). When changing a version: either ship a re-process job or treat it as a breaking schema bump. The eval runner snapshots all five into `eval_runs.config_snapshot_json`.
|
||
|
||
## Naming + paths
|
||
|
||
- Crate prefix: `kebab-` (kebab-case package, `kebab_` snake_case in Rust modules).
|
||
- Binary: `kebab`.
|
||
- Env var prefix: `KEBAB_*` (e.g. `KEBAB_RAG_SCORE_GATE`, `KEBAB_EVAL_GOLDEN`, `KEBAB_COMMIT_HASH`).
|
||
- XDG paths: `~/.config/kebab/`, `~/.local/share/kebab/`, `~/.cache/kebab/`, `~/.local/state/kebab/`.
|
||
- SQLite filename: `kebab.sqlite` (under `data_dir`).
|
||
- Workspace ignore: `.kebabignore` (per directory).
|
||
|
||
The migration from the old `kb` name lives in commits `911fb49 / f1a448d / f9714aa`. If you spot a leftover `kb` reference, treat it as a leftover and fix it (the rename PR sweep covered crates/, docs/, tasks/, README, design doc, fixtures — but workspace root `Cargo.toml` comments needed a follow-up; assume similar misses are possible).
|
||
|
||
## Smoke + integration
|
||
|
||
`docs/SMOKE.md` walks through running the full pipeline against an isolated TempDir KB via `--config /tmp/kebab-smoke/config.toml`. Use this instead of touching `~/.local/share/kebab/` when verifying a fresh clone or a CLI flag change. Most CLI regressions surface here, not in unit tests (see HOTFIXES.md).
|
||
|
||
## User-facing docs (README + HANDOFF + ARCHITECTURE)
|
||
|
||
Three sibling docs split the audience. Every implementation PR (`feat/*`) keeps them in sync; spec PRs (`spec/*`) don't touch any of the three.
|
||
|
||
**[README.md](README.md) — end user.** Stays narrow. The three surfaces a user touches:
|
||
|
||
- **CLI** — new `kebab <subcommand>`, flag, `--json` field, or exit-code change. Update the **명령** table and the **Quick start** block if the new flow needs a different invocation.
|
||
- **TUI** — new pane, key binding, or run-time behavior visible to a `kebab tui` user. Update the row in the **명령** table and the Mermaid diagram if a new external surface lands.
|
||
- **Configuration** — new `config.toml` field, `KEBAB_*` env, default change, or XDG path. Update the **Configuration** section AND the config example block in `docs/SMOKE.md`.
|
||
|
||
The Mermaid logical-architecture diagram stays the only diagram in the README. If a new media type / external service / store crosses the diagram boundary, update it; otherwise leave it alone.
|
||
|
||
The README does NOT carry: phase status, component count, post-merge deviations, crate dependency graph, directory tree, locked-in technical decisions. Those live in HANDOFF or ARCHITECTURE.
|
||
|
||
**[HANDOFF.md](HANDOFF.md) — handing off.** Phase-level progress + next-task candidates. Flip the relevant phase row from ⏳ to ✅ when a phase epic completes. Add a one-line entry under "머지 후 발견된 버그 / 결정 (요약)" when a HOTFIXES entry lands that's load-bearing for someone picking up the project. Per-component progress lives in `tasks/INDEX.md`, not here.
|
||
|
||
**[docs/ARCHITECTURE.md](docs/ARCHITECTURE.md) — implementation detail.** Crate dependency graph, directory tree, locked-in technical decisions. Update when:
|
||
|
||
- A new crate is added — extend the graph + directory tree.
|
||
- A locked-in decision flips (e.g. OCR engine default changes per a HOTFIXES entry) — update the table and link the HOTFIXES entry.
|
||
- A directory moves — update the tree.
|
||
|
||
Out of scope for all three: HOTFIXES detail (`tasks/HOTFIXES.md`), version cascade mechanics (CLAUDE.md §Versioning cascade), per-task spec rationale (`tasks/p<N>/`).
|
||
|
||
If a feature ships behind a flag that's off-by-default, mention the flag explicitly in the README so a user reading only the README knows the surface exists but is gated.
|
||
|
||
## Remote
|
||
|
||
Git remote is Gitea: `https://gitea.altair823.xyz/altair823-org/kebab.git`. PRs are created via the Gitea REST API (`POST /repos/altair823-org/kebab/pulls`) — `gh` CLI does not work against this host. Auth uses `~/.netrc` (populated via `git credential fill`).
|