PR #1 review left a design-debt note: ParsedBlock landing in kb-core would (a) force every crate to recompile on parser-internal changes, and (b) cause namespace pollution when P6/P7/P8 parsers add their own variants. Resolution: a new thin crate kb-parse-types sits between kb-core and parsers. Owns ParsedBlock + ParsedPayload + Warning + forward-refs for image/pdf/audio parser intermediates. Depends on kb-core only (for SourceSpan / Inline). Updates: - design §3.7b: add new section defining kb-parse-types - design §8: add kb-parse-types to module-boundary diagram + forbidden list - design §3.4 Inline stays in kb-core; kb-parse-types references it (no duplication) - p0-1 skeleton: workspace + Cargo deps + public surface block - p1-3 parse-md-blocks: outputs Vec<kb_parse_types::ParsedBlock> directly - p1-4 normalize: Allowed gains kb-parse-types, drops cross-coupling note - INDEX + phase-0 epic: list kb-parse-types in P0 deliverables
4.4 KiB
4.4 KiB
title, source, date
| title | source | date |
|---|---|---|
| KB 작업 단위 인덱스 | kb_local_rust_report.md | 2026-04-27 |
KB 작업 단위 인덱스
kb_local_rust_report.md 의 Phase 로드맵을 아키텍처 수준 작업 단위로 분해. 각 task 문서는 독립적으로 착수/검수 가능한 단위.
의존 그래프
P0 ── P1 ── P2 ── P3 ── P4 ── P5
│
├─ P6 (image)
├─ P7 (pdf)
├─ P8 (audio)
└─ P9 (TUI/desktop)
P0P5 는 직렬. P6P9 는 P5 이후 병렬 가능.
작업 단위
| # | 코드 | 제목 | 핵심 산출 crate | 선행 |
|---|---|---|---|---|
| P0 | phase-0-skeleton.md | Workspace 뼈대 + 도메인 계약 | kb-core, kb-parse-types, kb-config, kb-app, kb-cli | – |
| P1 | phase-1-markdown-ingestion.md | Markdown ingestion 파이프라인 | kb-source-fs, kb-parse-md, kb-normalize, kb-chunk, kb-store-sqlite | P0 |
| P2 | phase-2-lexical-search.md | SQLite FTS5 lexical 검색 + citation | kb-search (lexical) | P1 |
| P3 | phase-3-vector-hybrid.md | Local embedding + LanceDB + hybrid | kb-embed, kb-embed-local, kb-store-vector, kb-search | P2 |
| P4 | phase-4-local-llm-rag.md | Local LLM + RAG + grounded answer | kb-llm, kb-llm-local, kb-rag | P3 |
| P5 | phase-5-evaluation.md | Golden query / regression eval | kb-eval | P4 |
| P6 | phase-6-image.md | 이미지 ingestion (OCR + caption) | kb-parse-image | P5 |
| P7 | phase-7-pdf.md | PDF text + page citation | kb-parse-pdf | P5 |
| P8 | phase-8-audio.md | 음성 transcription + timestamp citation | kb-parse-audio | P5 |
| P9 | phase-9-ui.md | TUI + desktop app | kb-tui, kb-desktop | P5 |
Component task decomposition (per phase)
각 phase 의 component-level 분해. AI sub-agent 1세션 = 1 task 가 sweet spot.
- P0 — p0/ — 1 component
- P1 — p1/ — 6 components
- P2 — p2/ — 2 components
- P3 — p3/ — 4 components
- P4 — p4/ — 3 components
- P5 — p5/ — 2 components
- P6 — p6/ — 3 components
- P7 — p7/ — 2 components
- P8 — p8/ — 2 components
- P9 — p9/ — 5 components
모든 task 공통 규약
- 의존성 경계 (
Allowed/Forbidden) 위반 금지. report §19 참조. - citation 없는 검색 결과 / RAG 응답 금지.
- 원본 파일 파괴 금지. 파생물만 재생성.
- 모든 record 에 version (parser/chunker/embedding/index/prompt) 기록.
- 각 phase 완료 =
cargo check --workspace && cargo test --workspace통과 + 해당 phase 의 완료 조건 CLI 데모 통과.