|
|
a3390d5171
|
p1-6: scaffold kb-store-sqlite crate + V001 full §5 DDL
New workspace member crate `kb-store-sqlite` (allowed deps only:
kb-core, kb-config, rusqlite[bundled], refinery, serde, serde_json,
time, blake3, tracing, anyhow, thiserror; dev-deps add kb-parse-md /
kb-normalize / kb-chunk for the contract round-trip test).
Migration V001 replaces the P0-1 stub with the full §5 DDL (assets,
documents, document_tags, blocks, chunks with policy_hash,
embedding_records, jobs, ingest_runs, answers, eval_runs,
eval_query_results) plus the §5 indexes. FTS5 virtual table + triggers
remain deferred to V002 (P2-1).
Public surface per task spec:
SqliteStore::open / run_migrations / put_asset_with_bytes
impl DocumentStore for SqliteStore (7 trait methods)
impl JobRepo for SqliteStore (4 trait methods)
StoreError { Sqlx, Migration, Conflict }
Behavior:
- Pragmas at open: foreign_keys=ON, journal_mode=WAL,
synchronous=NORMAL, temp_store=MEMORY.
- Asset writer: byte_len ≤ copy_threshold_mb * 1MiB → copy to
data_dir/assets/<aa>/<asset_id> (mode 0o644 on Unix), else
reference. blake3(bytes) verified against asset.checksum; mismatch →
Conflict.
- Idempotency: put_document UPSERTs and bumps doc_version + 1 on
conflict; put_blocks / put_chunks DELETE-then-INSERT; document_tags
re-derived per put_document.
- get_document rehydrates blocks via payload_json ordered by stream
ordinal.
- list_documents builds dynamic WHERE from DocFilter (lang / trust_min
/ path_glob via GLOB / tags_any via document_tags subquery).
- JobRepo: jobs.kind/status are stored as lowercase enum tags; create
mints a 32-hex JobId via blake3(kind || payload || nanos).
Tests follow in subsequent commits.
|
2026-04-30 17:08:36 +00:00 |
|
|
|
a166b7051c
|
p0-1: wire-schema stubs, doc/spec stubs, V001 migration, fixtures
- docs/wire-schema/v1/ ships 7 schema stubs (citation, search_hit,
answer, ingest_report, doc_summary, chunk_inspection, doctor) that
pin schema_version + required fields per design §2. Full property
validation lands in later phases.
- docs/spec/ ships 7 markdown stubs each linking to the canonical
frozen design (domain-model, ids, canonical-document, chunk-policy,
citation-policy, module-boundaries, ai-generation-guidelines).
- migrations/V001__init.sql contains only schema_meta + migrations
tables per design §5.1; data tables ship in P1-6/P2-1/P3-3.
- fixtures/ has the 11 subdirectories every downstream task references
(markdown, source-fs, search/{lexical,hybrid}, embed, vector, rag,
eval, image, pdf, audio). Empty subdirs use .gitkeep so they track.
fixtures/markdown/ ships the 3 phase-0 fixtures: simple-note.md,
nested-headings.md, code-and-table.md.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
2026-04-30 05:17:32 +00:00 |
|