Commit Graph

2 Commits

Author SHA1 Message Date
kb
b999a12ab5 tasks: address PR #1 review
- p3-3: SQLite-first/Lance-second + status marker (V003__embedding_status); drop "best-effort 2PC" misnomer
- p4-3: replace print_stream FnMut closure with mpsc::Sender<String> (RagPipeline stays Send+Sync)
- p4-3: tighten citation regex to strict [#<n>] only — reject [n]/prose/code-block false positives
- p5-2: compare_runs across chunker_version is graceful (doc + span overlap fallback) with chunker_version_match audit field; --strict-chunker-version restores refusal
- p7-1: per-page text via lopdf (pdf-extract has no per-page Rust API); use char count for spans
- p8-1: explicit rubato (FftFixedIn) for 16 kHz mono resample; symphonia decode only
- p9-5: drop cmd_read_pdf_page + pdfium native dep; cmd_read_file_bytes + frontend pdfjs; add traversal tests
2026-04-27 13:10:31 +00:00
kb
d96d9cc56c tasks: add P7 component specs (pdf-extractor, pdf-chunker) 2026-04-27 12:07:56 +00:00