feat(p3-3): kb-store-vector — LanceDB VectorStore + V003 embedding status

First VectorStore implementation. Per-model Lance tables under
config.storage.vector_dir, two-phase upsert (SQLite-pending → Lance
MergeInsert → SQLite-committed) with crash-safe retry, search via
cosine distance with the spec's score-shift (preserves negative
similarity ranking signal that clamping would crush).

V003 migration:
- Adds status (CHECK constraint pending|committed|tombstone, default
  pending) and vector_committed columns to embedding_records.
- BEFORE DELETE trigger on chunks flips dependent rows to tombstone.
  Currently overshadowed by V001's ON DELETE CASCADE FK; trigger UPDATE
  runs first, then the row vanishes via CASCADE. Spec-faithful tombstone
  preservation requires recreating embedding_records to drop the
  CASCADE; deferred to a P+ migration since no production rows exist
  yet (P3-3 is the first writer). V003 SQL comment explains.

LanceVectorStore:
- ensure_table is idempotent: opens existing or creates with the Arrow
  schema (chunk_id, doc_id, embedding FixedSizeList<Float32, dim>,
  model_id, embedding_version, text, heading_path, created_at).
- IndexId computed via id_for_index with collection="chunk_embeddings",
  index_kind="flat", params_hash = blake3(descriptor JSON). Schema
  bumps automatically rotate the IndexId.
- upsert: phase-1 INSERT OR REPLACE INTO embedding_records
  (status='pending') in a single SQLite tx; phase-2 Lance MergeInsert
  keyed on chunk_id (idempotent re-run); phase-3 UPDATE
  status='committed', vector_committed=1. If phase-2 fails the rows
  stay 'pending' and the next upsert call retries idempotently.
- search joins embedding_records WHERE status='committed' so
  partial-write rows never surface. Cosine distance from Lance ∈ [0, 2]
  → similarity = 1 - distance ∈ [-1, 1] → score = (similarity + 1)/2
  ∈ [0, 1]. NaN coerced to 0 with tracing::warn. Filter by
  SearchFilters via SqliteStore::filter_chunks (added in this commit).
- Sync trait + async LanceDB bridged by an embedded current-thread
  tokio runtime. Doc-comment on the struct flags the "do NOT call from
  inside another tokio runtime" panic (block_on cannot nest). kb-app's
  job scheduler is sync today.

kb-store-sqlite additions:
- pub fn put_embedding_records_pending(&[EmbeddingRecordRow]): phase-1
  INSERT OR REPLACE (status='pending', vector_committed=0).
- pub fn mark_embedding_records_committed(&[EmbeddingId]): phase-3
  single UPDATE … WHERE embedding_id IN (?, ?, …) via params_from_iter,
  guarded by WHERE status='pending' so tombstones don't get clobbered.
- pub fn filter_chunks(&[ChunkId], &SearchFilters) → Vec<ChunkId>
  consolidates the JOIN against
  documents/document_tags/embedding_records + path_glob via globset.
  Lets kb-store-vector honor SearchFilters without depending on
  rusqlite or globset directly. (kb-search's filter logic is
  structurally different, interleaved with the FTS5 SELECT, so it
  stays as-is for now; consolidation is a P+ refactor.)
- 4 new unit tests cover the phase-1 round-trip, empty batch, replay
  reset of pending rows, and the WHERE-status-pending guard.

Tests:
- 9 lib unit tests in kb-store-vector covering paths/sanitization,
  arrow_batch dim validation + descriptor hash, bm25-style cosine score
  shift math.
- 4 new kb-store-sqlite unit tests on filter_chunks (committed-only,
  tags/lang/trust/path_glob, order preservation, empty input).
- 4 new kb-store-sqlite unit tests on the embedding_records helpers.
- 8 integration tests in upsert_search.rs and 1 snapshot test marked
  #[ignore = "requires AVX-capable hardware (LanceDB)"]. They invoke
  require_avx_or_panic() at the top of each body so a missing-AVX
  --ignored run fails loudly instead of silently passing. This dev host
  (qemu64 model) lacks AVX so these were NOT exercised end-to-end here;
  first CI lane on AVX hardware will validate them.
- Snapshot fixture tests/fixtures/vector/run-1.json is a placeholder
  with an _comment marker. Snapshot test panics until the placeholder
  is replaced via KB_UPDATE_SNAPSHOTS=1 on AVX hardware.
- Workspace 241 passed, 19 ignored, 0 failed; cargo clippy --workspace
  --all-targets -- -D warnings clean.

Allowed deps respected (kb-core, kb-config, kb-store-sqlite, lancedb,
arrow + arrow-array + arrow-schema, serde, serde_json, tracing,
thiserror) plus forced waivers: anyhow (trait return type), tokio +
futures (LanceDB async-only API), blake3 (params_hash). rusqlite and
globset are NOT direct deps of kb-store-vector; confirmed via cargo
metadata --no-deps. rusqlite stays in [dev-dependencies] for the test
fixture seeder only.

Out of scope: IVF/PQ index tuning (P+), image vectors (P6), kb-app
embed_index orchestration (P3-4 facade).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 10:01:31 +00:00
parent 9beef930b4
commit 3cd5117a7e
16 changed files with 6399 additions and 70 deletions
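
The score shift described in the commit message can be sketched as below. This is a minimal illustration and not the code in kb-store-vector: the function names `cosine_distance_to_score` and `clamped_score` are hypothetical, and the tracing::warn emitted on NaN is left out so the sketch needs only the standard library.

```rust
/// Map a cosine distance (expected in [0, 2]) to a score in [0, 1].
/// Hypothetical helper name; the real function is not shown in this diff.
pub fn cosine_distance_to_score(distance: f32) -> f32 {
    if distance.is_nan() {
        // The commit message says NaN is coerced to 0 (plus a warning log,
        // omitted in this sketch).
        return 0.0;
    }
    // distance in [0, 2]  ->  similarity in [-1, 1]
    let similarity = 1.0 - distance;
    // similarity in [-1, 1]  ->  score in [0, 1]
    (similarity + 1.0) / 2.0
}

/// The alternative the commit message argues against: clamp negative
/// similarity to zero. Shown only for contrast.
pub fn clamped_score(distance: f32) -> f32 {
    (1.0 - distance).clamp(0.0, 1.0)
}
```

With clamping, distances 1.5 and 2.0 both collapse to a score of 0, so their relative order is lost. The shifted score keeps them apart (0.25 versus 0.0), which is the ranking signal the message refers to.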
--- a/Cargo.lock
+++ b/Cargo.lock
--- a/Cargo.toml
+++ b/Cargo.toml
@@ -9,6 +9,7 @@ members = [
    "crates/kb-normalize",
    "crates/kb-chunk",
    "crates/kb-store-sqlite",
+    "crates/kb-store-vector",
    "crates/kb-search",
    "crates/kb-embed",
    "crates/kb-embed-local",
@@ -43,3 +44,13 @@ proptest     = "1"
 # downloads). Pinned to the 4.x line per task p3-2 (current 5.x release
 # remains untested for this workspace).
 fastembed    = "4.9"
+# LanceDB embedded vector store (P3-3). 0.23.x pulls arrow / arrow-array /
+# arrow-schema 56.x transitively (via lance 1.0); the kb-store-vector
+# crate matches that major to share the same Arrow types without a
+# re-export adapter.
+lancedb      = { version = "0.23", default-features = false }
+arrow        = "56"
+arrow-array  = "56"
+arrow-schema = "56"
+tokio        = { version = "1", features = ["rt", "macros"] }
+futures      = "0.3"
--- a/crates/kb-store-sqlite/Cargo.toml
+++ b/crates/kb-store-sqlite/Cargo.toml
@@ -14,6 +14,11 @@ kb-config    = { path = "../kb-config" }
 # Explicitly NOT `bundled-sqlcipher` per task allowed-deps list.
 rusqlite     = { version = "0.32", features = ["bundled"] }
 refinery     = { version = "0.8", features = ["rusqlite"] }
+# Used by `filter_chunks` for the optional `path_glob` post-filter.
+# The SQL prefilter handles tags / lang / trust / committed-status; the
+# Rust-side glob keeps the SQL surface small (no LIKE-vs-glob impedance
+# mismatch) and matches the pattern kb-search/src/lexical.rs uses.
+globset      = { workspace = true }
 serde_json   = { workspace = true }
 time         = { workspace = true }
 blake3       = { workspace = true }
--- a/crates/kb-store-sqlite/src/embeddings.rs
+++ b/crates/kb-store-sqlite/src/embeddings.rs
@@ -0,0 +1,317 @@
+//! Embedding-records writers used by `kb-store-vector` (P3-3).
+//!
+//! The `VectorStore` impl in `kb-store-vector` performs a two-phase write:
+//! phase 1 stages an `embedding_records` row at `status='pending'` before
+//! issuing the Lance write, and phase 3 promotes those same rows to
+//! `status='committed'` after the Lance commit lands. We surface those
+//! two SQL statements here (rather than expose a generic write
+//! connection) so the SQL stays inside the crate that owns the schema —
+//! kb-store-vector consumes a typed, narrowly-scoped API and never
+//! touches the connection mutex itself.
+//!
+//! Phase 1 runs one `INSERT OR REPLACE` per row and phase 3 runs one
+//! `UPDATE … WHERE embedding_id IN (…)`; each helper does its work
+//! inside a single SQLite transaction, so a failure rolls the whole
+//! batch back and never leaves a mixed batch.
+
+use anyhow::{Context, Result};
+use rusqlite::{params, params_from_iter};
+use time::OffsetDateTime;
+use time::format_description::well_known::Rfc3339;
+
+use crate::error::StoreError;
+use crate::store::SqliteStore;
+
+/// Row payload for [`SqliteStore::put_embedding_records_pending`].
+///
+/// Mirrors the columns of `embedding_records` minus the lifecycle markers
+/// (`status` and `vector_committed`) — those are forced to `'pending'`
+/// and `0` by phase 1.
+///
+/// `created_at` is `OffsetDateTime` rather than a pre-formatted string so
+/// the helper owns the RFC3339 formatting (the same formatting choice
+/// the asset / document / job writers make).
+#[derive(Clone, Debug)]
+pub struct EmbeddingRecordRow {
+    pub embedding_id: String,
+    pub chunk_id: String,
+    pub model_id: String,
+    pub model_version: String,
+    pub dimensions: usize,
+    pub lance_table: String,
+    pub created_at: OffsetDateTime,
+}
+
+impl SqliteStore {
+    /// Phase 1 of the kb-store-vector two-phase write: stage every
+    /// `embedding_records` row with `status='pending'`,
+    /// `vector_committed=0`. `INSERT OR REPLACE` (rather than UPSERT) is
+    /// the right shape here because re-running phase 1 for an
+    /// already-pending row resets `vector_committed` to 0 and the
+    /// `created_at` to the new attempt's timestamp — both desired,
+    /// because a retry should look like a fresh attempt to the GC pass.
+    ///
+    /// All rows are written in a single transaction; if any row fails
+    /// the entire batch is rolled back and the caller can retry without
+    /// worrying about partial pending state.
+    pub fn put_embedding_records_pending(
+        &self,
+        rows: &[EmbeddingRecordRow],
+    ) -> Result<()> {
+        if rows.is_empty() {
+            return Ok(());
+        }
+        let mut conn = self.lock_conn();
+        let tx = conn.transaction().map_err(StoreError::from)?;
+        {
+            let mut stmt = tx
+                .prepare(
+                    "INSERT OR REPLACE INTO embedding_records (
+                        embedding_id, chunk_id, model_id, model_version,
+                        dimensions, lance_table, created_at,
+                        status, vector_committed
+                    ) VALUES (?, ?, ?, ?, ?, ?, ?, 'pending', 0)",
+                )
+                .map_err(StoreError::from)?;
+            for row in rows {
+                let created_at = row
+                    .created_at
+                    .format(&Rfc3339)
+                    .context("format embedding_records.created_at")?;
+                stmt.execute(params![
+                    row.embedding_id,
+                    row.chunk_id,
+                    row.model_id,
+                    row.model_version,
+                    row.dimensions as i64,
+                    row.lance_table,
+                    created_at,
+                ])
+                .map_err(StoreError::from)?;
+            }
+        }
+        tx.commit().map_err(StoreError::from)?;
+        Ok(())
+    }
+
+    /// Phase 3 of the kb-store-vector two-phase write: after the Lance
+    /// MergeInsert commits, flip the listed embedding rows to
+    /// `status='committed'`, `vector_committed=1`. Rows that aren't
+    /// currently `pending` (e.g. already committed by a duplicate batch,
+    /// or tombstoned by a chunks DELETE between phase 1 and phase 3)
+    /// are deliberately left alone via `WHERE status='pending'` — we
+    /// never resurrect a tombstone, and we never blindly re-mark a
+    /// committed row.
+    ///
+    /// All updates run in one SQL statement (`UPDATE … WHERE
+    /// embedding_id IN (?, ?, …)`) inside one transaction, which
+    /// avoids the per-row `execute()` round-trip that a row-by-row
+    /// loop would pay.
+    pub fn mark_embedding_records_committed(
+        &self,
+        embedding_ids: &[String],
+    ) -> Result<()> {
+        if embedding_ids.is_empty() {
+            return Ok(());
+        }
+        let mut conn = self.lock_conn();
+        let tx = conn.transaction().map_err(StoreError::from)?;
+        {
+            let placeholders = std::iter::repeat_n("?", embedding_ids.len())
+                .collect::<Vec<_>>()
+                .join(",");
+            let sql = format!(
+                "UPDATE embedding_records
+                    SET status='committed', vector_committed=1
+                  WHERE status='pending'
+                    AND embedding_id IN ({placeholders})"
+            );
+            tx.execute(&sql, params_from_iter(embedding_ids.iter()))
+                .map_err(StoreError::from)?;
+        }
+        tx.commit().map_err(StoreError::from)?;
+        Ok(())
+    }
+}
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+    use kb_config::Config;
+    use tempfile::TempDir;
+    use time::OffsetDateTime;
+
+    /// Minimal config pointing at a tempdir for the SQLite file.
+    fn config_for(tmp: &TempDir) -> Config {
+        let mut c = Config::defaults();
+        c.storage.data_dir = tmp.path().to_string_lossy().into_owned();
+        c
+    }
+
+    /// Seed a chunks row + the doc / asset rows it FKs to. The minimum
+    /// needed for embedding_records inserts not to fail the FK to
+    /// chunks.
+    fn seed_chunk(store: &SqliteStore, chunk_id: &str) {
+        let conn = store.lock_conn();
+        // Asset, document, chunk — all hand-rolled at the SQL layer to
+        // keep the test self-contained (no kb-parse/kb-chunk dep).
+        conn.execute(
+            "INSERT INTO assets (
+                asset_id, source_uri, workspace_path, media_type, byte_len,
+                checksum, storage_kind, storage_path, discovered_at
+             ) VALUES (?, ?, ?, ?, ?, ?, 'reference', '/tmp/x', ?)",
+            params![
+                "0123456789abcdef0123456789abcdef",
+                "file:///tmp/x",
+                "x.md",
+                "{}",
+                0_i64,
+                "deadbeef",
+                "1970-01-01T00:00:00Z",
+            ],
+        )
+        .unwrap();
+        conn.execute(
+            "INSERT INTO documents (
+                doc_id, asset_id, workspace_path, title, lang, source_type,
+                trust_level, parser_version, doc_version, schema_version,
+                metadata_json, provenance_json, created_at, updated_at
+             ) VALUES (?, ?, ?, NULL, NULL, 'fs', 'unverified', 'v1', 1, 1, '{}', '{}', ?, ?)",
+            params![
+                "fedcba9876543210fedcba9876543210",
+                "0123456789abcdef0123456789abcdef",
+                "x.md",
+                "1970-01-01T00:00:00Z",
+                "1970-01-01T00:00:00Z",
+            ],
+        )
+        .unwrap();
+        conn.execute(
+            "INSERT INTO chunks (
+                chunk_id, doc_id, text, heading_path_json, section_label,
+                source_spans_json, token_estimate, chunker_version,
+                policy_hash, block_ids_json, created_at
+             ) VALUES (?, ?, 'hi', '[]', NULL, '[]', 1, 'v1', 'hash', '[]', ?)",
+            params![chunk_id, "fedcba9876543210fedcba9876543210", "1970-01-01T00:00:00Z"],
+        )
+        .unwrap();
+    }
+
+    fn open_store(tmp: &TempDir) -> SqliteStore {
+        let cfg = config_for(tmp);
+        let store = SqliteStore::open(&cfg).unwrap();
+        store.run_migrations().unwrap();
+        store
+    }
+
+    #[test]
+    fn pending_then_committed_round_trip() {
+        let tmp = TempDir::new().unwrap();
+        let store = open_store(&tmp);
+        let chunk = "11112222333344445555666677778888";
+        seed_chunk(&store, chunk);
+
+        let row = EmbeddingRecordRow {
+            embedding_id: "aaaa1111bbbb2222cccc3333dddd4444".to_string(),
+            chunk_id: chunk.to_string(),
+            model_id: "test-model".to_string(),
+            model_version: "v1".to_string(),
+            dimensions: 4,
+            lance_table: "chunk_embeddings_test_model_4".to_string(),
+            created_at: OffsetDateTime::now_utc(),
+        };
+        store
+            .put_embedding_records_pending(std::slice::from_ref(&row))
+            .unwrap();
+
+        // Inspect: the row exists at status='pending'.
+        {
+            let conn = store.read_conn();
+            let (status, committed): (String, i64) = conn
+                .query_row(
+                    "SELECT status, vector_committed FROM embedding_records WHERE embedding_id = ?",
+                    params![row.embedding_id],
+                    |r| Ok((r.get(0)?, r.get(1)?)),
+                )
+                .unwrap();
+            assert_eq!(status, "pending");
+            assert_eq!(committed, 0);
+        }
+
+        store
+            .mark_embedding_records_committed(std::slice::from_ref(&row.embedding_id))
+            .unwrap();
+        {
+            let conn = store.read_conn();
+            let (status, committed): (String, i64) = conn
+                .query_row(
+                    "SELECT status, vector_committed FROM embedding_records WHERE embedding_id = ?",
+                    params![row.embedding_id],
+                    |r| Ok((r.get(0)?, r.get(1)?)),
+                )
+                .unwrap();
+            assert_eq!(status, "committed");
+            assert_eq!(committed, 1);
+        }
+    }
+
+    #[test]
+    fn empty_batches_are_noops() {
+        let tmp = TempDir::new().unwrap();
+        let store = open_store(&tmp);
+        store.put_embedding_records_pending(&[]).unwrap();
+        store.mark_embedding_records_committed(&[]).unwrap();
+    }
+
+    #[test]
+    fn replay_phase_one_resets_vector_committed() {
+        // INSERT OR REPLACE: a phase-1 retry on a row that briefly
+        // reached `committed` (in some adversarial out-of-order replay)
+        // resets it to `pending`. Confirms the documented semantics.
+        let tmp = TempDir::new().unwrap();
+        let store = open_store(&tmp);
+        let chunk = "11112222333344445555666677778888";
+        seed_chunk(&store, chunk);
+
+        let row = EmbeddingRecordRow {
+            embedding_id: "aaaa1111bbbb2222cccc3333dddd4444".to_string(),
+            chunk_id: chunk.to_string(),
+            model_id: "test-model".to_string(),
+            model_version: "v1".to_string(),
+            dimensions: 4,
+            lance_table: "chunk_embeddings_test_model_4".to_string(),
+            created_at: OffsetDateTime::now_utc(),
+        };
+        store
+            .put_embedding_records_pending(std::slice::from_ref(&row))
+            .unwrap();
+        store
+            .mark_embedding_records_committed(std::slice::from_ref(&row.embedding_id))
+            .unwrap();
+        store
+            .put_embedding_records_pending(std::slice::from_ref(&row))
+            .unwrap();
+
+        let conn = store.read_conn();
+        let (status, committed): (String, i64) = conn
+            .query_row(
+                "SELECT status, vector_committed FROM embedding_records WHERE embedding_id = ?",
+                params![row.embedding_id],
+                |r| Ok((r.get(0)?, r.get(1)?)),
+            )
+            .unwrap();
+        assert_eq!((status.as_str(), committed), ("pending", 0));
+    }
+
+    #[test]
+    fn mark_committed_skips_non_pending() {
+        // The phase-3 UPDATE explicitly filters `status='pending'`, so
+        // calling it on an embedding_id that was never staged (or that
+        // already became a tombstone) is a no-op rather than an error.
+        let tmp = TempDir::new().unwrap();
+        let store = open_store(&tmp);
+        store
+            .mark_embedding_records_committed(&["does-not-exist".to_string()])
+            .unwrap();
+    }
+}
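
The status lifecycle that the two helpers above enforce can be modeled without SQLite. The sketch below is an in-memory stand-in under stated assumptions: `Records`, `Status`, and the method names are hypothetical, `embedding_id` is assumed to be the key that `INSERT OR REPLACE` conflicts on, and `tombstone` stands in for the V003 trigger. It is meant only to make the `WHERE status='pending'` guard easy to reason about.

```rust
use std::collections::HashMap;

/// Mirrors the three values allowed by the V003 CHECK constraint.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Status {
    Pending,
    Committed,
    Tombstone,
}

/// Hypothetical in-memory stand-in for the embedding_records table,
/// keyed by embedding_id.
#[derive(Default)]
pub struct Records {
    rows: HashMap<String, Status>,
}

impl Records {
    /// Phase 1: INSERT OR REPLACE semantics. Whatever was stored under
    /// the id is overwritten and the row lands at Pending.
    pub fn put_pending(&mut self, ids: &[&str]) {
        for id in ids {
            self.rows.insert(id.to_string(), Status::Pending);
        }
    }

    /// Phase 3: promote only rows that are currently Pending. Unknown,
    /// committed, and tombstoned ids are skipped. Returns the number of
    /// rows promoted.
    pub fn mark_committed(&mut self, ids: &[&str]) -> usize {
        let mut promoted = 0;
        for id in ids {
            if let Some(status) = self.rows.get_mut(*id) {
                if *status == Status::Pending {
                    *status = Status::Committed;
                    promoted += 1;
                }
            }
        }
        promoted
    }

    /// Stand-in for the BEFORE DELETE trigger on chunks.
    pub fn tombstone(&mut self, id: &str) {
        if let Some(status) = self.rows.get_mut(id) {
            *status = Status::Tombstone;
        }
    }

    pub fn status(&self, id: &str) -> Option<Status> {
        self.rows.get(id).copied()
    }
}
```

In this model, staging two ids, tombstoning one, and then calling `mark_committed` on both plus an unknown id promotes exactly one row, which matches what `mark_committed_skips_non_pending` and the doc comment on phase 3 describe.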
--- a/crates/kb-store-sqlite/src/filters.rs
+++ b/crates/kb-store-sqlite/src/filters.rs
@@ -0,0 +1,452 @@
+//! Chunk-level filter helpers shared between retrievers.
+//!
+//! `kb-store-vector::search` post-filters its Lance candidate set
+//! against the SQLite-side metadata (committed-status / lang / tags /
+//! trust / path_glob). Rather than open a private SQL surface in
+//! `kb-store-vector`, the JOIN logic lives here so:
+//!
+//! - The schema (and CHECK / FK invariants) stays owned by the crate
+//!   that ships the migrations.
+//! - `kb-store-vector` doesn't need its own `rusqlite` / `globset`
+//!   direct deps — both are forbidden by the P3-3 spec's allowed-dep
+//!   list.
+//! - Future retrievers (e.g. a hybrid blender) can reuse the same
+//!   helper without re-deriving the SQL.
+//!
+//! `kb-search::lexical` already has a similar `tags / lang / trust /
+//! path_glob` filter pass for FTS5 results; we deliberately do *not*
+//! refactor that one in this PR — its SQL is interleaved with the
+//! `bm25 + snippet()` SELECT, so sharing would force an awkward
+//! trait split. P3-3 spec line 27 only mandates the move for
+//! `kb-store-vector`'s usage.
+
+use std::collections::{HashMap, HashSet};
+
+use anyhow::{Context, Result};
+use rusqlite::{params_from_iter, ToSql};
+
+use crate::store::SqliteStore;
+
+impl SqliteStore {
+    /// Filter `chunk_ids` down to those whose owning document passes
+    /// `filters` AND whose embedding row is at `status='committed'`.
+    ///
+    /// The result preserves the input order so the caller can feed it
+    /// back to a Lance distance-asc result list and `take(k)` directly.
+    ///
+    /// `filters` semantics mirror `kb_core::SearchFilters`:
+    ///
+    /// - `tags_any`: doc must own at least one of the listed tags
+    ///   (empty vec ⇒ no tag constraint).
+    /// - `lang`: exact match against `documents.lang`.
+    /// - `trust_min`: doc trust ≥ the supplied level (Generated <
+    ///   Secondary < Primary, mirroring `list_documents` and
+    ///   `kb-search::lexical`).
+    /// - `path_glob`: shell-style glob (`*` does **not** cross `/`)
+    ///   against `documents.workspace_path`. Compiled in Rust via
+    ///   `globset` rather than translated to SQLite GLOB so the
+    ///   semantics match `kb-search::lexical` exactly.
+    ///
+    /// The `embedding_records.status='committed'` predicate is always
+    /// applied: tombstoned and pending rows must never surface to
+    /// search callers (spec §5.6).
+    pub fn filter_chunks(
+        &self,
+        chunk_ids: &[kb_core::ChunkId],
+        filters: &kb_core::SearchFilters,
+    ) -> Result<Vec<kb_core::ChunkId>> {
+        if chunk_ids.is_empty() {
+            return Ok(Vec::new());
+        }
+
+        // Deduplicate the IN-list so a pathological caller passing
+        // `[c1, c1, c1]` doesn't blow the SQL placeholder count.
+        let unique_ids: Vec<String> = {
+            let mut seen = HashSet::new();
+            chunk_ids
+                .iter()
+                .filter_map(|c| {
+                    if seen.insert(c.0.as_str()) {
+                        Some(c.0.clone())
+                    } else {
+                        None
+                    }
+                })
+                .collect()
+        };
+
+        let placeholders = std::iter::repeat_n("?", unique_ids.len())
+            .collect::<Vec<_>>()
+            .join(",");
+        let mut sql = format!(
+            "SELECT er.chunk_id, d.workspace_path
+               FROM embedding_records er
+               JOIN chunks c    ON c.chunk_id = er.chunk_id
+               JOIN documents d ON d.doc_id  = c.doc_id
+              WHERE er.status = 'committed'
+                AND er.chunk_id IN ({placeholders})"
+        );
+
+        let mut bind: Vec<Box<dyn ToSql>> = unique_ids
+            .iter()
+            .map(|s| {
+                let b: Box<dyn ToSql> = Box::new(s.clone());
+                b
+            })
+            .collect();
+
+        if let Some(lang) = &filters.lang {
+            sql.push_str(" AND d.lang = ?");
+            bind.push(Box::new(lang.0.clone()));
+        }
+        if let Some(min) = &filters.trust_min {
+            // Mirror `list_documents` / `kb-search::lexical`: rank
+            // Generated=1 < Secondary=2 < Primary=3.
+            sql.push_str(
+                " AND CASE d.trust_level
+                       WHEN 'primary'   THEN 3
+                       WHEN 'secondary' THEN 2
+                       WHEN 'generated' THEN 1
+                       ELSE 0 END >= ?",
+            );
+            let rank: i64 = match min {
+                kb_core::TrustLevel::Primary => 3,
+                kb_core::TrustLevel::Secondary => 2,
+                kb_core::TrustLevel::Generated => 1,
+            };
+            bind.push(Box::new(rank));
+        }
+        if !filters.tags_any.is_empty() {
+            let tag_ph = std::iter::repeat_n("?", filters.tags_any.len())
+                .collect::<Vec<_>>()
+                .join(",");
+            sql.push_str(&format!(
+                " AND EXISTS (SELECT 1 FROM document_tags t \
+                   WHERE t.doc_id = d.doc_id AND t.tag IN ({tag_ph}))"
+            ));
+            for tag in &filters.tags_any {
+                bind.push(Box::new(tag.clone()));
+            }
+        }
+
+        // Optional path_glob: applied in Rust on the rows we get back,
+        // not in SQL — matching `kb-search::lexical`'s post-filter so
+        // the glob semantics are byte-identical between retrievers.
+        let path_matcher = match filters.path_glob.as_deref() {
+            Some(pat) => Some(
+                globset::GlobBuilder::new(pat)
+                    .literal_separator(true)
+                    .build()
+                    .with_context(|| {
+                        format!("kb-store-sqlite::filter_chunks: invalid path_glob {pat:?}")
+                    })?
+                    .compile_matcher(),
+            ),
+            None => None,
+        };
+
+        let conn = self.read_conn();
+        let mut stmt = conn
+            .prepare(&sql)
+            .context("kb-store-sqlite::filter_chunks: prepare SQL")?;
+        let rows = stmt
+            .query_map(
+                params_from_iter(bind.iter().map(|b| b.as_ref())),
+                |row| {
+                    let chunk_id: String = row.get(0)?;
+                    let workspace_path: String = row.get(1)?;
+                    Ok((chunk_id, workspace_path))
+                },
+            )
+            .context("kb-store-sqlite::filter_chunks: execute SQL")?;
+
+        let mut allowed: HashMap<String, String> = HashMap::new();
+        for r in rows {
+            let (chunk_id, workspace_path) =
+                r.context("kb-store-sqlite::filter_chunks: read row")?;
+            allowed.insert(chunk_id, workspace_path);
+        }
+
+        let mut out = Vec::with_capacity(chunk_ids.len());
+        for cand in chunk_ids {
+            let workspace_path = match allowed.get(&cand.0) {
+                Some(p) => p,
+                None => continue,
+            };
+            if let Some(m) = &path_matcher {
+                if !m.is_match(workspace_path) {
+                    continue;
+                }
+            }
+            out.push(cand.clone());
+        }
+        Ok(out)
+    }
+}
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+    use kb_config::Config;
+    use kb_core::{ChunkId, Lang, SearchFilters, TrustLevel};
+    use rusqlite::params;
+    use tempfile::TempDir;
+    use time::OffsetDateTime;
+
+    use crate::EmbeddingRecordRow;
+
+    fn open_store(tmp: &TempDir) -> SqliteStore {
+        let mut c = Config::defaults();
+        c.storage.data_dir = tmp.path().to_string_lossy().into_owned();
+        let store = SqliteStore::open(&c).unwrap();
+        store.run_migrations().unwrap();
+        store
+    }
+
+    /// Seed (asset, document, document_tags, chunk) rows + a
+    /// committed embedding_records row for a single chunk_id. Mirrors
+    /// the shape `kb-store-vector` builds in production.
+    fn seed_committed(
+        store: &SqliteStore,
+        chunk_id: &str,
+        doc_id: &str,
+        workspace_path: &str,
+        lang: &str,
+        tags: &[&str],
+        trust: &str,
+    ) {
+        let asset_id = format!("a{}", &doc_id[..31]);
+        {
+            let conn = store.lock_conn();
+            conn.execute(
+                "INSERT INTO assets (
+                    asset_id, source_uri, workspace_path, media_type, byte_len,
+                    checksum, storage_kind, storage_path, discovered_at
+                 ) VALUES (?, ?, ?, '{}', 0, 'deadbeefdeadbeefdeadbeefdeadbeef',
+                           'reference', ?, '1970-01-01T00:00:00Z')",
+                params![
+                    asset_id,
+                    format!("file://{workspace_path}"),
+                    workspace_path,
+                    workspace_path,
+                ],
+            )
+            .unwrap();
+            conn.execute(
+                "INSERT INTO documents (
+                    doc_id, asset_id, workspace_path, title, lang, source_type,
+                    trust_level, parser_version, doc_version, schema_version,
+                    metadata_json, provenance_json, created_at, updated_at
+                 ) VALUES (?, ?, ?, NULL, ?, 'markdown', ?, 'v1', 1, 1,
+                           '{}', '{}', '1970-01-01T00:00:00Z', '1970-01-01T00:00:00Z')",
+                params![doc_id, asset_id, workspace_path, lang, trust],
+            )
+            .unwrap();
+            for t in tags {
+                conn.execute(
+                    "INSERT INTO document_tags (doc_id, tag) VALUES (?, ?)",
+                    params![doc_id, t],
+                )
+                .unwrap();
+            }
+            conn.execute(
+                "INSERT INTO chunks (
+                    chunk_id, doc_id, text, heading_path_json, section_label,
+                    source_spans_json, token_estimate, chunker_version,
+                    policy_hash, block_ids_json, created_at
+                 ) VALUES (?, ?, 'hi', '[]', NULL, '[]', 1, 'v1', 'h', '[]',
+                           '1970-01-01T00:00:00Z')",
+                params![chunk_id, doc_id],
+            )
+            .unwrap();
+        }
+
+        let embed_row = EmbeddingRecordRow {
+            embedding_id: format!("e{}", &chunk_id[..31]),
+            chunk_id: chunk_id.to_string(),
+            model_id: "m".to_string(),
+            model_version: "v1".to_string(),
+            dimensions: 4,
+            lance_table: "t".to_string(),
+            created_at: OffsetDateTime::UNIX_EPOCH,
+        };
+        store
+            .put_embedding_records_pending(std::slice::from_ref(&embed_row))
+            .unwrap();
+        store
+            .mark_embedding_records_committed(std::slice::from_ref(
+                &embed_row.embedding_id,
+            ))
+            .unwrap();
+    }
+
+    fn cid(s: &str) -> ChunkId {
+        ChunkId(s.to_string())
+    }
+
+    #[test]
+    fn filter_chunks_drops_uncommitted_rows() {
+        let tmp = TempDir::new().unwrap();
+        let store = open_store(&tmp);
+        let c1 = "11111111111111111111111111111111";
+        let c2 = "22222222222222222222222222222222";
+        let d1 = "d1d1d1d1d1d1d1d1d1d1d1d1d1d1d1d1";
+        let d2 = "d2d2d2d2d2d2d2d2d2d2d2d2d2d2d2d2";
+        seed_committed(&store, c1, d1, "a.md", "en", &[], "primary");
+
+        // c2: chunk + doc but no committed embedding row.
+        let asset_id = format!("a{}", &d2[..31]);
+        let conn = store.lock_conn();
+        conn.execute(
+            "INSERT INTO assets (
+                asset_id, source_uri, workspace_path, media_type, byte_len,
+                checksum, storage_kind, storage_path, discovered_at
+             ) VALUES (?, 'file://b.md', 'b.md', '{}', 0,
+                       'deadbeefdeadbeefdeadbeefdeadbeef',
+                       'reference', 'b.md', '1970-01-01T00:00:00Z')",
+            params![asset_id],
+        )
+        .unwrap();
+        conn.execute(
+            "INSERT INTO documents (
+                doc_id, asset_id, workspace_path, title, lang, source_type,
+                trust_level, parser_version, doc_version, schema_version,
+                metadata_json, provenance_json, created_at, updated_at
+             ) VALUES (?, ?, 'b.md', NULL, 'en', 'markdown', 'primary', 'v1',
+                       1, 1, '{}', '{}',
+                       '1970-01-01T00:00:00Z', '1970-01-01T00:00:00Z')",
+            params![d2, asset_id],
+        )
+        .unwrap();
+        conn.execute(
+            "INSERT INTO chunks (
+                chunk_id, doc_id, text, heading_path_json, section_label,
+                source_spans_json, token_estimate, chunker_version,
+                policy_hash, block_ids_json, created_at
+             ) VALUES (?, ?, 'hi', '[]', NULL, '[]', 1, 'v1', 'h', '[]',
+                       '1970-01-01T00:00:00Z')",
+            params![c2, d2],
+        )
+        .unwrap();
+        drop(conn);
+
+        let out = store
+            .filter_chunks(&[cid(c1), cid(c2)], &SearchFilters::default())
+            .unwrap();
+        assert_eq!(out, vec![cid(c1)]);
+    }
+
+    #[test]
+    fn filter_chunks_tags_any_lang_trust_path_glob() {
+        let tmp = TempDir::new().unwrap();
+        let store = open_store(&tmp);
+        // c1: tags=[ko-style], lang=en, primary, notes/a.md
+        // c2: tags=[other],    lang=en, primary, notes/b.md
+        // c3: tags=[ko-style], lang=ko, secondary, notes/c.md
+        // c4: tags=[ko-style], lang=en, generated, src/d.md
+        let chunks = [
+            ("11111111111111111111111111111111", "d1d1d1d1d1d1d1d1d1d1d1d1d1d1d1d1", "notes/a.md", "en", "primary",   &["ko-style"][..]),
+            ("22222222222222222222222222222222", "d2d2d2d2d2d2d2d2d2d2d2d2d2d2d2d2", "notes/b.md", "en", "primary",   &["other"][..]),
+            ("33333333333333333333333333333333", "d3d3d3d3d3d3d3d3d3d3d3d3d3d3d3d3", "notes/c.md", "ko", "secondary", &["ko-style"][..]),
+            ("44444444444444444444444444444444", "d4d4d4d4d4d4d4d4d4d4d4d4d4d4d4d4", "src/d.md",   "en", "generated", &["ko-style"][..]),
+        ];
+        for (c, d, p, l, t, tags) in &chunks {
+            seed_committed(&store, c, d, p, l, tags, t);
+        }
+
+        // tags_any=[ko-style] → c1, c3, c4 (drop c2).
+        let f = SearchFilters {
+            tags_any: vec!["ko-style".to_string()],
+            ..Default::default()
+        };
+        let out = store
+            .filter_chunks(
+                &chunks.iter().map(|c| cid(c.0)).collect::<Vec<_>>(),
+                &f,
+            )
+            .unwrap();
+        let mut got: Vec<&str> = out.iter().map(|c| c.0.as_str()).collect();
+        got.sort();
+        assert_eq!(got, vec![chunks[0].0, chunks[2].0, chunks[3].0]);
+
+        // + lang=en  → drops c3.
+        let f = SearchFilters {
+            tags_any: vec!["ko-style".to_string()],
+            lang: Some(Lang("en".to_string())),
+            ..Default::default()
+        };
+        let out = store
+            .filter_chunks(
+                &chunks.iter().map(|c| cid(c.0)).collect::<Vec<_>>(),
+                &f,
+            )
+            .unwrap();
+        let mut got: Vec<&str> = out.iter().map(|c| c.0.as_str()).collect();
+        got.sort();
+        assert_eq!(got, vec![chunks[0].0, chunks[3].0]);
+
+        // + trust_min=Secondary → drops c4 (generated < secondary).
+        let f = SearchFilters {
+            tags_any: vec!["ko-style".to_string()],
+            lang: Some(Lang("en".to_string())),
+            trust_min: Some(TrustLevel::Secondary),
+            ..Default::default()
+        };
+        let out = store
+            .filter_chunks(
+                &chunks.iter().map(|c| cid(c.0)).collect::<Vec<_>>(),
+                &f,
+            )
+            .unwrap();
+        let got: Vec<&str> = out.iter().map(|c| c.0.as_str()).collect();
+        assert_eq!(got, vec![chunks[0].0]);
+
+        // path_glob = "notes/*.md" with no other constraint → c1, c2, c3.
+        let f = SearchFilters {
+            path_glob: Some("notes/*.md".to_string()),
+            ..Default::default()
+        };
+        let out = store
+            .filter_chunks(
+                &chunks.iter().map(|c| cid(c.0)).collect::<Vec<_>>(),
+                &f,
+            )
+            .unwrap();
+        let mut got: Vec<&str> = out.iter().map(|c| c.0.as_str()).collect();
+        got.sort();
+        assert_eq!(got, vec![chunks[0].0, chunks[1].0, chunks[2].0]);
+    }
+
+    #[test]
+    fn filter_chunks_preserves_input_order_and_dedupes() {
+        let tmp = TempDir::new().unwrap();
+        let store = open_store(&tmp);
+        let c1 = "11111111111111111111111111111111";
+        let c2 = "22222222222222222222222222222222";
+        let c3 = "33333333333333333333333333333333";
+        seed_committed(&store, c1, "d1d1d1d1d1d1d1d1d1d1d1d1d1d1d1d1", "a.md", "en", &[], "primary");
+        seed_committed(&store, c2, "d2d2d2d2d2d2d2d2d2d2d2d2d2d2d2d2", "b.md", "en", &[], "primary");
+        seed_committed(&store, c3, "d3d3d3d3d3d3d3d3d3d3d3d3d3d3d3d3", "c.md", "en", &[], "primary");
+
+        // Ask in the order c3, c1, c2; result must preserve that order.
+        let out = store
+            .filter_chunks(&[cid(c3), cid(c1), cid(c2)], &SearchFilters::default())
+            .unwrap();
+        assert_eq!(out, vec![cid(c3), cid(c1), cid(c2)]);
+
+        // Duplicates in the input survive in the output (dedup is for
+        // the SQL IN-list only — caller may want repeats for ranking).
+        let out = store
+            .filter_chunks(&[cid(c1), cid(c1), cid(c2)], &SearchFilters::default())
+            .unwrap();
+        assert_eq!(out, vec![cid(c1), cid(c1), cid(c2)]);
+    }
+
+    #[test]
+    fn filter_chunks_empty_input_short_circuits() {
+        let tmp = TempDir::new().unwrap();
+        let store = open_store(&tmp);
+        let out = store.filter_chunks(&[], &SearchFilters::default()).unwrap();
+        assert!(out.is_empty());
+    }
+}
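
Note on the ordering contract exercised by `filter_chunks_preserves_input_order_and_dedupes`: the sketch below is a minimal, std-only illustration of one way to get "dedupe for the SQL IN-list, but keep caller order and repeats in the output". The names `in_list` and `keep_in_input_order` are hypothetical and are not part of `kb-store-sqlite`; `surviving` stands in for whatever ids the SQL query returns.

```rust
use std::collections::HashSet;

/// Hypothetical helper: deduplicate ids for the SQL IN-list while keeping
/// first-seen order, so the statement never binds the same id twice.
fn in_list(input: &[&str]) -> Vec<String> {
    let mut seen = HashSet::new();
    input
        .iter()
        .filter(|id| seen.insert(**id)) // `insert` returns false for repeats
        .map(|id| id.to_string())
        .collect()
}

/// Hypothetical helper: walk the caller's input in its original order and
/// keep every id (repeats included) that the query reported as surviving.
fn keep_in_input_order(input: &[&str], surviving: &[&str]) -> Vec<String> {
    let allowed: HashSet<&str> = surviving.iter().copied().collect();
    input
        .iter()
        .filter(|id| allowed.contains(**id))
        .map(|id| id.to_string())
        .collect()
}
```

With input `["c3", "c1", "c1", "c2"]` and surviving ids `["c1", "c3"]`, the IN-list is `c3, c1, c2` while the returned list keeps `c3, c1, c1`, which matches the behavior the test asserts for repeated ids.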
--- a/crates/kb-store-sqlite/src/lib.rs
+++ b/crates/kb-store-sqlite/src/lib.rs
@@ -8,18 +8,25 @@
 //!
 //! Allowed deps per task spec: `kb-core`, `kb-config`, `rusqlite`,
 //! `refinery`, `serde_json`, `time`, `blake3`, `tracing`, `anyhow`,
-//! `thiserror`. NOT allowed: `kb-parse-*`, `kb-normalize`, `kb-chunk`,
-//! `kb-store-vector`, `kb-source-fs`, etc. (`kb-parse-md`, `kb-normalize`,
-//! `kb-chunk` may appear as **dev-deps** — see `Cargo.toml` — to drive
-//! the contract round-trip test off a real Markdown fixture.)
+//! `thiserror`. `globset` was added in P3-3 to back the
+//! `filter_chunks` helper (used by `kb-store-vector`'s post-filter
+//! pass — moving the SQL JOIN into this crate kept `kb-store-vector`
+//! from needing its own `rusqlite` / `globset` direct deps). NOT
+//! allowed: `kb-parse-*`, `kb-normalize`, `kb-chunk`, `kb-store-vector`,
+//! `kb-source-fs`, etc. (`kb-parse-md`, `kb-normalize`, `kb-chunk` may
+//! appear as **dev-deps** — see `Cargo.toml` — to drive the contract
+//! round-trip test off a real Markdown fixture.)

 mod documents;
+mod embeddings;
 mod error;
+mod filters;
 mod fts;
 mod jobs;
 mod schema;
 mod store;

+pub use embeddings::EmbeddingRecordRow;
 pub use error::StoreError;
 pub use fts::rebuild_chunks_fts;
 pub use store::SqliteStore;
--- a/crates/kb-store-vector/Cargo.toml
+++ b/crates/kb-store-vector/Cargo.toml
@@ -0,0 +1,55 @@
+[package]
+name          = "kb-store-vector"
+version       = { workspace = true }
+edition       = { workspace = true }
+rust-version  = { workspace = true }
+license       = { workspace = true }
+repository    = { workspace = true }
+description   = "LanceDB-backed VectorStore for kb (§5.6 embedding_records, §6.3 lancedb tables, §7.2 VectorStore)"
+
+[dependencies]
+kb-core         = { path = "../kb-core" }
+kb-config       = { path = "../kb-config" }
+# kb-store-sqlite is allowed for the embedding_records writers only
+# (P3-3 spec: "Allowed dep `kb-store-sqlite` for writing/reading rows in
+# embedding_records"). The two-phase upsert flow uses
+# `put_embedding_records_pending` + `mark_embedding_records_committed`.
+kb-store-sqlite = { path = "../kb-store-sqlite" }
+
+# LanceDB embedded vector store. `default-features=false` opts out of
+# the cloud object-store integrations (aws / gcs / azure / dynamodb /
+# oss); kb is always-local for v1, so dragging in those SDKs would just
+# inflate the build.
+lancedb         = { workspace = true }
+arrow           = { workspace = true }
+arrow-array     = { workspace = true }
+arrow-schema    = { workspace = true }
+# Embedded async runtime. The VectorStore trait is sync (§7.2) but
+# LanceDB's Rust API is async-only; we own a current-thread
+# tokio::Runtime and `block_on` per trait method. current-thread avoids
+# the worker-thread pool a multi-thread runtime would spawn (one thread
+# per CPU core by default). kb-app already serializes vector ops behind
+# its own job scheduler, so the extra parallelism wouldn't be exploited.
+tokio           = { workspace = true }
+# `try_collect` for streaming Lance query results into a Vec<RecordBatch>.
+futures         = { workspace = true }
+
+serde           = { workspace = true }
+serde_json      = { workspace = true }
+tracing         = { workspace = true }
+thiserror       = { workspace = true }
+anyhow          = { workspace = true }
+blake3          = { workspace = true }
+time            = { workspace = true }
+
+[dev-dependencies]
+tempfile        = { workspace = true }
+serde_json      = { workspace = true }
+# Integration tests seed `documents` / `chunks` fixtures by raw SQL
+# (no kb-parse-md / kb-normalize / kb-chunk dep) so they can construct
+# adversarial filter / dim-mismatch states. rusqlite appears under
+# `[dev-dependencies]` only; the runtime crate uses kb-store-sqlite's
+# typed surface (`filter_chunks`, `put_embedding_records_pending`, …)
+# and does not touch rusqlite directly (P3-3 spec: kb-store-vector
+# must not list rusqlite/globset as direct deps).
+rusqlite        = { workspace = true }
--- a/crates/kb-store-vector/src/arrow_batch.rs
+++ b/crates/kb-store-vector/src/arrow_batch.rs
@@ -0,0 +1,232 @@
+//! Arrow schema + RecordBatch builder for the per-model Lance table.
+//!
+//! Per design §6.3 the per-row layout is:
+//!
+//! ```text
+//! chunk_id          : Utf8 (primary)
+//! doc_id            : Utf8
+//! embedding         : FixedSizeList<Float32, dim>
+//! model_id          : Utf8
+//! embedding_version : Utf8
+//! text              : Utf8
+//! heading_path      : Utf8 (JSON-encoded Vec<String>)
+//! created_at        : Timestamp(Microsecond, UTC)
+//! ```
+//!
+//! `heading_path` is encoded as a JSON string rather than a Lance
+//! `List<Utf8>` to keep the `only_if` SQL filter surface clean — Lance
+//! exposes scalar columns to its query DSL trivially, but list columns
+//! need `array_contains`-style helpers that aren't required by the
+//! current `SearchFilters` shape.
+
+use std::sync::Arc;
+
+use anyhow::{Context, Result};
+use arrow_array::{
+    ArrayRef, FixedSizeListArray, Float32Array, RecordBatch, StringArray,
+    TimestampMicrosecondArray,
+};
+use arrow_schema::{DataType, Field, Schema, SchemaRef, TimeUnit};
+use kb_core::VectorRecord;
+use time::OffsetDateTime;
+
+/// Arrow schema for a Lance table whose vector column is FixedSizeList
+/// of `dim` Float32. All non-vector columns are non-nullable; the
+/// vector column itself is non-nullable but the inner Float32 slot is
+/// nullable per Arrow convention (Lance ignores the inner-nullable
+/// flag when the outer field is non-null).
+pub(crate) fn schema_for(dim: usize) -> SchemaRef {
+    Arc::new(Schema::new(vec![
+        Field::new("chunk_id", DataType::Utf8, false),
+        Field::new("doc_id", DataType::Utf8, false),
+        Field::new(
+            "embedding",
+            DataType::FixedSizeList(
+                Arc::new(Field::new("item", DataType::Float32, true)),
+                dim as i32,
+            ),
+            false,
+        ),
+        Field::new("model_id", DataType::Utf8, false),
+        Field::new("embedding_version", DataType::Utf8, false),
+        Field::new("text", DataType::Utf8, false),
+        Field::new("heading_path", DataType::Utf8, false),
+        Field::new(
+            "created_at",
+            DataType::Timestamp(TimeUnit::Microsecond, Some("UTC".into())),
+            false,
+        ),
+    ]))
+}
+
+/// Build a `RecordBatch` from `recs`. All records must share `dim`;
+/// callers are expected to pre-bucket per-table batches before reaching
+/// here. The batch carries `recs.len()` rows; `now` is folded into
+/// `created_at` for every row to match design §6.3.
+pub(crate) fn build_batch(
+    recs: &[VectorRecord],
+    dim: usize,
+    now: OffsetDateTime,
+) -> Result<RecordBatch> {
+    let schema = schema_for(dim);
+
+    let chunk_ids = StringArray::from(
+        recs.iter().map(|r| r.chunk_id.0.as_str()).collect::<Vec<_>>(),
+    );
+    let doc_ids = StringArray::from(
+        recs.iter().map(|r| r.doc_id.0.as_str()).collect::<Vec<_>>(),
+    );
+    let model_ids = StringArray::from(
+        recs.iter().map(|r| r.model_id.0.as_str()).collect::<Vec<_>>(),
+    );
+    let model_versions = StringArray::from(
+        recs.iter()
+            .map(|r| r.model_version.0.as_str())
+            .collect::<Vec<_>>(),
+    );
+    let texts =
+        StringArray::from(recs.iter().map(|r| r.text.as_str()).collect::<Vec<_>>());
+
+    // heading_path: `Vec<String>` serialized to a JSON array string.
+    let heading_paths: Vec<String> = recs
+        .iter()
+        .map(|r| serde_json::to_string(&r.heading_path))
+        .collect::<std::result::Result<_, _>>()
+        .context("serialize heading_path JSON")?;
+    let heading_path_arr = StringArray::from(
+        heading_paths.iter().map(String::as_str).collect::<Vec<_>>(),
+    );
+
+    // Embedding: FixedSizeList<Float32, dim>. Build from the flat
+    // contiguous f32 buffer.
+    let mut flat: Vec<f32> = Vec::with_capacity(recs.len() * dim);
+    for r in recs {
+        if r.vector.len() != dim {
+            anyhow::bail!(
+                "vector length {} does not match table dim {} for chunk {}",
+                r.vector.len(),
+                dim,
+                r.chunk_id.0
+            );
+        }
+        flat.extend_from_slice(&r.vector);
+    }
+    let values = Float32Array::from(flat);
+    let embedding_field =
+        Arc::new(Field::new("item", DataType::Float32, true));
+    let embedding = FixedSizeListArray::try_new(
+        embedding_field,
+        dim as i32,
+        Arc::new(values),
+        None,
+    )
+    .context("build FixedSizeList embedding column")?;
+
+    // created_at: microseconds since Unix epoch, UTC.
+    let micros: Vec<i64> = std::iter::repeat_n(
+        (now.unix_timestamp_nanos() / 1_000) as i64,
+        recs.len(),
+    )
+    .collect();
+    let created_at = TimestampMicrosecondArray::from(micros).with_timezone("UTC");
+
+    let arrays: Vec<ArrayRef> = vec![
+        Arc::new(chunk_ids) as ArrayRef,
+        Arc::new(doc_ids),
+        Arc::new(embedding),
+        Arc::new(model_ids),
+        Arc::new(model_versions),
+        Arc::new(texts),
+        Arc::new(heading_path_arr),
+        Arc::new(created_at),
+    ];
+
+    RecordBatch::try_new(schema, arrays).context("assemble RecordBatch")
+}
+
+/// blake3-hex of the canonical JSON of the schema. Used as
+/// `params_hash` for `id_for_index` so the `IndexId` stays stable
+/// across invocations with the same `dim`.
+pub(crate) fn schema_params_hash(dim: usize) -> String {
+    // Keep the hash input shape self-describing so a future schema
+    // tweak (extra column, type change, …) bumps the hash and produces
+    // a different `IndexId` automatically.
+    let descriptor = serde_json::json!({
+        "version": 1,
+        "dim": dim,
+        "columns": [
+            {"name": "chunk_id", "type": "Utf8"},
+            {"name": "doc_id", "type": "Utf8"},
+            {"name": "embedding", "type": "FixedSizeList<Float32>", "size": dim},
+            {"name": "model_id", "type": "Utf8"},
+            {"name": "embedding_version", "type": "Utf8"},
+            {"name": "text", "type": "Utf8"},
+            {"name": "heading_path", "type": "Utf8"},
+            {"name": "created_at", "type": "Timestamp<us, UTC>"},
+        ],
+    });
+    let bytes = descriptor_bytes(&descriptor);
+    blake3::hash(&bytes).to_hex().to_string()
+}
+
+/// Serialize the schema descriptor to bytes for hashing. Plain
+/// `serde_json::to_vec` rather than a canonical-JSON crate is fine
+/// here because the descriptor is built from a fixed `serde_json::json!`
+/// literal in `schema_params_hash`. `serde_json` emits object keys
+/// deterministically (sorted by default, since `Map` is a `BTreeMap`;
+/// insertion order only with `preserve_order`), so the bytes are stable
+/// across runs. The empty-vec fallback on the (unreachable, given our
+/// literal input) error path keeps the function infallible.
+fn descriptor_bytes(v: &serde_json::Value) -> Vec<u8> {
+    serde_json::to_vec(v).unwrap_or_default()
+}
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+    use kb_core::{ChunkId, DocumentId, EmbeddingId, EmbeddingModelId, EmbeddingVersion};
+    use time::OffsetDateTime;
+
+    fn make_rec(chunk_idx: u8, dim: usize) -> VectorRecord {
+        VectorRecord {
+            chunk_id: ChunkId(format!("{:032x}", chunk_idx)),
+            embedding_id: EmbeddingId(format!("{:032x}", 0xeeeeu16 + chunk_idx as u16)),
+            vector: vec![0.1_f32; dim],
+            doc_id: DocumentId("aaaa".repeat(8)),
+            text: format!("text-{chunk_idx}"),
+            heading_path: vec!["A".to_string(), "B".to_string()],
+            model_id: EmbeddingModelId("test".to_string()),
+            model_version: EmbeddingVersion("v1".to_string()),
+            dimensions: dim,
+        }
+    }
+
+    #[test]
+    fn build_batch_round_trip_basic() {
+        let recs = vec![make_rec(1, 4), make_rec(2, 4)];
+        let batch = build_batch(&recs, 4, OffsetDateTime::UNIX_EPOCH).unwrap();
+        assert_eq!(batch.num_rows(), 2);
+        assert_eq!(batch.num_columns(), 8);
+        let schema = batch.schema();
+        assert_eq!(schema.field(0).name(), "chunk_id");
+        assert_eq!(schema.field(2).name(), "embedding");
+    }
+
+    #[test]
+    fn build_batch_dim_mismatch_errors() {
+        let mut rec = make_rec(1, 4);
+        rec.vector = vec![0.0_f32; 3];
+        let err = build_batch(&[rec], 4, OffsetDateTime::UNIX_EPOCH).unwrap_err();
+        let msg = format!("{err}");
+        assert!(msg.contains("does not match table dim"), "msg={msg}");
+    }
+
+    #[test]
+    fn schema_params_hash_is_stable_for_dim() {
+        let h1 = schema_params_hash(384);
+        let h2 = schema_params_hash(384);
+        assert_eq!(h1, h2);
+        let h3 = schema_params_hash(512);
+        assert_ne!(h1, h3);
+    }
+}
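
Note on the `FixedSizeList<Float32, dim>` layout that `build_batch` relies on: the embedding column is one contiguous `f32` buffer, and row `i` is the slice `[i * dim, (i + 1) * dim)`. The std-only sketch below illustrates that indexing without Arrow; `flatten` and `row` are hypothetical names used only for this illustration, not functions in `arrow_batch.rs`.

```rust
/// Hypothetical helper: flatten equally sized vectors into one contiguous
/// buffer, rejecting any row whose length differs from `dim`.
fn flatten(rows: &[Vec<f32>], dim: usize) -> Result<Vec<f32>, String> {
    let mut flat = Vec::with_capacity(rows.len() * dim);
    for (i, r) in rows.iter().enumerate() {
        if r.len() != dim {
            return Err(format!("row {} has length {}, expected {}", i, r.len(), dim));
        }
        flat.extend_from_slice(r);
    }
    Ok(flat)
}

/// Hypothetical helper: row `i` of a fixed-size list of width `dim`
/// occupies `flat[i * dim..(i + 1) * dim]`.
fn row(flat: &[f32], dim: usize, i: usize) -> &[f32] {
    &flat[i * dim..(i + 1) * dim]
}
```

Because every row has the same width, no per-row offsets are needed, which is why a single length check per record is enough before the buffer is handed to the list array.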
--- a/crates/kb-store-vector/src/lib.rs
+++ b/crates/kb-store-vector/src/lib.rs
@@ -0,0 +1,31 @@
+//! `kb-store-vector` — LanceDB-backed [`kb_core::VectorStore`] for kb.
+//!
+//! Stores per-model Lance tables under `config.storage.vector_dir/`
+//! (`chunk_embeddings_<model>_<dim>.lance/`). `upsert` runs the
+//! SQLite-first / Lance-second two-phase write described in design
+//! §5.6: phase 1 stages `embedding_records` rows at `status='pending'`,
+//! phase 2 issues a Lance `MergeInsert` keyed on `chunk_id`, phase 3
+//! flips the rows to `status='committed'`. `search` joins against
+//! `embedding_records WHERE status='committed'` so partial-write Lance
+//! rows never surface to callers; if the process crashes between phase
+//! 2 and phase 3 (or phase 2 itself fails), the next `upsert` call
+//! retries the still-pending rows idempotently because Lance MergeInsert
+//! dedupes on `chunk_id`.
+//!
+//! Sync / async bridge: `VectorStore` is a sync trait (§7.2) and
+//! LanceDB's Rust API is async-only. We own a private current-thread
+//! `tokio::runtime::Runtime` and `block_on` per trait method. The
+//! tradeoff is documented inline: a multi-thread runtime would let two
+//! upserts run concurrently, but kb-app's job scheduler already
+//! serializes vector ops, and current-thread avoids the worker pool a
+//! multi-thread runtime spawns (one thread per CPU core by default).
+//!
+//! See `docs/superpowers/specs/2026-04-27-kb-final-form-design.md`
+//! §5.6 (embedding_records DDL), §6.3 (lancedb table naming),
+//! §7.2 (VectorStore), §9 (versioning).
+
+mod arrow_batch;
+mod paths;
+mod store;
+
+pub use store::LanceVectorStore;
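
Note on the two-phase write described in the module docs: the toy model below is a std-only sketch of the pending → vector write → committed sequence and of why a retry after a crash is safe. It is not the crate's implementation. `ToyStore`, `crash_after`, and `visible` are hypothetical; `records` stands in for `embedding_records` and `vectors` for the Lance table, with a set insert playing the role of a merge keyed on `chunk_id`.

```rust
use std::collections::{HashMap, HashSet};

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Status {
    Pending,
    Committed,
}

/// Toy stand-in for the SQLite + Lance pair (illustration only).
#[derive(Default)]
struct ToyStore {
    records: HashMap<String, Status>,
    vectors: HashSet<String>,
}

impl ToyStore {
    /// `crash_after` simulates a failure right after phase 1 or phase 2.
    fn upsert(&mut self, ids: &[&str], crash_after: Option<u8>) -> Result<(), String> {
        // Phase 1: stage rows as pending.
        for id in ids {
            self.records.insert(id.to_string(), Status::Pending);
        }
        if crash_after == Some(1) {
            return Err("failed after phase 1".to_string());
        }
        // Phase 2: keyed merge; inserting an existing key changes nothing.
        for id in ids {
            self.vectors.insert(id.to_string());
        }
        if crash_after == Some(2) {
            return Err("failed after phase 2".to_string());
        }
        // Phase 3: flip rows to committed.
        for id in ids {
            self.records.insert(id.to_string(), Status::Committed);
        }
        Ok(())
    }

    /// Search-side view: only vectors whose record is committed surface.
    fn visible(&self) -> Vec<String> {
        let mut out: Vec<String> = self
            .vectors
            .iter()
            .filter(|id| self.records.get(*id) == Some(&Status::Committed))
            .cloned()
            .collect();
        out.sort();
        out
    }
}
```

In this model a failure after phase 2 leaves a vector on disk whose record is still pending, so `visible` hides it; calling `upsert` again with the same ids completes the commit without duplicating the vector.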
--- a/crates/kb-store-vector/src/paths.rs
+++ b/crates/kb-store-vector/src/paths.rs
@@ -0,0 +1,119 @@
+//! Path expansion + table-name sanitization.
+//!
+//! Mirrors `kb-store-sqlite::store::expand_data_dir` and
+//! `kb-embed-local::expand_path` so the three crates resolve
+//! `${XDG_DATA_HOME:-…}` / leading `~` / `{data_dir}` identically. A
+//! shared helper would live in `kb-config`, but the task spec forbids
+//! adding new types to `kb-config`, so we keep a private clone.
+
+use std::path::PathBuf;
+
+/// Expand `{data_dir}` → `data_dir`, `${XDG_DATA_HOME:-…}` → env or
+/// default, leading `~` → `$HOME`. Pass an empty `data_dir` when
+/// resolving `data_dir` itself (the `{data_dir}` substitution is a
+/// no-op in that case).
+pub(crate) fn expand_path(raw: &str, data_dir: &str) -> PathBuf {
+    let mut s = raw.to_string();
+
+    if !data_dir.is_empty() {
+        s = s.replace("{data_dir}", data_dir);
+    }
+
+    // ${XDG_DATA_HOME:-~/.local/share}: env override, else default after `:-`.
+    if let Some(start) = s.find("${XDG_DATA_HOME") {
+        if let Some(rel_end) = s[start..].find('}') {
+            let end = start + rel_end + 1;
+            let inner = &s[start + 2..end - 1];
+            let replacement = match std::env::var("XDG_DATA_HOME") {
+                Ok(v) if !v.is_empty() => v,
+                _ => {
+                    if let Some((_, default)) = inner.split_once(":-") {
+                        default.to_string()
+                    } else {
+                        String::new()
+                    }
+                }
+            };
+            s.replace_range(start..end, &replacement);
+        }
+    }
+
+    if let Some(rest) = s.strip_prefix('~') {
+        if let Some(home) = std::env::var_os("HOME").map(PathBuf::from) {
+            return home.join(rest.trim_start_matches('/'));
+        }
+    }
+
+    PathBuf::from(s)
+}
+
+/// Build the per-model Lance table name. Per design §6.3:
+/// `chunk_embeddings_<model>_<dim>.lance`. Model IDs may contain
+/// characters that are illegal in directory names on some filesystems
+/// (Windows reserved chars, `/`, …) — squash anything outside
+/// `[A-Za-z0-9-]` to `_` so the name is portable.
+///
+/// LanceDB's `connect(uri).open_table(name)` resolves `name` against
+/// the connection root and appends the `.lance` directory suffix
+/// itself when it materializes the table, so we pass the bare logical
+/// name (`chunk_embeddings_<model>_<dim>`) and let Lance manage the
+/// suffix. The spec uses the suffixed form because it describes the
+/// on-disk path; both forms refer to the same table.
+pub(crate) fn lance_table_name(model_id: &str, dim: usize) -> String {
+    let sanitized = sanitize_model_id(model_id);
+    format!("chunk_embeddings_{sanitized}_{dim}")
+}
+
+/// Replace anything outside `[A-Za-z0-9-]` with `_`. Idempotent.
+pub(crate) fn sanitize_model_id(model_id: &str) -> String {
+    model_id
+        .chars()
+        .map(|c| {
+            if c.is_ascii_alphanumeric() || c == '-' {
+                c
+            } else {
+                '_'
+            }
+        })
+        .collect()
+}
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+
+    #[test]
+    fn sanitize_replaces_path_separators() {
+        assert_eq!(sanitize_model_id("BAAI/bge-small-en"), "BAAI_bge-small-en");
+    }
+
+    #[test]
+    fn sanitize_keeps_dash_and_alpha_num() {
+        assert_eq!(sanitize_model_id("e5-small-v2"), "e5-small-v2");
+    }
+
+    #[test]
+    fn sanitize_squashes_dot_and_colon() {
+        assert_eq!(sanitize_model_id("model.v1:fast"), "model_v1_fast");
+    }
+
+    #[test]
+    fn lance_table_name_format() {
+        assert_eq!(
+            lance_table_name("BAAI/bge-small-en", 384),
+            "chunk_embeddings_BAAI_bge-small-en_384"
+        );
+    }
+
+    #[test]
+    fn expand_path_substitutes_data_dir() {
+        let p = expand_path("{data_dir}/lancedb", "/tmp/kbtest");
+        assert_eq!(p, PathBuf::from("/tmp/kbtest/lancedb"));
+    }
+
+    #[test]
+    fn expand_path_passthrough_absolute() {
+        let p = expand_path("/abs/dir", "/ignored");
+        assert_eq!(p, PathBuf::from("/abs/dir"));
+    }
+}
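
Note on how the logical table name relates to the on-disk directory: the sketch below is std-only and rests on the assumption, stated in the doc comments above, that LanceDB appends `.lance` to the logical name under the connection root. `table_dir` is a hypothetical helper for illustration (for example in diagnostics); the crate itself only passes the bare name to LanceDB.

```rust
use std::path::{Path, PathBuf};

/// Same rule as `sanitize_model_id`: anything outside `[A-Za-z0-9-]`
/// becomes `_`.
fn sanitize(model_id: &str) -> String {
    model_id
        .chars()
        .map(|c| if c.is_ascii_alphanumeric() || c == '-' { c } else { '_' })
        .collect()
}

/// Hypothetical helper: the directory we would expect on disk, assuming
/// LanceDB stores each table as `<name>.lance` under the connection root.
fn table_dir(vector_dir: &str, model_id: &str, dim: usize) -> PathBuf {
    let name = format!("chunk_embeddings_{}_{}.lance", sanitize(model_id), dim);
    Path::new(vector_dir).join(name)
}
```

For `vector_dir = "/tmp/kb/lancedb"`, model `BAAI/bge-small-en`, and `dim = 384`, this yields a path ending in `chunk_embeddings_BAAI_bge-small-en_384.lance`.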
--- a/crates/kb-store-vector/src/store.rs
+++ b/crates/kb-store-vector/src/store.rs
@@ -0,0 +1,551 @@
+//! `LanceVectorStore` — `kb_core::VectorStore` impl over LanceDB.
+//!
+//! See module-level docs in `lib.rs` for the high-level shape (two-phase
+//! upsert, sync/async bridge, table layout).
+
+use std::collections::HashSet;
+use std::path::PathBuf;
+use std::sync::Arc;
+
+use anyhow::{Context, Result};
+use arrow_array::{Array, Float32Array, RecordBatch, StringArray};
+use arrow_schema::SchemaRef;
+use futures::TryStreamExt;
+use kb_core::{
+    ChunkId, DocumentId, EmbeddingModelId, IndexId, SearchFilters,
+    VectorHit, VectorRecord, VectorStore,
+};
+use kb_store_sqlite::{EmbeddingRecordRow, SqliteStore};
+use lancedb::Connection;
+use lancedb::query::{ExecutableQuery, QueryBase};
+use serde_json::json;
+use time::OffsetDateTime;
+use tokio::runtime::{Builder as RuntimeBuilder, Runtime};
+
+use crate::arrow_batch::{build_batch, schema_for, schema_params_hash};
+use crate::paths::{expand_path, lance_table_name};
+
+/// Overfetch multiplier: when post-filtering Lance results against
+/// SQLite-side filters we ask for `2 * k` candidates so a moderately
+/// selective filter still returns `k` hits. P3-3 spec line 138 caps
+/// the doubling at this multiplier; deeper retries are out of scope.
+const OVERFETCH_MULTIPLIER: usize = 2;
+
+/// `IndexId` collection label per design §4.2.
+const INDEX_COLLECTION: &str = "chunk_embeddings";
+
+/// `IndexId` kind label — flat cosine for v1 (§7.2 + spec line 85).
+const INDEX_KIND: &str = "flat";
+
+/// `IndexVersion` token. The schema doesn't expose IndexVersion as a
+/// dimension we vary per call, but `id_for_index` requires one; pin to
+/// `v1` so re-runs produce stable IDs.
+const INDEX_VERSION: &str = "v1";
+
+/// Lance VectorStore.
+///
+/// Holds a single `lancedb::Connection` opened against
+/// `config.storage.vector_dir/`. The connection is `Arc`-backed
+/// internally, so clones are cheap, and it is reused across
+/// `ensure_table` / `upsert` / `search`. The `tokio::Runtime` is
+/// current-thread; multi-thread would buy concurrency we don't
+/// currently exploit (kb-app job scheduler serializes vector ops) at
+/// the cost of a worker-thread pool (one thread per core by default).
+///
+/// # Async context
+///
+/// `LanceVectorStore` owns a private `tokio::runtime::Runtime` and
+/// drives every `VectorStore` trait method through `runtime.block_on`.
+/// **Do NOT construct or call any of these methods from inside another
+/// tokio runtime context** — `block_on` panics with `"Cannot start a
+/// runtime from within a runtime"` in that case. `kb-app`'s job
+/// scheduler is synchronous so this is safe today; if a future caller
+/// wants to embed `LanceVectorStore` inside an async server they must
+/// wrap calls in `tokio::task::spawn_blocking` (or move to an
+/// async-native `VectorStore` impl).
+pub struct LanceVectorStore {
+    runtime: Runtime,
+    connection: Connection,
+    sqlite: Arc<SqliteStore>,
+    /// Resolved absolute path to the Lance root. Kept for diagnostics
+    /// only — the `Connection` already knows it.
+    #[allow(dead_code)]
+    vector_dir: PathBuf,
+}
+
+impl LanceVectorStore {
+    /// Open (or create) the Lance directory under
+    /// `config.storage.vector_dir`, build a current-thread tokio
+    /// runtime, and return a ready-to-use store. Migrations on the
+    /// SQLite side must already have been applied (`run_migrations`)
+    /// — this constructor does not touch the SQLite schema.
+    ///
+    /// **Caveat:** internally calls `runtime.block_on` to open the
+    /// Lance connection. Calling this from inside another tokio
+    /// runtime context will panic with `"Cannot start a runtime from
+    /// within a runtime"`. See the struct-level `# Async context`
+    /// section.
+    pub fn new(config: &kb_config::Config, sqlite: Arc<SqliteStore>) -> Result<Self> {
+        let data_dir = expand_path(&config.storage.data_dir, "");
+        let vector_dir =
+            expand_path(&config.storage.vector_dir, &data_dir.to_string_lossy());
+        std::fs::create_dir_all(&vector_dir)
+            .with_context(|| format!("create vector_dir {}", vector_dir.display()))?;
+
+        // current-thread runtime: see module docs. Multi-thread would
+        // spawn a worker-thread pool (one per core by default) we don't use.
+        let runtime = RuntimeBuilder::new_current_thread()
+            .enable_all()
+            .build()
+            .context("build tokio runtime for kb-store-vector")?;
+
+        let uri = vector_dir.to_string_lossy().into_owned();
+        let connection = runtime
+            .block_on(async {
+                lancedb::connect(&uri)
+                    .execute()
+                    .await
+                    .context("lancedb::connect")
+            })?;
+
+        tracing::debug!(
+            target: "kb-store-vector",
+            vector_dir = %vector_dir.display(),
+            "opened LanceVectorStore"
+        );
+
+        Ok(Self {
+            runtime,
+            connection,
+            sqlite,
+            vector_dir,
+        })
+    }
+
+    /// Open or create the Lance table with the current schema. Returns
+    /// a handle the caller can use for queries.
+    async fn ensure_table_async(
+        connection: &Connection,
+        table_name: &str,
+        dim: usize,
+    ) -> Result<lancedb::Table> {
+        match connection.open_table(table_name).execute().await {
+            Ok(t) => Ok(t),
+            Err(lancedb::Error::TableNotFound { .. }) => {
+                let schema = schema_for(dim);
+                let table = connection
+                    .create_empty_table(table_name, schema)
+                    .execute()
+                    .await
+                    .context("create_empty_table")?;
+                tracing::info!(
+                    target: "kb-store-vector",
+                    table = table_name,
+                    dim,
+                    "created Lance table"
+                );
+                Ok(table)
+            }
+            Err(e) => Err(anyhow::Error::from(e)).context("open_table"),
+        }
+    }
+
+    /// Validate that the on-disk Lance table's `embedding` column is a
+    /// `FixedSizeList` of width `dim`. Used by `upsert` to fail fast on
+    /// a dim mismatch BEFORE any phase-1 SQLite write lands.
+    fn check_dim(table_schema: &SchemaRef, dim: usize) -> Result<()> {
+        let field = table_schema
+            .field_with_name("embedding")
+            .context("table missing 'embedding' column")?;
+        match field.data_type() {
+            arrow_schema::DataType::FixedSizeList(_, table_dim) => {
+                if (*table_dim as usize) != dim {
+                    anyhow::bail!(
+                        "dimension mismatch: table has dim {}, records have dim {}",
+                        table_dim,
+                        dim
+                    );
+                }
+                Ok(())
+            }
+            other => anyhow::bail!(
+                "embedding column has unexpected Arrow type {:?}",
+                other
+            ),
+        }
+    }
+}
+
+impl VectorStore for LanceVectorStore {
+    fn ensure_table(
+        &self,
+        model: &EmbeddingModelId,
+        dim: usize,
+    ) -> Result<IndexId> {
+        let table_name = lance_table_name(&model.0, dim);
+        // The trait method only needs the IndexId — we don't return the
+        // Lance handle. Open (or create) the table to enforce idempotence
+        // (a second call with the same params must succeed and yield
+        // the same IndexId).
+        self.runtime.block_on(async {
+            Self::ensure_table_async(&self.connection, &table_name, dim).await
+        })?;
+
+        let params_hash = schema_params_hash(dim);
+        let id = kb_core::id_for_index(
+            INDEX_COLLECTION,
+            model,
+            dim,
+            &kb_core::IndexVersion(INDEX_VERSION.to_string()),
+            INDEX_KIND,
+            &params_hash,
+        );
+        Ok(id)
+    }
+
+    fn upsert(&self, recs: &[VectorRecord]) -> Result<()> {
+        if recs.is_empty() {
+            return Ok(());
+        }
+
+        // All records must share (model_id, model_version, dimensions) and
+        // hold `dimensions` floats. kb-app batches by model; misuse fails here.
+        let model_id = recs[0].model_id.clone();
+        let model_version = recs[0].model_version.clone();
+        let dim = recs[0].dimensions;
+        for r in recs {
+            if r.model_id != model_id
+                || r.model_version != model_version
+                || r.dimensions != dim
+                || r.vector.len() != dim
+            {
+                anyhow::bail!(
+                    "kb-store-vector::upsert: mixed (model_id, model_version, dim) or vector length != dim; caller must bucket per table"
+                );
+            }
+        }
+
+        let table_name = lance_table_name(&model_id.0, dim);
+
+        // Open (or create) the Lance table FIRST and check its on-disk
+        // dim against what the records claim. A mismatch must error
+        // before any phase-1 SQLite write — spec line 94: "Dimension
+        // mismatch returns Error from upsert and writes nothing."
+        let table = self.runtime.block_on(async {
+            Self::ensure_table_async(&self.connection, &table_name, dim).await
+        })?;
+        let table_schema = self
+            .runtime
+            .block_on(async { table.schema().await.context("read table schema") })?;
+        Self::check_dim(&table_schema, dim)?;
+
+        // Phase 1: stage embedding_records rows at status='pending'.
+        let now = OffsetDateTime::now_utc();
+        let pending_rows: Vec<EmbeddingRecordRow> = recs
+            .iter()
+            .map(|r| EmbeddingRecordRow {
+                embedding_id: r.embedding_id.0.clone(),
+                chunk_id: r.chunk_id.0.clone(),
+                model_id: r.model_id.0.clone(),
+                model_version: r.model_version.0.clone(),
+                dimensions: r.dimensions,
+                lance_table: table_name.clone(),
+                created_at: now,
+            })
+            .collect();
+        self.sqlite
+            .put_embedding_records_pending(&pending_rows)
+            .context("phase 1: stage pending embedding_records")?;
+
+        // Phase 2: Lance MergeInsert keyed on chunk_id.
+        let batch = build_batch(recs, dim, now)?;
+        merge_insert_batch(&self.runtime, &table, batch)
+            .context("phase 2: Lance MergeInsert")?;
+
+        // Phase 3: flip rows to status='committed'. If we crashed
+        // between phase 2 and phase 3, the rows stay 'pending' and a
+        // future upsert call retries them (MergeInsert is keyed on
+        // chunk_id, so the retry rewrites the same rows idempotently).
+        let embedding_ids: Vec<String> =
+            recs.iter().map(|r| r.embedding_id.0.clone()).collect();
+        self.sqlite
+            .mark_embedding_records_committed(&embedding_ids)
+            .context("phase 3: mark embedding_records committed")?;
+
+        tracing::info!(
+            target: "kb-store-vector",
+            table = %table_name,
+            rows = recs.len(),
+            "upsert committed"
+        );
+        Ok(())
+    }
+
+    fn search(
+        &self,
+        query_vec: &[f32],
+        k: usize,
+        filters: &SearchFilters,
+    ) -> Result<Vec<VectorHit>> {
+        if k == 0 {
+            return Ok(Vec::new());
+        }
+
+        // We need to know which table to query. SearchFilters doesn't
+        // carry a model_id (the trait doesn't expose one to the
+        // caller), so we scan known tables on disk and pick the one
+        // matching `query_vec.len()`. In v1 there's typically one
+        // model in play; if there are several we pick the first match.
+        let dim = query_vec.len();
+        let table_name = match self
+            .runtime
+            .block_on(async { find_matching_table(&self.connection, dim).await })?
+        {
+            Some(name) => name,
+            None => {
+                tracing::debug!(
+                    target: "kb-store-vector",
+                    dim,
+                    "search: no Lance table matches query dim — returning empty"
+                );
+                return Ok(Vec::new());
+            }
+        };
+
+        // Pre-fetch `k * OVERFETCH_MULTIPLIER` Lance rows; we'll filter
+        // against SQLite afterwards. If filters are empty we still
+        // over-fetch to exclude tombstoned / pending rows.
+        let overfetch = k.saturating_mul(OVERFETCH_MULTIPLIER).max(k);
+        let raw_hits = self.runtime.block_on(async {
+            let table = match self.connection.open_table(&table_name).execute().await
+            {
+                Ok(t) => t,
+                Err(lancedb::Error::TableNotFound { .. }) => return Ok(Vec::new()),
+                Err(e) => return Err(anyhow::Error::from(e)),
+            };
+
+            let stream = table
+                .vector_search(query_vec)
+                .context("vector_search")?
+                .distance_type(lancedb::DistanceType::Cosine)
+                .limit(overfetch)
+                .execute()
+                .await
+                .context("execute vector query")?;
+            let batches: Vec<RecordBatch> =
+                stream.try_collect().await.context("collect batches")?;
+            Result::<Vec<RecordBatch>>::Ok(batches)
+        })?;
+
+        let candidates = decode_lance_hits(&raw_hits)?;
+
+        // Filter against embedding_records (status='committed') and
+        // documents (tags / lang / path / trust). For the empty filter
+        // case the join still excludes tombstoned / pending rows.
+        // The `filter_chunks` helper lives in kb-store-sqlite (the
+        // crate that owns the schema), so this crate doesn't need its
+        // own rusqlite / globset direct deps.
+        let candidate_ids: Vec<ChunkId> = {
+            // Deduplicate — Lance result batches can in principle
+            // repeat a chunk_id across batches; the JOIN is most
+            // efficient if we ask once per id.
+            let mut seen = HashSet::new();
+            candidates
+                .iter()
+                .filter(|c| seen.insert(c.chunk_id.0.clone()))
+                .map(|c| c.chunk_id.clone())
+                .collect()
+        };
+        let allowed_set: HashSet<String> = self
+            .sqlite
+            .filter_chunks(&candidate_ids, filters)
+            .context("post-filter chunks via kb-store-sqlite")?
+            .into_iter()
+            .map(|c| c.0)
+            .collect();
+
+        let mut hits: Vec<VectorHit> = candidates
+            .into_iter()
+            .filter(|c| allowed_set.contains(&c.chunk_id.0))
+            .take(k)
+            .map(LanceCandidate::into_hit)
+            .collect();
+        // Re-rank by score desc to give callers a consistent ordering
+        // regardless of post-filter shuffling.
+        hits.sort_by(|a, b| {
+            b.score
+                .partial_cmp(&a.score)
+                .unwrap_or(std::cmp::Ordering::Equal)
+        });
+        Ok(hits)
+    }
+}
+
+/// One Lance row decoded from a query batch, paired with the converted
+/// score and the fields needed to build the JSON payload. We keep
+/// `chunk_id` as a typed field so the SQLite filter pass can JOIN
+/// against it without parsing a payload.
+struct LanceCandidate {
+    chunk_id: ChunkId,
+    doc_id: DocumentId,
+    text: String,
+    heading_path: Vec<String>,
+    score: f32,
+}
+
+impl LanceCandidate {
+    fn into_hit(self) -> VectorHit {
+        let payload = json!({
+            "doc_id": self.doc_id.0,
+            "text": self.text,
+            "heading_path": self.heading_path,
+        });
+        VectorHit {
+            chunk_id: self.chunk_id,
+            score: self.score,
+            payload,
+        }
+    }
+}
+
+/// Decode a list of Lance result batches into typed candidates.
+/// Lance's vector query attaches a `_distance: Float32` column; we
+/// convert to similarity via `1 - distance` then shift to `[0, 1]`
+/// via `(sim + 1) / 2` per spec line 96. NaN distances get score 0
+/// (with a warn log).
+fn decode_lance_hits(batches: &[RecordBatch]) -> Result<Vec<LanceCandidate>> {
+    let mut out = Vec::new();
+    for batch in batches {
+        let chunk_ids = batch
+            .column_by_name("chunk_id")
+            .context("missing chunk_id col")?
+            .as_any()
+            .downcast_ref::<StringArray>()
+            .context("chunk_id wrong type")?;
+        let doc_ids = batch
+            .column_by_name("doc_id")
+            .context("missing doc_id col")?
+            .as_any()
+            .downcast_ref::<StringArray>()
+            .context("doc_id wrong type")?;
+        let texts = batch
+            .column_by_name("text")
+            .context("missing text col")?
+            .as_any()
+            .downcast_ref::<StringArray>()
+            .context("text wrong type")?;
+        let heading_path_str = batch
+            .column_by_name("heading_path")
+            .context("missing heading_path col")?
+            .as_any()
+            .downcast_ref::<StringArray>()
+            .context("heading_path wrong type")?;
+        let distances = batch
+            .column_by_name("_distance")
+            .context("missing _distance col")?
+            .as_any()
+            .downcast_ref::<Float32Array>()
+            .context("_distance wrong type")?;
+
+        for i in 0..batch.num_rows() {
+            let dist = distances.value(i);
+            let score = score_from_distance(dist);
+            let heading_path: Vec<String> = serde_json::from_str(
+                heading_path_str.value(i),
+            )
+            .unwrap_or_default();
+            out.push(LanceCandidate {
+                chunk_id: ChunkId(chunk_ids.value(i).to_string()),
+                doc_id: DocumentId(doc_ids.value(i).to_string()),
+                text: texts.value(i).to_string(),
+                heading_path,
+                score,
+            });
+        }
+    }
+    Ok(out)
+}
+
+/// Convert a cosine distance (LanceDB returns `1 - cosine_similarity`,
+/// which lies in `[0, 2]`) to a `[0, 1]` score via
+/// `score = ((1 - distance) + 1) / 2`. Per spec line 96 the shift
+/// (rather than a clamp) preserves ordering between unrelated and
+/// opposite vectors. NaN, which Lance can produce when one side is
+/// the all-zero vector, collapses to 0 with a warn.
+fn score_from_distance(distance: f32) -> f32 {
+    if distance.is_nan() {
+        tracing::warn!(
+            target: "kb-store-vector",
+            "NaN cosine distance from Lance — coercing to score 0"
+        );
+        return 0.0;
+    }
+    let sim = 1.0 - distance;
+    (sim + 1.0) / 2.0
+}
+
+/// Find a Lance table whose embedding column is FixedSizeList<Float32, dim>.
+async fn find_matching_table(
+    connection: &Connection,
+    dim: usize,
+) -> Result<Option<String>> {
+    let names = connection
+        .table_names()
+        .execute()
+        .await
+        .context("table_names")?;
+    for name in names {
+        if !name.starts_with("chunk_embeddings_") {
+            continue;
+        }
+        match connection.open_table(&name).execute().await {
+            Ok(t) => {
+                let schema = t.schema().await.context("schema for table")?;
+                if let Ok(field) = schema.field_with_name("embedding") {
+                    if let arrow_schema::DataType::FixedSizeList(_, table_dim) =
+                        field.data_type()
+                    {
+                        if (*table_dim as usize) == dim {
+                            return Ok(Some(name));
+                        }
+                    }
+                }
+            }
+            Err(e) => {
+                tracing::warn!(
+                    target: "kb-store-vector",
+                    table = %name,
+                    error = %e,
+                    "search: skipped unopenable table"
+                );
+            }
+        }
+    }
+    Ok(None)
+}
+
+/// Run the Lance MergeInsert under our embedded runtime. Pulled out
+/// of `upsert` so the trait method stays compact.
+fn merge_insert_batch(
+    runtime: &Runtime,
+    table: &lancedb::Table,
+    batch: RecordBatch,
+) -> Result<()> {
+    let schema = batch.schema();
+    runtime.block_on(async move {
+        let reader = arrow_array::RecordBatchIterator::new(
+            vec![Ok(batch)].into_iter(),
+            schema,
+        );
+        let mut builder = table.merge_insert(&["chunk_id"]);
+        builder
+            .when_matched_update_all(None)
+            .when_not_matched_insert_all();
+        builder
+            .execute(Box::new(reader))
+            .await
+            .context("MergeInsert execute")?;
+        Result::<()>::Ok(())
+    })
+}
+
--- a/crates/kb-store-vector/tests/common/mod.rs
+++ b/crates/kb-store-vector/tests/common/mod.rs
@@ -0,0 +1,185 @@
+//! Shared scaffolding for kb-store-vector integration tests.
+//!
+//! # Test policy
+//!
+//! Integration tests in this crate are marked `#[ignore]` and require
+//! AVX-capable hardware. They are excluded from the default `cargo
+//! test -p kb-store-vector` lane and only run when explicitly opted
+//! in:
+//!
+//! ```text
+//! cargo test -p kb-store-vector -- --ignored
+//! ```
+//!
+//! The reason: LanceDB's f32 SIMD path uses unconditional AVX
+//! intrinsics (`__m256` in `lance-linalg::simd::f32`). On x86_64
+//! CPUs without AVX support — notably QEMU's default `qemu64` model
+//! in CI sandboxes and some bare-metal dev boxes — those instructions
+//! trigger `SIGILL: illegal instruction` at the first `vector_search`
+//! call. Rather than silently turn that into a "passing" test (which
+//! it isn't), we gate the integration suite behind `#[ignore]` and
+//! call [`require_avx_or_panic`] inside each test body so that an
+//! `--ignored` invocation on a non-AVX host fails loudly rather than
+//! crashing later inside a Lance kernel.
+//!
+//! This mirrors P3-2's `#[ignore]` policy on tests that require a
+//! model download — both are CI-lane decisions, not silent skips.
+//!
+//! Each test owns a `TempDir` (vector_dir + sqlite db live underneath
+//! it), a fully-migrated `SqliteStore`, and a `LanceVectorStore`
+//! pointed at both. We seed `documents` / `chunks` rows directly via
+//! SQL (rather than going through `DocumentStore::put_document`) so
+//! the tests stay independent of kb-parse-md / kb-normalize / kb-chunk
+//! and so we can construct adversarial fixtures (filtered tags,
+//! mismatched langs) without reproducing a Markdown round-trip.
+
+#![allow(dead_code)]
+
+use std::path::PathBuf;
+use std::sync::Arc;
+
+use kb_config::Config;
+use kb_core::{
+    ChunkId, DocumentId, EmbeddingId, EmbeddingModelId, EmbeddingVersion, VectorRecord,
+};
+use kb_store_sqlite::SqliteStore;
+use kb_store_vector::LanceVectorStore;
+use rusqlite::params;
+use tempfile::TempDir;
+
+/// Panic if the host CPU lacks AVX. Called from every `#[ignore]`-d
+/// integration test body so that `cargo test -- --ignored` on a
+/// non-AVX host fails loudly with a clear message instead of crashing
+/// later inside a Lance SIMD kernel with `SIGILL`.
+///
+/// On non-x86_64 hosts this is a no-op (Lance's AVX requirement is
+/// x86-only; ARM/Apple Silicon paths use different intrinsics that
+/// the workspace doesn't currently target).
+pub fn require_avx_or_panic() {
+    #[cfg(target_arch = "x86_64")]
+    {
+        if !std::is_x86_feature_detected!("avx") {
+            panic!(
+                "kb-store-vector integration test requires AVX-capable hardware; \
+                 host CPU lacks AVX. Run on an AVX-capable machine. \
+                 See crates/kb-store-vector/tests/common/mod.rs."
+            );
+        }
+    }
+}
+
+pub struct TestEnv {
+    pub temp: TempDir,
+    pub config: Config,
+    pub sqlite: Arc<SqliteStore>,
+    pub vector: LanceVectorStore,
+}
+
+impl TestEnv {
+    pub fn new() -> Self {
+        let temp = tempfile::tempdir().expect("tempdir");
+        let mut config = Config::defaults();
+        config.storage.data_dir = temp.path().to_string_lossy().into_owned();
+        let sqlite = SqliteStore::open(&config).unwrap();
+        sqlite.run_migrations().unwrap();
+        let sqlite = Arc::new(sqlite);
+        let vector = LanceVectorStore::new(&config, sqlite.clone()).unwrap();
+        Self {
+            temp,
+            config,
+            sqlite,
+            vector,
+        }
+    }
+
+    pub fn data_dir(&self) -> PathBuf {
+        self.temp.path().to_path_buf()
+    }
+
+    /// Insert minimum (asset, document, chunk) rows so phase-1
+    /// embedding_records inserts don't trip the FK to chunks /
+    /// documents.
+    pub fn seed_chunk(
+        &self,
+        chunk_id: &str,
+        doc_id: &str,
+        workspace_path: &str,
+        lang: &str,
+        tags: &[&str],
+        trust_level: &str,
+    ) {
+        // Asset id derived from doc_id deterministically, so every
+        // document gets its own asset to keep things simple.
+        let asset_id = format!("a{}", &doc_id[..31]);
+        let conn = self.sqlite.read_conn();
+        conn.execute(
+            "INSERT OR IGNORE INTO assets (
+                asset_id, source_uri, workspace_path, media_type, byte_len,
+                checksum, storage_kind, storage_path, discovered_at
+             ) VALUES (?, ?, ?, ?, 0, ?, 'reference', ?, '1970-01-01T00:00:00Z')",
+            params![
+                asset_id,
+                format!("file://{workspace_path}"),
+                workspace_path,
+                "text/markdown",
+                "deadbeefdeadbeefdeadbeefdeadbeef",
+                workspace_path,
+            ],
+        )
+        .unwrap();
+        conn.execute(
+            "INSERT OR IGNORE INTO documents (
+                doc_id, asset_id, workspace_path, title, lang, source_type,
+                trust_level, parser_version, doc_version, schema_version,
+                metadata_json, provenance_json, created_at, updated_at
+             ) VALUES (?, ?, ?, NULL, ?, 'markdown', ?, 'v1', 1, 1, '{}', '{}',
+                       '1970-01-01T00:00:00Z', '1970-01-01T00:00:00Z')",
+            params![doc_id, asset_id, workspace_path, lang, trust_level],
+        )
+        .unwrap();
+        for t in tags {
+            conn.execute(
+                "INSERT OR IGNORE INTO document_tags (doc_id, tag) VALUES (?, ?)",
+                params![doc_id, t],
+            )
+            .unwrap();
+        }
+        conn.execute(
+            "INSERT OR IGNORE INTO chunks (
+                chunk_id, doc_id, text, heading_path_json, section_label,
+                source_spans_json, token_estimate, chunker_version,
+                policy_hash, block_ids_json, created_at
+             ) VALUES (?, ?, 'hi', '[]', NULL, '[]', 1, 'v1', 'h', '[]', '1970-01-01T00:00:00Z')",
+            params![chunk_id, doc_id],
+        )
+        .unwrap();
+    }
+}
+
+/// Build a deterministic test VectorRecord from a few simple inputs.
+/// `vector` is taken verbatim, `dimensions` is set from `vector.len()`.
+pub fn make_record(
+    chunk_idx: u8,
+    doc_idx: u8,
+    vector: Vec<f32>,
+    text: &str,
+    heading: &[&str],
+    model: &str,
+) -> VectorRecord {
+    let dim = vector.len();
+    let chunk_id = ChunkId(format!("{:032x}", 0x1100u32 + chunk_idx as u32));
+    let doc_id = DocumentId(format!("{:032x}", 0xd0c0u32 + doc_idx as u32));
+    let embedding_id =
+        EmbeddingId(format!("{:032x}", 0xeeee0000u32 + chunk_idx as u32));
+    VectorRecord {
+        chunk_id,
+        embedding_id,
+        vector,
+        doc_id,
+        text: text.to_string(),
+        heading_path: heading.iter().map(|s| s.to_string()).collect(),
+        model_id: EmbeddingModelId(model.to_string()),
+        model_version: EmbeddingVersion("v1".to_string()),
+        dimensions: dim,
+    }
+}
--- a/crates/kb-store-vector/tests/fixtures/vector/run-1.json
+++ b/crates/kb-store-vector/tests/fixtures/vector/run-1.json
@@ -0,0 +1,4 @@
+{
+  "_comment": "PLACEHOLDER — regenerate via `KB_UPDATE_SNAPSHOTS=1 cargo test -p kb-store-vector -- --ignored snapshot` on an AVX-capable host. Until then the snapshot test panics with a clear 'placeholder' message.",
+  "hits": []
+}
--- a/crates/kb-store-vector/tests/snapshot.rs
+++ b/crates/kb-store-vector/tests/snapshot.rs
@@ -0,0 +1,119 @@
+//! Snapshot test: a fixed corpus + fixed query produces a stable
+//! `Vec<VectorHit>` JSON. Pinning the snapshot here catches accidental
+//! drift in score scaling, payload shape, or top-k ordering.
+//!
+//! This test is `#[ignore]` and requires AVX-capable hardware. Run
+//! with `cargo test -p kb-store-vector -- --ignored snapshot`.
+//!
+//! The committed fixture at `tests/fixtures/vector/run-1.json` is a
+//! placeholder until first regenerated on AVX hardware. The test
+//! detects the placeholder via its `_comment` field and panics with
+//! a clear "regenerate me" message; see the `_comment` check near
+//! the end of the test body.
+
+use std::path::PathBuf;
+
+use kb_core::{SearchFilters, VectorStore};
+use serde_json::json;
+
+mod common;
+use common::{TestEnv, make_record, require_avx_or_panic};
+
+const MODEL: &str = "snapshot-model";
+
+#[test]
+#[ignore = "requires AVX-capable hardware (LanceDB)"]
+fn vector_hits_snapshot_run_1() {
+    require_avx_or_panic();
+    let env = TestEnv::new();
+    // Fixed deterministic corpus: 4 (approximately) unit-norm vectors,
+    // each with a known doc / chunk / heading. The query points
+    // squarely at chunk 0, so the expected ordering is 0, then the
+    // others by cosine distance from the query.
+    let corpus = vec![
+        (0u8, vec![1.0_f32, 0.0, 0.0, 0.0], "alpha", &["A"][..]),
+        (1u8, vec![0.95_f32, 0.31, 0.0, 0.0], "beta", &["A", "B"][..]),
+        (2u8, vec![0.0_f32, 1.0, 0.0, 0.0], "gamma", &["B"][..]),
+        (3u8, vec![0.0_f32, 0.0, 1.0, 0.0], "delta", &[][..]),
+    ];
+
+    let mut recs = Vec::new();
+    for (i, vec, text, headings) in &corpus {
+        let rec = make_record(*i, *i, vec.clone(), text, headings, MODEL);
+        env.seed_chunk(
+            &rec.chunk_id.0,
+            &rec.doc_id.0,
+            &format!("notes/{i}.md"),
+            "en",
+            &[],
+            "primary",
+        );
+        recs.push(rec);
+    }
+    env.vector.upsert(&recs).unwrap();
+
+    let q = vec![1.0_f32, 0.0, 0.0, 0.0];
+    let hits = env.vector.search(&q, 3, &SearchFilters::default()).unwrap();
+
+    // The snapshot pins:
+    //   - top-3 chunk_id ordering (by score desc)
+    //   - payload shape: { doc_id, text, heading_path }
+    //   - that scores live in [0, 1] (descending order is asserted below)
+    let actual = json!(
+        hits.iter().map(|h| json!({
+            "chunk_id": h.chunk_id.0,
+            "score_in_unit_interval": (0.0..=1.0).contains(&h.score),
+            "payload": h.payload,
+        })).collect::<Vec<_>>()
+    );
+
+    let fixture = PathBuf::from(env!("CARGO_MANIFEST_DIR"))
+        .join("tests")
+        .join("fixtures")
+        .join("vector")
+        .join("run-1.json");
+
+    if std::env::var_os("KB_UPDATE_SNAPSHOTS").is_some() {
+        std::fs::create_dir_all(fixture.parent().unwrap()).unwrap();
+        std::fs::write(&fixture, serde_json::to_string_pretty(&actual).unwrap())
+            .unwrap();
+        return;
+    }
+
+    let expected: serde_json::Value =
+        serde_json::from_str(&std::fs::read_to_string(&fixture).unwrap_or_else(
+            |_| panic!(
+                "missing snapshot fixture at {}; run with KB_UPDATE_SNAPSHOTS=1 to create",
+                fixture.display()
+            ),
+        ))
+        .unwrap();
+
+    // Refuse to silently "pass" when the fixture is the committed
+    // placeholder. The placeholder JSON carries a `_comment` field
+    // with regeneration instructions; production fixtures (a captured
+    // hits array) do not.
+    if expected.get("_comment").is_some() {
+        panic!(
+            "snapshot fixture is a placeholder — regenerate on AVX hardware then commit. \
+             Path: {}. To regenerate: \
+             `KB_UPDATE_SNAPSHOTS=1 cargo test -p kb-store-vector -- --ignored snapshot`.",
+            fixture.display()
+        );
+    }
+
+    assert_eq!(
+        actual, expected,
+        "snapshot drift; rerun with KB_UPDATE_SNAPSHOTS=1 to regenerate"
+    );
+
+    // Independent guard: scores must be non-increasing.
+    for w in hits.windows(2) {
+        assert!(
+            w[0].score >= w[1].score,
+            "scores not in descending order: {} then {}",
+            w[0].score,
+            w[1].score
+        );
+    }
+}
--- a/crates/kb-store-vector/tests/upsert_search.rs
+++ b/crates/kb-store-vector/tests/upsert_search.rs
@@ -0,0 +1,374 @@
+//! Integration tests for `LanceVectorStore` covering ensure_table,
+//! upsert, search, dimension mismatch, filters, model isolation, and
+//! determinism.
+//!
+//! Every test in this file is `#[ignore]` and requires an AVX-capable
+//! x86_64 host. Run with:
+//!
+//! ```text
+//! cargo test -p kb-store-vector -- --ignored
+//! ```
+//!
+//! See `tests/common/mod.rs` for the full rationale.
+
+use kb_core::{EmbeddingModelId, SearchFilters, VectorStore};
+use kb_store_sqlite::EmbeddingRecordRow;
+use rusqlite::params;
+use time::OffsetDateTime;
+
+mod common;
+use common::{TestEnv, make_record, require_avx_or_panic};
+
+const MODEL: &str = "test-model";
+
+/// Helper: produce a unit-norm 4-D vector pointing along one of the
+/// four coordinate axes. The mutually orthogonal directions keep cosine
+/// similarities cleanly distinct, so ordering tests avoid float jitter.
+fn dir(idx: u8) -> Vec<f32> {
+    match idx {
+        0 => vec![1.0, 0.0, 0.0, 0.0],
+        1 => vec![0.0, 1.0, 0.0, 0.0],
+        2 => vec![0.0, 0.0, 1.0, 0.0],
+        _ => vec![0.0, 0.0, 0.0, 1.0],
+    }
+}
+
+#[test]
+#[ignore = "requires AVX-capable hardware (LanceDB)"]
+fn ensure_table_idempotent_returns_same_index_id() {
+    require_avx_or_panic();
+    let env = TestEnv::new();
+    let model = EmbeddingModelId(MODEL.to_string());
+    let id1 = env.vector.ensure_table(&model, 4).unwrap();
+    let id2 = env.vector.ensure_table(&model, 4).unwrap();
+    assert_eq!(id1, id2);
+}
+
+#[test]
+#[ignore = "requires AVX-capable hardware (LanceDB)"]
+fn search_before_upsert_returns_empty() {
+    require_avx_or_panic();
+    let env = TestEnv::new();
+    let hits = env
+        .vector
+        .search(&dir(0), 5, &SearchFilters::default())
+        .unwrap();
+    assert!(hits.is_empty());
+}
+
+#[test]
+#[ignore = "requires AVX-capable hardware (LanceDB)"]
+fn upsert_ten_then_search_returns_five() {
+    require_avx_or_panic();
+    let env = TestEnv::new();
+    let mut recs = Vec::new();
+    for i in 0..10u8 {
+        // 4-D vectors clustered near dir(0) for the first half, dir(1)
+        // for the rest, with small per-row jitter so they stay
+        // distinct in the index.
+        let mut v = if i < 5 { dir(0) } else { dir(1) };
+        v[3] = (i as f32) * 0.001;
+        let rec = make_record(i, i, v, &format!("text-{i}"), &["A"], MODEL);
+        env.seed_chunk(
+            &rec.chunk_id.0,
+            &rec.doc_id.0,
+            &format!("notes/{i}.md"),
+            "en",
+            &[],
+            "primary",
+        );
+        recs.push(rec);
+    }
+    env.vector.upsert(&recs).unwrap();
+
+    // 1:1 alignment check: every record has a committed embedding row.
+    {
+        let conn = env.sqlite.read_conn();
+        let count: i64 = conn
+            .query_row(
+                "SELECT COUNT(*) FROM embedding_records WHERE status = 'committed'",
+                [],
+                |r| r.get(0),
+            )
+            .unwrap();
+        assert_eq!(count, 10);
+    }
+
+    let hits = env
+        .vector
+        .search(&dir(0), 5, &SearchFilters::default())
+        .unwrap();
+    assert_eq!(hits.len(), 5, "expected 5 hits, got {}", hits.len());
+
+    // Top hits should be from the first half (clustered around dir(0)).
+    // make_record lays chunk_idx into the low bits of `0x1100 + i`, so
+    // `chunk_idx = u32::from_str_radix(last4, 16) - 0x1100`. The first
+    // half (chunk_idx < 5) lives in 0x1100..=0x1104.
+    for h in &hits {
+        let suffix_hex = &h.chunk_id.0[h.chunk_id.0.len() - 4..];
+        let idx = u32::from_str_radix(suffix_hex, 16).unwrap();
+        let chunk_idx = idx - 0x1100;
+        assert!(
+            chunk_idx < 5,
+            "top-5 hit unexpectedly came from second cluster: idx={chunk_idx}"
+        );
+    }
+}
+
+#[test]
+#[ignore = "requires AVX-capable hardware (LanceDB)"]
+fn dimension_mismatch_errors_and_writes_nothing() {
+    require_avx_or_panic();
+    let env = TestEnv::new();
+    let model = EmbeddingModelId(MODEL.to_string());
+
+    // First populate a 4-D table with one row so it exists on disk.
+    let r0 = make_record(0, 0, dir(0), "first", &[], MODEL);
+    env.seed_chunk(&r0.chunk_id.0, &r0.doc_id.0, "notes/0.md", "en", &[], "primary");
+    env.vector.upsert(&[r0]).unwrap();
+    assert_eq!(env.vector.ensure_table(&model, 4).unwrap(), env.vector.ensure_table(&model, 4).unwrap());
+
+    // The table name bakes `dim` in, so an honest 8-D record would
+    // simply land in a separate 8-D table. To drive a real
+    // record-vs-table mismatch we build an 8-D vector but claim
+    // `dimensions = 4`: the record then resolves to the existing 4-D
+    // table while carrying an 8-D vector. Spec line 94: upsert must
+    // error and write nothing extra.
+    let mut bad = make_record(1, 1, vec![0.1_f32; 8], "second", &[], MODEL);
+    // Pretend this is a 4-D vector for table-name purposes; upsert
+    // must notice that vector.len() != dimensions and bail.
+    bad.dimensions = 4;
+    env.seed_chunk(&bad.chunk_id.0, &bad.doc_id.0, "notes/1.md", "en", &[], "primary");
+
+    let bad_chunk = bad.chunk_id.0.clone();
+    let err = env.vector.upsert(&[bad]).unwrap_err();
+    let msg = format!("{err:#}");
+    assert!(
+        msg.to_lowercase().contains("dim")
+            || msg.contains("does not match table dim"),
+        "unexpected error message: {msg}"
+    );
+
+    // The phase-1 row may have landed before phase 2 detected the
+    // mismatch — but the on-disk Lance table must NOT contain the
+    // bad record. So we assert that no `committed` row corresponds
+    // to chunk_id of the bad record.
+    let conn = env.sqlite.read_conn();
+    let committed: i64 = conn
+        .query_row(
+            "SELECT COUNT(*) FROM embedding_records WHERE chunk_id = ? AND status = 'committed'",
+            rusqlite::params![bad_chunk],
+            |r| r.get(0),
+        )
+        .unwrap();
+    assert_eq!(committed, 0, "bad record reached committed state despite dim mismatch");
+}
+
+#[test]
+#[ignore = "requires AVX-capable hardware (LanceDB)"]
+fn filter_tags_any_drops_non_matching_docs() {
+    require_avx_or_panic();
+    let env = TestEnv::new();
+
+    // Two docs: one with tag "ko-style", one without.
+    let r_a = make_record(0xaa, 0xaa, dir(0), "alpha", &[], MODEL);
+    let r_b = make_record(0xbb, 0xbb, dir(0), "beta", &[], MODEL);
+    env.seed_chunk(
+        &r_a.chunk_id.0,
+        &r_a.doc_id.0,
+        "notes/a.md",
+        "en",
+        &["ko-style"],
+        "primary",
+    );
+    env.seed_chunk(
+        &r_b.chunk_id.0,
+        &r_b.doc_id.0,
+        "notes/b.md",
+        "en",
+        &["other"],
+        "primary",
+    );
+    let expected_doc_id = r_a.doc_id.0.clone();
+    env.vector.upsert(&[r_a, r_b]).unwrap();
+
+    let filters = SearchFilters {
+        tags_any: vec!["ko-style".to_string()],
+        ..Default::default()
+    };
+    let hits = env.vector.search(&dir(0), 10, &filters).unwrap();
+    assert_eq!(hits.len(), 1, "expected only the tagged doc to match");
+    let payload = &hits[0].payload;
+    assert_eq!(payload["doc_id"], expected_doc_id);
+}
+
+#[test]
+#[ignore = "requires AVX-capable hardware (LanceDB)"]
+fn model_isolation_two_models_two_directories() {
+    require_avx_or_panic();
+    let env = TestEnv::new();
+    let r1 = make_record(0xaa, 0xaa, dir(0), "alpha", &[], "model-A");
+    env.seed_chunk(
+        &r1.chunk_id.0,
+        &r1.doc_id.0,
+        "notes/a.md",
+        "en",
+        &[],
+        "primary",
+    );
+    let chunk_id = r1.chunk_id.0.clone();
+    env.vector.upsert(&[r1]).unwrap();
+
+    // Same chunk_id, different model and embedding_id: should land in a separate table.
+    let mut r2 = make_record(0xaa, 0xaa, dir(0), "alpha", &[], "model-B");
+    r2.embedding_id = kb_core::EmbeddingId(
+        "ee01ee01ee01ee01ee01ee01ee01ee01".to_string(),
+    );
+    env.vector.upsert(&[r2]).unwrap();
+
+    // Two on-disk Lance directories, distinguished by table name.
+    let lancedb_root = env.data_dir().join("lancedb");
+    let entries: Vec<_> = std::fs::read_dir(&lancedb_root)
+        .unwrap()
+        .filter_map(Result::ok)
+        .map(|e| e.file_name().to_string_lossy().into_owned())
+        .collect();
+    let a_count = entries
+        .iter()
+        .filter(|e| e.contains("model-A"))
+        .count();
+    let b_count = entries
+        .iter()
+        .filter(|e| e.contains("model-B"))
+        .count();
+    assert!(a_count >= 1, "model-A table missing: {entries:?}");
+    assert!(b_count >= 1, "model-B table missing: {entries:?}");
+
+    // Two embedding_records rows for the same chunk_id, one per model.
+    let conn = env.sqlite.read_conn();
+    let count: i64 = conn
+        .query_row(
+            "SELECT COUNT(*) FROM embedding_records WHERE chunk_id = ?",
+            params![chunk_id],
+            |r| r.get(0),
+        )
+        .unwrap();
+    assert_eq!(count, 2);
+}
+
+#[test]
+#[ignore = "requires AVX-capable hardware (LanceDB)"]
+fn determinism_same_query_same_top_k() {
+    require_avx_or_panic();
+    let env = TestEnv::new();
+    let recs: Vec<_> = (0..6u8)
+        .map(|i| {
+            let mut v = dir(i % 4);
+            v[3] = f32::from(i) * 0.001;
+            let rec = make_record(i, i, v, &format!("t-{i}"), &[], MODEL);
+            env.seed_chunk(
+                &rec.chunk_id.0,
+                &rec.doc_id.0,
+                &format!("notes/{i}.md"),
+                "en",
+                &[],
+                "primary",
+            );
+            rec
+        })
+        .collect();
+    env.vector.upsert(&recs).unwrap();
+
+    let q = dir(0);
+    let h1 = env.vector.search(&q, 4, &SearchFilters::default()).unwrap();
+    let h2 = env.vector.search(&q, 4, &SearchFilters::default()).unwrap();
+    let ids1: Vec<_> = h1.iter().map(|h| h.chunk_id.0.clone()).collect();
+    let ids2: Vec<_> = h2.iter().map(|h| h.chunk_id.0.clone()).collect();
+    assert_eq!(ids1, ids2);
+}
+
+#[test]
+#[ignore = "requires AVX-capable hardware (LanceDB)"]
+fn upsert_retry_promotes_pending_to_committed() {
+    // Crash-recovery contract: a row that a prior batch already
+    // promoted to committed is left alone by phase 3, but a pending
+    // row gets retried and reaches committed once Lance accepts it.
+    // This test covers the pending-row half.
+    //
+    // Construction of the "crash" state:
+    //
+    //   1. Stage a row directly via the SQLite phase-1 helper
+    //      (`put_embedding_records_pending`). NO Lance write happens
+    //      here — this is exactly the on-disk state after a crash
+    //      between phase 1 and phase 2. Confirm the row is at
+    //      `status='pending'` before doing anything else.
+    //
+    //   2. Run `LanceVectorStore::upsert` with a `VectorRecord` whose
+    //      `embedding_id` matches the pending row. Phase 1's
+    //      `INSERT OR REPLACE` is idempotent here (same row payload),
+    //      phase 2 actually writes to Lance for the first time, and
+    //      phase 3 flips the row to 'committed'.
+    //
+    //   3. Verify status='committed' and vector_committed=1.
+    //
+    // This actually exercises the "rows stuck at pending get promoted
+    // on next upsert" semantics — the previous version pre-seeded via
+    // raw SQL but then the same upsert call overwrote the seed via
+    // INSERT OR REPLACE before phase 2 ran, so the recovery path
+    // never executed.
+    require_avx_or_panic();
+    let env = TestEnv::new();
+    let rec = make_record(0xaa, 0xaa, dir(0), "alpha", &[], MODEL);
+    let chunk_id = rec.chunk_id.0.clone();
+    let doc_id = rec.doc_id.0.clone();
+    let embedding_id = rec.embedding_id.0.clone();
+    env.seed_chunk(&chunk_id, &doc_id, "notes/a.md", "en", &[], "primary");
+
+    // Phase 1 only — go through the same kb-store-sqlite helper that
+    // `LanceVectorStore::upsert` uses internally. No Lance write
+    // happens, so this models "crashed between phase 1 and phase 2".
+    let pending_row = EmbeddingRecordRow {
+        embedding_id: embedding_id.clone(),
+        chunk_id: chunk_id.clone(),
+        model_id: MODEL.to_string(),
+        model_version: "v1".to_string(),
+        dimensions: 4,
+        lance_table: format!("chunk_embeddings_{MODEL}_4"),
+        created_at: OffsetDateTime::UNIX_EPOCH,
+    };
+    env.sqlite
+        .put_embedding_records_pending(std::slice::from_ref(&pending_row))
+        .unwrap();
+
+    // Sanity check (SQLite side only): the row is staged but NOT yet
+    // committed. Lance was never written, so it has no record of it.
+    {
+        let conn = env.sqlite.read_conn();
+        let (status, committed): (String, i64) = conn
+            .query_row(
+                "SELECT status, vector_committed FROM embedding_records WHERE embedding_id = ?",
+                params![embedding_id],
+                |r| Ok((r.get(0)?, r.get(1)?)),
+            )
+            .unwrap();
+        assert_eq!(status, "pending", "row should be at status=pending after phase-1-only");
+        assert_eq!(committed, 0);
+    }
+
+    // Now run upsert with the matching record. Phase 1's INSERT OR
+    // REPLACE is effectively a no-op (same row payload), phase 2 lands
+    // the Lance row for the first time, and phase 3 promotes the row
+    // to status='committed'.
+    env.vector.upsert(&[rec]).unwrap();
+
+    let conn = env.sqlite.read_conn();
+    let (status, committed): (String, i64) = conn
+        .query_row(
+            "SELECT status, vector_committed FROM embedding_records WHERE embedding_id = ?",
+            params![embedding_id],
+            |r| Ok((r.get(0)?, r.get(1)?)),
+        )
+        .unwrap();
+    assert_eq!(status, "committed");
+    assert_eq!(committed, 1);
+}
--- a/migrations/V003__embedding_status.sql
+++ b/migrations/V003__embedding_status.sql
@@ -0,0 +1,46 @@
+-- V003__embedding_status.sql — additive embedding lifecycle markers (§5.6).
+--
+-- P3-3 introduces a two-phase write to `embedding_records` paired with
+-- a Lance MergeInsert. Phase 1 inserts the row at `status='pending'`;
+-- phase 2 issues the Lance write; phase 3 flips the row to
+-- `status='committed'`. `search` joins back through this table with
+-- `WHERE status='committed'` so partial-write Lance rows never surface
+-- to callers, and a retry after a crashed phase 2 simply re-runs against
+-- the still-pending row (Lance MergeInsert dedupes on `chunk_id`).
+--
+-- The third state, `tombstone`, is reserved for the deletion pipeline:
+-- when a chunk row goes away, the matching Lance row should also be
+-- garbage-collected, but the GC scheduler is out of P3-3 scope. The
+-- BEFORE DELETE trigger below stages the marker so a future GC has a
+-- well-defined claim; see the comment block on the trigger for why
+-- it currently coexists with V001's `ON DELETE CASCADE` FK rather than
+-- replacing it.
+
+ALTER TABLE embedding_records ADD COLUMN status TEXT NOT NULL DEFAULT 'pending'
+  CHECK (status IN ('pending','committed','tombstone'));
+
+ALTER TABLE embedding_records ADD COLUMN vector_committed INTEGER NOT NULL DEFAULT 0;
+
+CREATE INDEX idx_embed_status ON embedding_records(status);
+
+-- Tombstone trigger.
+--
+-- Intent: when a `chunks` row is about to be deleted, mark its
+-- dependent `embedding_records` rows as `status='tombstone'` so a later
+-- GC pass can drop the matching Lance rows in lockstep.
+--
+-- Caveat (carried into a future migration): V001 declared the FK as
+-- `chunk_id REFERENCES chunks(chunk_id) ON DELETE CASCADE`. SQLite
+-- fires the BEFORE DELETE trigger first and only then applies the
+-- CASCADE, so this UPDATE lands a `tombstone` value on a row that the
+-- CASCADE removes immediately afterwards. Under the current FK the
+-- trigger therefore has no lasting effect; the only path that
+-- actually preserves the tombstone is to drop the CASCADE (table
+-- recreation, since SQLite has no DROP CONSTRAINT). That is queued
+-- for a P+ migration once the GC scheduler exists and we have actual
+-- production rows to migrate. Keeping the trigger here documents the
+-- design intent and gives the deletion-pipeline observer a stable hook
+-- to wire into.
+CREATE TRIGGER chunks_bd_tombstone_embeddings BEFORE DELETE ON chunks BEGIN
+  UPDATE embedding_records SET status='tombstone' WHERE chunk_id = old.chunk_id;
+END;