- Add ScoreKind::Bm25 to LexicalRetriever::build_hit SearchHit construction
- Import ScoreKind from kebab_core in lexical.rs
- Add integration test lexical_retriever_hits_carry_bm25_score_kind to verify all
hits from LexicalRetriever carry score_kind == ScoreKind::Bm25
- Update lexical snapshot test baseline to include new score_kind field
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
filter_chunks helper in kebab-store-sqlite extended with the same 3
WHERE clauses as lexical. Vector still over-fetches k*2 then
post-filters via SqliteStore::filter_chunks; small k can return < k
hits when filters drop a lot — agent is expected to widen k or
paginate. AND combinator with existing filters.
- kebab-store-sqlite/src/filters.rs: media IN-list subquery, ingested_after
lexicographic >= compare, doc_id equality; mirrors lexical SQL arms
- 3 direct unit tests (filter_chunks_media_type/ingested_after/doc_id)
that run without AVX/Lance
- common/mod.rs: insert_doc / insert_doc_with_media / run_vector_search
helpers on HybridEnv for integration-test use
- hybrid.rs: 2 new #[ignore = "requires AVX..."] integration tests
(vector_filter_by_media, vector_filter_by_doc_id)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
SQL WHERE clause extension. media uses CASE WHEN json_type='text'
to handle both unit (\`"markdown"\`) and tuple (\`{"image":"png"}\`)
MediaType serde shapes. ingested_after relies on RFC3339 lexicographic
ordering with UTC Z (per fb-32 ingest invariant). doc_id is a simple
equality. AND combinator with existing tags / lang / trust filters.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
JOIN documents.updated_at. stale defaults to false; App facade
post-processes against config threshold.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>