feat(expansion): per-alias dense vectors for doc-side expansion + derivation cache (V012)

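Index each alias line as its own dense vector (sentinel
`{chunk}#alias#N`) and skip alias generation for boilerplate chunks.
The bundled single-vector approach is dropped: averaging diluted
specific expressions and actually regressed (13/18). Variant
consistency 14/18 → 16/18, mean_spread@10 0.222 → 0.111 (Namuwiki CS
corpus, ~1000 documents).
`kebab-core::strip_alias_suffix` handles both the suffix form and the
per-alias form.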
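A minimal sketch of the per-alias sentinel described above, assuming N is a decimal index appended after `#alias#`. The helper names `alias_vector_id` and `strip_per_alias_sentinel` are hypothetical and cover only the per-alias form; the real `kebab-core::strip_alias_suffix` also handles the suffix form, whose layout is not shown in this commit.

```rust
/// Marker taken from the commit message's `{chunk}#alias#N` sentinel.
const ALIAS_MARKER: &str = "#alias#";

/// Hypothetical helper: build the id under which alias number `n` of a chunk is indexed.
fn alias_vector_id(chunk_id: &str, n: usize) -> String {
    format!("{chunk_id}{ALIAS_MARKER}{n}")
}

/// Hypothetical helper: map a per-alias vector id back to its chunk id.
/// Ids that do not end in `#alias#<digits>` are returned unchanged.
fn strip_per_alias_sentinel(id: &str) -> &str {
    match id.rfind(ALIAS_MARKER) {
        Some(pos) => {
            let index = &id[pos + ALIAS_MARKER.len()..];
            // Assumption: the index is one or more ASCII digits.
            if !index.is_empty() && index.bytes().all(|b| b.is_ascii_digit()) {
                &id[..pos]
            } else {
                id
            }
        }
        None => id,
    }
}
```

With these helpers, a search hit on `alias_vector_id("doc1:chunk3", 2)` resolves back to `doc1:chunk3`, so several alias vectors can all point at the same chunk.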

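Derivation cache (V012): cache embedding vectors and alias LLM results
under a key derived from the chunk content hash, so re-indexing skips
recomputation for chunks whose content is unchanged. cache_key =
blake3(kind ‖ text_blake3 ‖ version_key)[:32]; version_key includes
model/prompt/dimensions → consistent with the §9 cascade (a version
bump misses automatically). Measured: with the 3 gold answers, cold
1879s → warm 13s ≈ 145x. Purely additive, so no corpus_revision bump.
search/ask run on kebab.sqlite + lancedb alone → a porting workflow
becomes possible: index on an external server, then copy only the DBs.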
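A rough sketch of how the key composition above makes a version bump miss automatically. To keep the sketch dependency-free, the standard library's `DefaultHasher` is swapped in for blake3, so the digests are 16 hex characters and not the 32-character blake3 prefix the commit uses. The function names, the `|` separator inside `version_key`, and the sample model and prompt strings are all assumptions for illustration.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Stand-in digest: NOT blake3. It only needs to be deterministic for this sketch.
fn stand_in_digest_hex(parts: &[&str]) -> String {
    let mut hasher = DefaultHasher::new();
    for part in parts {
        // Feed the length first so ("ab", "c") and ("a", "bc") hash differently.
        part.len().hash(&mut hasher);
        part.hash(&mut hasher);
    }
    format!("{:016x}", hasher.finish())
}

/// Hypothetical version key folding model, prompt version and dimensions.
fn version_key(model: &str, prompt_version: &str, dimensions: u32) -> String {
    format!("{model}|{prompt_version}|{dimensions}")
}

/// Mirrors the shape cache_key = H(kind ‖ text_digest ‖ version_key).
fn derivation_cache_key(kind: &str, text_digest: &str, version_key: &str) -> String {
    stand_in_digest_hex(&[kind, text_digest, version_key])
}
```

Because every version input is folded into the key, the same chunk text with a bumped prompt version yields a different key, and the old row is never read again. The same text and version under a different `kind` (for example `embedding` and `alias`) also maps to separate rows.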

V012 schema migration + new surface → workspace version bump 0.20.2 →
0.21.0 (minor). README/HANDOFF/ARCHITECTURE/HOTFIXES synced.
Known limitations: two descriptive-type cases (stack, svm) remain, and
the grounded check misclassifies partial citations as grounded
(follow-up candidate).

Measurement details: docs/superpowers/handoffs/2026-05-31-namu-wiki-alias-cache-study.md

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 08:24:04 +00:00
parent 0282a81c67
commit a8fd76499c
18 changed files with 1000 additions and 71 deletions


@@ -0,0 +1,192 @@
//! Content-hash derivation cache store (design 2026-05-31 §3.2 / §3.5).
//!
//! Backs the `derivation_cache` table (`V012`). The cache stores expensive
//! ingest derivations (embedding vectors, LLM aliases, optional Korean
//! tokens) keyed by `derivation_cache_key` (§3.1). It is a pure performance
//! layer: corruption / deletion only forces recomputation, never wrong
//! results (§3.5). Timestamps follow the same RFC3339 `OffsetDateTime`
//! formatting the asset / document / embedding writers use.

use anyhow::{Context, Result};
use rusqlite::{OptionalExtension, params};
use time::OffsetDateTime;
use time::format_description::well_known::Rfc3339;

use crate::error::StoreError;
use crate::store::SqliteStore;

impl SqliteStore {
    /// Look up a cached derivation payload by its content-hash key.
    ///
    /// Pure read: does **not** bump `last_used_at`. Callers that want LRU
    /// freshness on a hit collect the hit keys and call
    /// [`Self::derivation_cache_touch`] once per batch (cheaper than a write
    /// per `get`).
    pub fn derivation_cache_get(&self, cache_key: &str) -> Result<Option<Vec<u8>>> {
        let conn = self.lock_conn();
        let payload: Option<Vec<u8>> = conn
            .query_row(
                "SELECT payload FROM derivation_cache WHERE cache_key = ?",
                params![cache_key],
                |row| row.get::<_, Vec<u8>>(0),
            )
            .optional()
            .map_err(StoreError::from)
            .context("derivation_cache_get")?;
        Ok(payload)
    }

    /// Insert (or overwrite) a cached derivation payload.
    ///
    /// `INSERT OR REPLACE` so a re-computation of the same key (e.g. after a
    /// manual cache clear, or a non-deterministic LLM regenerating) refreshes
    /// `created_at` / `last_used_at` to the new attempt. The key already folds
    /// every version-cascade input (§3.1), so an overwrite is always the same
    /// logical derivation.
    pub fn derivation_cache_put(&self, cache_key: &str, kind: &str, payload: &[u8]) -> Result<()> {
        let now = OffsetDateTime::now_utc()
            .format(&Rfc3339)
            .context("format derivation_cache.created_at")?;
        let conn = self.lock_conn();
        conn.execute(
            "INSERT OR REPLACE INTO derivation_cache
                 (cache_key, kind, payload, created_at, last_used_at)
             VALUES (?, ?, ?, ?, ?)",
            params![cache_key, kind, payload, now, now],
        )
        .map_err(StoreError::from)
        .context("derivation_cache_put")?;
        Ok(())
    }

    /// Bump `last_used_at` for the given hit keys (LRU freshness, §3.5).
    ///
    /// Runs in a single transaction. Missing keys are a no-op. Called once per
    /// ingest batch with the keys that hit, so the GC pass keeps live chunks.
    pub fn derivation_cache_touch(&self, keys: &[String]) -> Result<()> {
        if keys.is_empty() {
            return Ok(());
        }
        let now = OffsetDateTime::now_utc()
            .format(&Rfc3339)
            .context("format derivation_cache.last_used_at")?;
        let mut conn = self.lock_conn();
        let tx = conn.transaction().map_err(StoreError::from)?;
        {
            let mut stmt = tx
                .prepare("UPDATE derivation_cache SET last_used_at = ? WHERE cache_key = ?")
                .map_err(StoreError::from)?;
            for key in keys {
                stmt.execute(params![now, key])
                    .map_err(StoreError::from)
                    .context("derivation_cache_touch")?;
            }
        }
        tx.commit().map_err(StoreError::from)?;
        Ok(())
    }

    /// Delete cache entries whose `last_used_at` is older than `ttl_days`
    /// (§3.5 lightweight GC). Returns the number of rows removed.
    ///
    /// `ttl_days <= 0` is a no-op guard (never wipe the whole cache by an
    /// accidental zero TTL).
    pub fn derivation_cache_gc(&self, ttl_days: i64) -> Result<usize> {
        if ttl_days <= 0 {
            return Ok(0);
        }
        let cutoff = (OffsetDateTime::now_utc() - time::Duration::days(ttl_days))
            .format(&Rfc3339)
            .context("format derivation_cache gc cutoff")?;
        let conn = self.lock_conn();
        let removed = conn
            .execute(
                "DELETE FROM derivation_cache WHERE last_used_at < ?",
                params![cutoff],
            )
            .map_err(StoreError::from)
            .context("derivation_cache_gc")?;
        Ok(removed)
    }
}

#[cfg(test)]
mod tests {
    use super::*;
    use crate::store::SqliteStore;

    fn open_store() -> (tempfile::TempDir, SqliteStore) {
        let dir = tempfile::tempdir().unwrap();
        let mut cfg = kebab_config::Config::defaults();
        cfg.storage.data_dir = dir.path().to_string_lossy().into_owned();
        let store = SqliteStore::open(&cfg).unwrap();
        store.run_migrations().unwrap();
        (dir, store)
    }

    #[test]
    fn put_then_get_roundtrips() {
        let (_d, store) = open_store();
        store
            .derivation_cache_put("key1", "embedding", &[1, 2, 3, 4])
            .unwrap();
        let got = store.derivation_cache_get("key1").unwrap();
        assert_eq!(got, Some(vec![1, 2, 3, 4]));
    }

    #[test]
    fn get_miss_returns_none() {
        let (_d, store) = open_store();
        assert_eq!(store.derivation_cache_get("absent").unwrap(), None);
    }

    #[test]
    fn put_replaces_existing() {
        let (_d, store) = open_store();
        store.derivation_cache_put("k", "alias", b"old").unwrap();
        store.derivation_cache_put("k", "alias", b"new").unwrap();
        assert_eq!(
            store.derivation_cache_get("k").unwrap(),
            Some(b"new".to_vec())
        );
    }

    #[test]
    fn touch_missing_keys_is_noop() {
        let (_d, store) = open_store();
        store
            .derivation_cache_touch(&["nope".to_string()])
            .unwrap();
        assert_eq!(store.derivation_cache_get("nope").unwrap(), None);
    }

    #[test]
    fn gc_zero_ttl_is_noop() {
        let (_d, store) = open_store();
        store.derivation_cache_put("k", "embedding", b"x").unwrap();
        assert_eq!(store.derivation_cache_gc(0).unwrap(), 0);
        assert!(store.derivation_cache_get("k").unwrap().is_some());
    }

    #[test]
    fn gc_removes_stale_entries() {
        let (_d, store) = open_store();
        store.derivation_cache_put("fresh", "embedding", b"x").unwrap();
        // Add a second row backdated by 100 days via a direct INSERT.
        let old = (OffsetDateTime::now_utc() - time::Duration::days(100))
            .format(&Rfc3339)
            .unwrap();
        {
            let conn = store.lock_conn();
            conn.execute(
                "INSERT INTO derivation_cache (cache_key, kind, payload, created_at, last_used_at)
                 VALUES ('stale', 'embedding', ?, ?, ?)",
                params![&b"y"[..], &old, &old],
            )
            .unwrap();
        }
        let removed = store.derivation_cache_gc(30).unwrap();
        assert_eq!(removed, 1);
        assert!(store.derivation_cache_get("stale").unwrap().is_none());
        assert!(store.derivation_cache_get("fresh").unwrap().is_some());
    }
}
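
The doc comments in this file describe a read path where `get` stays a pure read and the keys that hit are touched once per batch. A rough sketch of that flow follows, using an in-memory stand-in for the SQLite store so it runs without dependencies. `MemoryCache`, `derive_batch` and `fake_derivation` are hypothetical names, not part of this commit, and the real caller may be structured differently.

```rust
use std::collections::HashMap;

/// In-memory stand-in for the SQLite-backed cache (illustration only).
#[derive(Default)]
struct MemoryCache {
    rows: HashMap<String, Vec<u8>>,
    /// Keys passed to `touch`, recorded so the batching is observable.
    touched: Vec<String>,
}

impl MemoryCache {
    fn get(&self, key: &str) -> Option<Vec<u8>> {
        self.rows.get(key).cloned()
    }

    fn put(&mut self, key: &str, payload: &[u8]) {
        self.rows.insert(key.to_string(), payload.to_vec());
    }

    fn touch(&mut self, keys: &[String]) {
        self.touched.extend_from_slice(keys);
    }
}

/// Placeholder for an expensive derivation such as an embedding call.
fn fake_derivation(text: &str) -> Vec<u8> {
    text.as_bytes().to_vec()
}

/// Resolve a batch of (cache_key, text) pairs.
/// Returns the payloads in input order and the number of recomputations.
fn derive_batch(cache: &mut MemoryCache, items: &[(&str, &str)]) -> (Vec<Vec<u8>>, usize) {
    let mut hit_keys = Vec::new();
    let mut payloads = Vec::with_capacity(items.len());
    let mut computed = 0;
    for &(key, text) in items {
        match cache.get(key) {
            // Hit: reuse the payload and remember the key for one batched touch.
            Some(payload) => {
                hit_keys.push(key.to_string());
                payloads.push(payload);
            }
            // Miss: recompute, then store the result for the next run.
            None => {
                let payload = fake_derivation(text);
                cache.put(key, &payload);
                computed += 1;
                payloads.push(payload);
            }
        }
    }
    // One touch per batch instead of one write per hit.
    cache.touch(&hit_keys);
    (payloads, computed)
}
```

Running `derive_batch` twice over the same items recomputes everything on the first (cold) pass and nothing on the second (warm) pass, which is the effect behind the cold/warm timings quoted in the commit message.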


@@ -19,6 +19,7 @@
mod answers;
mod chat_sessions;
mod derivation_cache;
mod documents;
mod embeddings;
mod error;