Commit Graph

405 Commits

Author SHA1 Message Date
th-kim0823
6a067e3ab1 feat(search): HybridRetriever::search_with_trace (fb-37) 2026-05-10 12:38:53 +09:00
th-kim0823
231d80e82d feat(stats): media/lang/bytes/stale fields on schema.v1.stats (fb-37)
Extends CountSummary with media_breakdown, lang_breakdown, stale_doc_count
fields populated via stats_ext::breakdowns(). Adds count_summary_with_threshold
for callers that need real stale counts. Mirrors all new fields onto the
wire-bound Stats struct in kebab-app::schema with #[serde(default)] for
backwards-compat. Also fixes search_budget_integration.rs for the trace field
added to SearchOpts in Task 1.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-10 12:34:57 +09:00
th-kim0823
69c6e23432 feat(store): breakdowns + index_bytes helpers (fb-37)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-10 12:24:43 +09:00
th-kim0823
1e943f21dc feat(core): SearchTrace + IndexBytes types + SearchOpts.trace (fb-37)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-10 12:17:04 +09:00
th-kim0823
84287d0ef6 fix(fb-36): address PR #127 round 1 review
- ingested_after: convert OffsetDateTime to UTC before formatting
  so non-Z offsets compare correctly against UTC TEXT storage
  (lexical.rs + filters.rs)
- README: --tag is repeatable-only, not csv (only --media is csv)
- test(cli): add multi-value --tag OR-within IN-list coverage
- test(store): add UTC-offset regression test for ingested_after
- mcp: use ERROR_V1_ID const instead of hardcoded "error.v1"

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 04:47:55 +09:00
th-kim0823
b06f4654e7 feat(mcp): kebab__search filter inputs (fb-36)
7 new optional inputs on SearchInput: tags, lang, path_glob,
trust_min, media, ingested_after, doc_id. Validation surfaces as
error.v1 code = invalid_input via StructuredError. Dispatch builds
SearchFilters from the inputs and forwards through the existing
search_with_opts_with_config facade.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 04:11:27 +09:00
th-kim0823
4e0379c04f test(cli): wire_search_filters — lexical-only integration tests (fb-36)
Cover: --doc-id scoping, --ingested-after validation error,
--media md alias, --tag repeatable + frontmatter parsing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 04:06:21 +09:00
th-kim0823
6a18847892 feat(cli): kebab search filter flags (fb-36)
7 new flags: --tag (repeatable), --lang, --path-glob,
--trust-min (value_enum), --media (csv with `md` alias),
--ingested-after (RFC3339; config_invalid on parse fail),
--doc-id. Dispatch translates clap values into SearchFilters
and propagates structured errors through the existing
StructuredError wrapper from fb-34.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 03:57:55 +09:00
th-kim0823
c6cc1e2bfe feat(search/vector): media / ingested_after / doc_id filters (fb-36)
filter_chunks helper in kebab-store-sqlite extended with the same 3
WHERE clauses as lexical. Vector still over-fetches k*2 then
post-filters via SqliteStore::filter_chunks; small k can return < k
hits when filters drop a lot — agent is expected to widen k or
paginate. AND combinator with existing filters.

- kebab-store-sqlite/src/filters.rs: media IN-list subquery, ingested_after
  lexicographic >= compare, doc_id equality; mirrors lexical SQL arms
- 3 direct unit tests (filter_chunks_media_type/ingested_after/doc_id)
  that run without AVX/Lance
- common/mod.rs: insert_doc / insert_doc_with_media / run_vector_search
  helpers on HybridEnv for integration-test use
- hybrid.rs: 2 new #[ignore = "requires AVX..."] integration tests
  (vector_filter_by_media, vector_filter_by_doc_id)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 03:50:56 +09:00
th-kim0823
86475e5ba2 fix(search/lexical): use std::iter::repeat_n (clippy)
Per code review on 2c80e2a. manual-repeat-n lint triggers
for Rust 1.94+ when repeat().take() can be expressed as
repeat_n directly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 03:43:51 +09:00
th-kim0823
2c80e2ad91 feat(search/lexical): media / ingested_after / doc_id filters (fb-36)
SQL WHERE clause extension. media uses CASE WHEN json_type='text'
to handle both unit (\`"markdown"\`) and tuple (\`{"image":"png"}\`)
MediaType serde shapes. ingested_after relies on RFC3339 lexicographic
ordering with UTC Z (per fb-32 ingest invariant). doc_id is a simple
equality. AND combinator with existing tags / lang / trust filters.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 03:41:02 +09:00
th-kim0823
d3f38c76e9 feat(core): SearchFilters gains media / ingested_after / doc_id (fb-36)
3 additive optional fields. #[serde(default)] preserves
backwards compat for older JSON without the new keys.
MEDIA_KINDS const exposes canonical "markdown"/"pdf"/"image"/
"audio"/"other" labels for downstream alias normalization.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 03:36:45 +09:00
th-kim0823
b86b763dfb fix(fb-35): address PR #126 round 2 review
- wire schema: relax effective_end.minimum 1 → 0 + expand
  description to cover line-clamp + out-of-range sentinel
  (panic-fix R1 emits Some(0) when line_start=1 and range is
  beyond doc end — schema must accept it)
- tests: tighten first-chunk-target boundary test to assert ≤ 2
  total neighbors (3-chunk doc, N=2). Strict "first chunk →
  context_before empty" not assertable until chunks.ordinal
  column lands (R1 #9 architectural caveat)
- store: trim contradiction in list_chunk_ids_for_doc warning
  comment — drop "good enough for sequentially chunked
  markdown" phrase that conflicts with "hash sort dominates"
  paragraph above

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 00:55:29 +09:00
th-kim0823
7dddc1d706 fix(fb-35): address PR #126 round 1 review
- fetch_span: panic-fix on line_start > total / empty doc
  (return empty text + effective_end = line_start - 1 instead of
  out-of-bounds slice)
- truncated: reserved for budget-driven truncation only; line
  range clamp signaled via effective_end < line_end
- spec / SKILL.md / README: align rejection wording to "PDF /
  audio" (matches code; Image OCR allowed for span)
- store: warning comment on list_chunk_ids_for_doc — chunk_id
  hash sort does NOT preserve document position; real fix is a
  chunks.ordinal column, tracked as follow-up
- surrounding_chunks: saturating_add to defend against u32::MAX
  context arg on 32-bit targets
- tests: line_start > total returns empty + chunk context at
  doc boundary clamps lower bound

Deferred nits (follow-up): table-separator strict CommonMark form;
MCP per-mode strict validation; CLI chunk_id truncation in plain
output. None block correctness.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 00:45:29 +09:00
th-kim0823
8d8f1c0294 test(cli): bump expected MCP tool count 6 → 7 for fb-35 fetch
cli_mcp_initialize_then_tools_list asserts the exact tools[]
count returned by tools/list. fb-35 added kebab__fetch as the
7th tool — bump the assertion accordingly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 00:20:59 +09:00
th-kim0823
77bf19566c feat(mcp): kebab__fetch tool — chunk / doc / span (fb-35)
Mirrors CLI surface: same input shape, same fetch_result.v1
output. invalid_input error for missing kind-specific fields.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 00:11:37 +09:00
th-kim0823
beb40249a3 test(cli): wire_fetch — chunk/doc + chunk_not_found integration (fb-35)
3 lexical-only integration tests: chunk JSON shape, doc truncated
with --max-tokens, unknown chunk_id returns error.v1 with
code = chunk_not_found.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 00:06:14 +09:00
th-kim0823
0fffd69071 feat(cli): kebab fetch chunk / doc / span (fb-35)
JSON output is fetch_result.v1; plain output is human-friendly
labeled sections (chunk: before / target / after; doc/span: full
text + stderr truncated hint).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 00:01:56 +09:00
th-kim0823
1b9d89eb3a feat(app): App::fetch span mode + PDF/audio rejection (fb-35)
Line-based slice over fmt_canonical_to_markdown output.
PDF / audio source_type → span_not_supported StructuredError.
Out-of-range line_end clamps to total; effective_end reflects
post-budget trim. invalid_input on zero / inverted bounds.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 23:54:22 +09:00
th-kim0823
7d1f855f7e feat(app): App::fetch doc mode with budget (fb-35)
Walks CanonicalDocument blocks, serializes to markdown, applies
chars/4 budget when opts.max_tokens is set. doc_not_found
preserved through StructuredError.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 23:48:40 +09:00
th-kim0823
610d29f053 feat(app): App::fetch chunk mode + markdown serializer (fb-35)
Chunk mode + +-N context. doc / span modes return placeholder
errors (filled by subsequent tasks). fmt_canonical_to_markdown
helper introduced now since doc mode (Task 4) consumes it.
Errors are typed StructuredError so classify preserves
chunk_not_found / doc_not_found through the wire layer.

Adds SqliteStore::list_chunk_ids_for_doc so the facade can derive
+-N neighbors without leaking direct rusqlite usage into kebab-app.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 23:44:51 +09:00
th-kim0823
9653592c16 feat(core): FetchQuery / FetchOpts / FetchResult / FetchKind (fb-35)
Domain types for `kebab fetch` 3 modes (chunk / doc / span). All
types Serialize so wire layers hand them through serde_json
directly. FetchKind is snake_case-renamed to match the wire
discriminator literal in fetch_result.v1.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 23:35:21 +09:00
th-kim0823
e084b306e5 fix(fb-34): align next_cursor semantics with docs (PR #125 round 2)
Previous round-1 fix dropped the speculative cursor branch on
the truncated path, leaving a contradiction with the docs:
- snippet-only shrunk → cursor emitted (returned == k_effective)
- k-popped → cursor null (returned < k_effective)
But docs promised the opposite.

R2 resolution: emit cursor whenever more hits may be reachable
(either retriever filled the page OR budget popped hits — the
popped ones remain fetchable from offset+returned). Drop the
artificial "widen vs paginate" copy; truncated and next_cursor
are now independent signals — caller may do either or both.

Updates: app.rs::search_with_opts logic + SearchResponse doc +
schema description + SKILL.md two bullets + max_tokens=0 test
asserts cursor IS emitted on k-pop case.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 21:07:04 +09:00
th-kim0823
f485608108 fix(fb-34): address PR #125 round 1 review
- error_wire: StructuredError wrapper preserves ErrorV1 through
  anyhow → classify pipeline. Adds downcast short-circuit so
  cursor::decode's typed code = "stale_cursor" reaches the wire
  instead of being string-formatted to code = "generic".
- app: search_with_opts now wraps cursor::decode error in
  StructuredError instead of anyhow! string format.
- test: error_wire pins both negative (bare anyhow → not
  stale_cursor) AND positive (StructuredError → stale_cursor)
  invariants. CLI integration test runs end-to-end and asserts
  error.v1.code on stderr.
- app: next_cursor only emitted on full-page (k-pop) path; drop
  speculative emit on snippet-only truncation that would point at
  a different page than the agent expected.
- cursor: differentiate malformed-base64 / malformed-payload /
  revision-mismatch error messages; all keep code = stale_cursor.
- test: cursor_rejected fixture uses .expect() to fail loud on
  cursor non-emission instead of silent skip.
- test: max_tokens=0 → 1-hit floor + truncated=true.
- docs: SKILL.md + schema description distinguish snippet-shrink
  (widen) vs k-pop (paginate) truncated cases. HOTFIXES notes
  --no-cache semantic shift (cached path + clear vs uncached path).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 20:49:27 +09:00
th-kim0823
e1fcea6313 chore: clippy fix for fb-34 — allow result_large_err on cursor::decode
ErrorV1 is the workspace wire error struct; boxing here would
force every call site to deref through a Box for no win — the
err-path is rare. Single allow at the function level.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 20:20:36 +09:00
th-kim0823
5e0cff1b92 feat(mcp): search tool emits search_response.v1 + budget inputs (fb-34)
SearchInput gains max_tokens / snippet_chars / cursor (all optional).
Output wrapped in search_response.v1 to match CLI; existing
tools_call_search test updated to read v["hits"] instead of the bare
array.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 20:12:05 +09:00
th-kim0823
603061fb86 test(cli): wire_search_response + budget integration (fb-34)
4 lexical-only tests covering search_response.v1 wrapper shape,
--max-tokens truncation, --cursor pagination, plain stderr hint.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 20:09:01 +09:00
th-kim0823
21220f6d39 feat(cli): kebab search --max-tokens / --snippet-chars / --cursor (fb-34)
JSON output wrapped in search_response.v1 (breaking — agent must
adapt). Plain output unchanged + [truncated; use --cursor X]
stderr hint when budget tripped.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 20:02:50 +09:00
th-kim0823
af80cedd81 feat(app): App::search_with_opts + SearchResponse (fb-34)
Budget loop: snippet shorten → k pop → ≥1 hit floor. Cursor
encode/decode threads corpus_revision; mismatch surfaces as
stale_cursor anyhow error. App::search retained as thin wrapper.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 17:59:48 +09:00
th-kim0823
aabe66f5e2 docs(error_wire): note stale_cursor convention (fb-34)
stale_cursor is built by cursor::decode, not classify. Test
locks the invariant.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 17:50:39 +09:00
th-kim0823
ebbc3a46ae feat(app): cursor encode/decode for paginated search (fb-34)
Opaque base64(JSON{offset, corpus_revision}). Mismatch or
malformed input returns ErrorV1 with code = stale_cursor.
base64 promoted to workspace dep.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 17:49:23 +09:00
th-kim0823
e00418537f feat(core): SearchOpts domain type for budget controls (fb-34)
3 optional knobs (max_tokens, snippet_chars, cursor); Default = all
None = no enforcement (backwards-compat existing search behavior).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 17:46:40 +09:00
th-kim0823
a082b78f8e fix(fb-33): address PR #124 round 1 review
- pipeline: refresh module docstring step 5 to reflect new cancel
  semantics (RetrievalDone/Token/Final + LlmStreamAborted)
- wire schema: spell out refusal-path behavior in answer_event.v1
  description (only retrieval_done emitted; no final)
- test: factual comment on relax_score_gate-using test corrected
- test: new Ollama-gated stream_score_gate_refusal_emits_only_retrieval_done
- test: new ask_emits_no_final_when_cancelled_mid_stream pinning
  the no-Final invariant on cancel
- pipeline: large_enum_variant comment broadened to acknowledge
  RetrievalDone.hits as the dominant per-emit cost
- HOTFIXES: log AskOpts.stream_sink internal API break per spec
  contract policy

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 15:46:04 +09:00
th-kim0823
39bf0de949 test(cli): wire_ask_stream — stderr ndjson + stdout final + BrokenPipe cancel (fb-33)
Three Ollama-gated integration tests covering:
- stderr lines parse as answer_event.v1 (retrieval_done first,
  final last, all carry RFC3339 ts).
- stdout final line is answer.v1 (backwards compat).
- non-stream path (--json without --stream) unchanged.
- BrokenPipe stderr → child terminates cleanly via cancel
  propagation through pipeline SendError.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 15:14:00 +09:00
th-kim0823
29629e6786 feat(cli): kebab ask --stream emits ndjson on stderr (fb-33)
Background-thread driver runs ask_with_config; main thread
drains the receiver, serializes each StreamEvent to ndjson on
stderr. BrokenPipe → drop receiver → pipeline SendError →
cancel + LlmStreamAborted refusal. Final stdout line is the
existing answer.v1 (ingest_progress.v1 backwards-compat
pattern).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 15:03:41 +09:00
th-kim0823
e5c99f5b80 feat(tui): adapt ask worker to StreamEvent sink (fb-33)
Worker channel now carries kebab_app::StreamEvent. drain_stream
matches on Token { delta }; RetrievalDone and Final are ignored
(citations render from last_answer, Final is redundant with
worker join). app::AskState.rx type widened to match.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 14:57:46 +09:00
th-kim0823
307fd8d527 feat(rag): pipeline emits StreamEvent + cancel on SendError (fb-33)
RetrievalDone after retrieve+stale-stamp, Token per LM chunk
(SendError → break, FinishReason::Cancelled, RefusalReason::
LlmStreamAborted), Final on success. answers row still persists
on cancel for audit. Adds FinishReason::Cancelled, re-exports
StreamEvent from kebab_rag, migrates two pre-fb-33 sink tests
in tests/pipeline.rs to the new StreamEvent type (the
"dropped receiver does not abort" test inverts to record cancel).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 14:49:55 +09:00
th-kim0823
31475f0312 feat(rag): StreamEvent enum + switch AskOpts.stream_sink (fb-33)
3-variant discriminated enum (RetrievalDone / Token / Final).
AskOpts.stream_sink now carries StreamEvent. Other crates fail
to compile until subsequent tasks adapt their call sites.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 14:38:54 +09:00
th-kim0823
cb04bd8c8d fix(fb-32): address PR #122 round 2 review
- spec: add one-line cross-link to HOTFIXES entry per CLAUDE.md
  Spec-contract policy
- HOTFIXES: rename heading from "fb-32" to "p9-fb-32" matching
  the rest of the file's full-ID convention
- config: defensive assert before string-replace in negative TOML
  test guards against default-value drift causing unhelpful unwrap

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 12:19:19 +09:00
th-kim0823
efc6b7ebb0 fix(fb-32): address PR #122 round 1 review
- config: rename env-silent-ignore test + add file-load negative test
  asserting ConfigInvalid for negative TOML stale_threshold_days
- rag: add 5 boundary unit tests pinning compute_stale mirror equivalence
- search: rewrite "Task 6" plan refs in lexical/vector to point at
  actual function names (mark_stale_in_place / RagPipeline::ask)
- cli: dedupe write_config / ingest / backdate_updated_at helpers
  from wire_search_stale + wire_ask_stale into tests/common/mod.rs
- tui: clarify inspect.rs uses same source-of-truth as SearchHit
- rag: PackedCitation.stale invariant doc comment
- HOTFIXES: log conscious decision on wire-schema required-field
  expansion (strict-validator concern)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 12:04:28 +09:00
th-kim0823
1f39b6bc2c feat(tui): [STALE] Warning-styled badge on search/inspect/ask (fb-32)
insta filter pattern '[indexed_at]' applied where snapshots
otherwise capture time-dependent RFC3339 strings.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 02:43:05 +09:00
th-kim0823
aeee7ed771 feat(cli): [stale] tag on plain ask citations (fb-32)
Mirror of Task 9's search-output rendering: yellow [stale] on TTY,
plain text otherwise. JSON path inherits via serde on AnswerCitation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 02:34:58 +09:00
th-kim0823
15cdc97cae feat(cli): [stale] tag on plain search output (fb-32)
Yellow when TTY, plain when not. JSON path inherits via serde
on the domain type; no CLI-side wire change needed there.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 02:24:54 +09:00
th-kim0823
cc41adabb5 feat(wire): search_hit.v1 + citation.v1 require indexed_at + stale (fb-32)
Additive minor — schema_version unchanged. Existing v1 consumers
that ignore unknown fields stay compatible; consumers that validate
strictly will reject pre-fb-32 payloads, which matches the wire
contract escape hatch (recipient version >= producer required).

Cross-task placeholders: kebab-eval / kebab-tui synthetic test
fixtures pin UNIX_EPOCH + stale=false (same pattern as
hybrid.rs / vector.rs). These don't exercise staleness — Task 11
adds dedicated TUI staleness rendering tests.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 02:17:15 +09:00
th-kim0823
16db60f7bd docs(fb-32): mirror back-reference + fix pipeline doc-comment ordering
- kebab-app::staleness::compute_stale gains note pointing at the
  kebab-rag mirror so future modifiers know to update both copies.
- kebab-rag::pipeline: doc comments adjacent to compute_stale and
  embedding_ref_for were positioned such that rustdoc would
  misattribute them. Reorder/separate so each comment hugs its
  own function.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 01:51:40 +09:00
th-kim0823
e398272a24 feat(rag): AnswerCitation inherits indexed_at + stale from hit (fb-32)
pack_context widened to carry indexed_at + stale alongside marker
and Citation. LLM-citation construction site now plumbs real values
from upstream SearchHit instead of the Task 6 UNIX_EPOCH placeholder.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 01:44:24 +09:00
th-kim0823
e891e487cf test(rag): mk_hit gains indexed_at + stale stubs (fb-32)
Test helper missed the SearchHit field expansion from fb-32 Task 1.
UNIX_EPOCH + false placeholders consistent with the cross-crate
synthetic-mock pattern (hybrid.rs, vector.rs build_hit Task 4 stub).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 01:37:19 +09:00
th-kim0823
dfef65f196 feat(app): staleness module + post-process search hits (fb-32)
compute_stale: strict > boundary, threshold=0 disables, future
timestamps treated as fresh (clock skew safety). App::search
re-stamps on cache hit so config threshold changes take effect
without flushing the cache.

Also unblocks the workspace build by plugging placeholder
indexed_at/stale into the two AnswerCitation construction
sites in kebab-rag/pipeline.rs (the score-gate refusal path
forwards from SearchHit; the LLM-citation path uses
UNIX_EPOCH/false until Task 7 wires the real values through
pack_context).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 01:30:10 +09:00
th-kim0823
8faad2f407 feat(search/vector): populate SearchHit.indexed_at (fb-32)
hydrate_chunks now JOINs d.updated_at. Hybrid fusion path is
unchanged (passes SearchHit through, fields preserved).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 01:17:54 +09:00
th-kim0823
f4ce6652b2 feat(search/lexical): populate SearchHit.indexed_at (fb-32)
JOIN documents.updated_at. stale defaults to false; App facade
post-processes against config threshold.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 01:10:20 +09:00