fix(fb-35): address PR #126 round 2 review

- wire schema: relax effective_end.minimum 1 → 0 + expand
  description to cover line-clamp + out-of-range sentinel
  (panic-fix R1 emits Some(0) when line_start=1 and range is
  beyond doc end — schema must accept it)
- tests: tighten first-chunk-target boundary test to assert ≤ 2
  total neighbors (3-chunk doc, N=2). Strict "first chunk →
  context_before empty" not assertable until chunks.ordinal
  column lands (R1 #9 architectural caveat)
- store: trim contradiction in list_chunk_ids_for_doc warning
  comment — drop "good enough for sequentially chunked
  markdown" phrase that conflicts with "hash sort dominates"
  paragraph above

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
th-kim0823
2026-05-10 00:55:29 +09:00
parent 7dddc1d706
commit b86b763dfb
3 changed files with 20 additions and 9 deletions

View File

@@ -389,10 +389,11 @@ impl SqliteStore {
///
/// Real fix is a `chunks.ordinal` column (V007 migration) or sort
/// by `chunks.source_spans_json[0]` start offset. Tracked as
/// follow-up; current behavior is good enough for sequentially
/// chunked markdown where created_at uniqueness varies, but PDFs
/// (page-aligned chunks) and large docs may surprise the agent.
/// See `tasks/HOTFIXES.md` if/when this is escalated.
/// follow-up. Until then `--context` neighbors are best-effort —
/// they may or may not align with document position depending on
/// whether `chunk_id` hash order happens to match insertion order
/// for that particular doc. Large markdown / PDF (page-aligned
/// chunks) likely re-orders. See `tasks/HOTFIXES.md` if escalated.
pub fn list_chunk_ids_for_doc(
&self,
doc_id: &kebab_core::DocumentId,