kebab

Author	SHA1	Message	Date
altair823	eec90996aa	chore: bump version 0.11.0 → 0.11.1 dogfood semantic cleanup (PR #150) lands: document-centric fetch_span + assets.workspace_path 'last-registered' semantic explicitly documented. Reason for patch bump: no change to the external wire / CLI / config surface. New internal trait method (get_asset) + caller refactor + doc-comment refresh. Fixes a (rare) possibility of fetch_span taking the wrong branch for twin files. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> v0.11.1	2026-05-20 08:09:46 +00:00
altair823	ce1c778b4a	Merge pull request 'fix(dogfood): document-centric fetch_span + assets.workspace_path semantic doc' (#150) from fix/dogfood-asset-flip-flop-cleanup into main	2026-05-20 08:08:55 +00:00
altair823	453ec15df4	fix(dogfood): document-centric fetch_span + remove get_asset_by_workspace_path assets.workspace_path is INTENTIONALLY 'last-registered path' for twin files (identical content at different paths share one asset row PK'd by blake3 content hash). PR #146 made try_skip_unchanged document-centric; PR #149 made reset --orphans-only document-centric; this PR removes the last caller of get_asset_by_workspace_path (fetch.rs:193 in fetch_span, which used it to reject PDF/audio media — for twins this could read the wrong asset's media_type and pick the wrong branch). Replaced with the natural 2-step lookup: get_document_by_workspace_path (PR #146) → doc.source_asset_id → get_asset (NEW trait method, asset_id is PRIMARY KEY so flip-flop-immune by construction). Then removed get_asset_by_workspace_path trait method + SqliteStore impl — 0 callers after the refactor. UPSERT doc-comment refreshed in store.rs to make the 'last-registered' semantics explicit so future readers don't try to 'fix' the flip-flop. Dogfood follow-up (PR #142 1B + multi-root corpus). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-20 08:03:38 +00:00
altair823	1e6de9fe9f	chore: bump version 0.10.0 → 0.11.0 dogfood follow-up (PR #149) lands: kebab reset --orphans-only explicit complement to PR #148's conservative sweep. Reason for minor bump: new CLI flag (--orphans-only) + new ResetScope variant + additive ResetReport fields = surface expansion. Meets the design §10.4 trigger. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> v0.11.0	2026-05-20 07:53:55 +00:00
altair823	9fa2a1ebac	Merge pull request 'feat(dogfood): kebab reset --orphans-only — explicit complement to PR #148 sweep' (#149) from feat/dogfood-reset-orphans-only into main	2026-05-20 07:50:43 +00:00
altair823	749c6ae240	docs(dogfood): sync reset_report schema + README for --orphans-only (PR #149 review) Round 1 review found 2 doc gaps: - docs/wire-schema/v1/reset_report.schema.json: 'orphans_only' missing from scope enum; orphans_purged/purged_paths properties absent - README: --orphans-only not listed in the reset prose Schema additions are additive minor (default values keep back-compat). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-20 07:47:44 +00:00
altair823	5f2bd9e97e	feat(dogfood): kebab reset --orphans-only — purge stored docs outside walker scope PR #148 auto-purges only filesystem-missing files (conservative — leaves on-disk-but-out-of-scope docs alone for data safety). This is the explicit complement: when the user has narrowed include / widened exclude / removed a sub-directory from the workspace and WANTS the stored docs reconciled, they invoke 'kebab reset --orphans-only'. Confirm prompt with orphan count + sample paths; --yes required in non-TTY. SQLite purge via existing purge_deleted_workspace_path (PR #148) + vector store delete_by_chunk_ids when configured. No fs existence check — orphans-only is the explicit 'I know what I'm doing' variant. dogfood follow-up to PR #148 (file deletion auto-purge). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-20 07:38:10 +00:00
altair823	1ce06c1e2d	chore: bump version 0.9.0 → 0.10.0 dogfood-discovered file-deletion auto-purge (PR #148) lands. Reason for minor bump: additive wire field IngestReport.purged_deleted_files + new CLI summary surface (purged N) + new user-visible behavior (automatic cleanup on ingest after rm a.md). Design §10.4 dogfooding-ready surface expansion trigger. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> v0.10.0	2026-05-20 07:12:58 +00:00
altair823	d26efe167f	Merge pull request 'fix(dogfood): auto-purge stored docs for filesystem-deleted files' (#148) from fix/dogfood-file-deletion-auto-purge into main	2026-05-20 07:10:33 +00:00
altair823	d6d165df01	docs(dogfood): sync sweep_deleted_files algorithm doc with try_exists (PR #148 nit) Round 2 review found the function-level doc-comment still referenced the old fs::exists() (now replaced by try_exists().unwrap_or(true) in commit `2baa846`). One-line clarification — describes the conservative-on-Err semantics so future readers don't reintroduce the data-safety bug. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-20 07:10:27 +00:00
altair823	2baa846c6b	fix(dogfood): conservative try_exists() in sweep_deleted_files (PR #148 review) Round 1 review found a data-safety bug: fs::exists() returns false on errors like EACCES / EPERM / NFS-hiccup / ownership-change, which would trigger purge on a file that is in fact still on disk (just unreadable this moment). Switched to try_exists().unwrap_or(true) so transient FS errors are CONSERVATIVELY treated as 'file present' — never purge on uncertain signal. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-20 07:04:03 +00:00
altair823	27baec82ea	fix(dogfood): auto-purge stored docs for filesystem-deleted files Files deleted from disk (rm a.md) were leaving stale documents + chunks + embeddings in the store, surfacing as ghost citations in search/ask. Existing purge_orphan_at_workspace_path only handled content-changed stale (WHERE workspace_path=? AND asset_id != ?) — file deletion has no new asset_id. Fix: post-walker-scan sweep. Compute (stored_paths - scanned_paths), for each candidate check filesystem existence — only purge when the file is TRULY missing. Scope-narrowing case (file on disk but outside include glob) is explicitly NOT purged to protect users from accidental data loss via config edits. Adds: - DocumentStore::all_workspace_paths trait method + SqliteStore impl - purge_deleted_workspace_path in store-sqlite (returns chunk_ids for vector delete; deletes doc CASCADE + asset row + copied storage file) - sweep_deleted_files in kebab-app::ingest path; called once per ingest before the per-asset loop - IngestReport.purged_deleted_files counter (additive, serde default) - CLI ingest summary mentions purge count when > 0 - 2 integration tests: file_deletion_auto_purge + include_scope_narrowing_does_NOT_purge dogfood discovery (PR #142 1B + multi-root: kebab-docs + httpx + zod + lodash). Per user decision: only filesystem deletion auto-purges; scope narrowing requires explicit kebab reset. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-20 06:51:07 +00:00
altair823	acf8cf3be2	chore: bump version 0.8.3 → 0.9.0 dogfood-discovered routing additions (PR #147) land: - .mts / .cts → MediaType::Code(typescript) - .mdx → MediaType::Markdown Reason for minor bump: expands the user dogfooding surface; 28+ files that were previously skipped are now indexed. Design §10.4 dogfooding-ready surface expansion = minor trigger. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> v0.9.0	2026-05-20 06:29:27 +00:00
altair823	ea5f7b22c8	Merge pull request 'feat(dogfood): route .mts/.cts → typescript + .mdx → markdown' (#147) from feat/dogfood-routing-cts-mts-mdx into main	2026-05-20 06:28:41 +00:00
altair823	5497c6e7b5	feat(dogfood): route .mts/.cts to typescript + .mdx to markdown Dogfood (PR #142 1B + multi-root: kebab-docs + httpx + zod + lodash) showed 28 files skipped by extension that are routable to existing extractors: - .mts (ESM TypeScript) / .cts (CommonJS TypeScript) — same grammar as .ts in tree-sitter-typescript 0.23 (LANGUAGE_TYPESCRIPT covers JSX-agnostic variants; LANGUAGE_TSX stays for .tsx only) - .mdx (Markdown + JSX) — routed as MediaType::Markdown; the md parser folds JSX islands through as raw passthrough Changes: - crates/kebab-source-fs/src/media.rs: 'mts'|'cts' → Code(typescript), 'mdx' → Markdown. +2 unit tests. - crates/kebab-parse-code/src/lang.rs: code_lang_for_path matches mts/cts; module_path_for_tsjs strips .mts/.cts as well. Test cases extended. - crates/kebab-parse-code/src/typescript.rs: doc comment on select_grammar refreshed to mention .mts/.cts. - crates/kebab-parse-code/tests/lang.rs: 2 new assertions. verify: kebab-source-fs 44 / kebab-parse-code lib 20 + lang 4 all pass; clippy clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-20 06:24:21 +00:00
altair823	5a90940f1c	chore: bump version 0.8.2 → 0.8.3 dogfood-discovered fix (PR #146) lands: idempotent re-ingest now correctly returns Unchanged for twin files (identical content at different paths) via document-centric try_skip_unchanged lookup. Reason for patch bump: restores the correct behavior of the advertised idempotency. No new wire / config / surface changes. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> v0.8.3	2026-05-20 06:20:34 +00:00
altair823	4389b887f0	Merge pull request 'fix(dogfood): document-centric try_skip_unchanged for twin-file idempotency' (#146) from fix/dogfood-bug4-idempotent-twin-files into main	2026-05-20 06:16:28 +00:00
altair823	360f825f3a	docs(dogfood): refresh try_skip_unchanged doc-comment to match new flow (PR #146 review) Round 1 review found the function-level doc-comment still described the old asset-side algorithm (item 2 asset-row checksum, item 3 id_for_doc miss). Updated to the document-centric flow. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-20 05:35:17 +00:00
altair823	641b92af7d	fix(dogfood): document-centric try_skip_unchanged for twin-file idempotency Identical-content files at different workspace paths share one assets row (assets.asset_id = blake3 content hash, PRIMARY KEY). The UPSERT `ON CONFLICT(asset_id) DO UPDATE SET workspace_path = excluded` made twin files overwrite each other's workspace_path on every ingest, so `get_asset_by_workspace_path(path1)` returned the OTHER twin's row (or None), breaking idempotent unchanged-detection for both files. Fix: switch try_skip_unchanged to document-centric lookup. `documents.workspace_path` is already UNIQUE (V001) and `id_for_doc(path, ...)` includes path, so each twin has its own stable document row. Compare `doc.source_asset_id` with the new asset's checksum instead of going through the assets table. Dogfood (multi-root: kebab-docs + httpx + zod + lodash) showed 27 of 726 docs marked Updated on every idempotent re-ingest; all 27 are twin-file victims (empty `__init__.py` ×3, AGENTS.md ↔ CLAUDE.md same content, duplicate logo PDFs/JPGs). After: re-ingest reports 0 new / 0 updated / 726 unchanged. No schema migration needed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-20 05:27:21 +00:00
altair823	08fb743598	chore: bump version 0.8.1 → 0.8.2 dogfood-discovered fixes (PR #145) land in production: - schema.v1.repo_breakdown is now actually populated (previously: always an empty BTreeMap) - workspace.include globs are now enforced by the walker (previously: completely ignored) Reason for patch bump: both restore the correct behavior of the advertised surface. No new wire / config / surface changes. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> v0.8.2	2026-05-20 05:20:48 +00:00
altair823	0a2a7ae214	Merge pull request 'fix(dogfood): schema.repo_breakdown + workspace.include walker enforcement (dogfood-discovered)' (#145) from fix/dogfood-bugs-schema-walker-incremental into main	2026-05-20 05:18:59 +00:00
altair823	803d02b68b	fix(dogfood): enforce workspace.include in walker (allow-list semantics) config.workspace.include was completely ignored by the walker — connector.rs log_scope_include_warning literally said "handled by extractor router" but no extractor router exists. Dogfooding (PR #142 1B + multi-root corpus kebab-docs + httpx + zod + lodash) showed user-set include of code+md still ingested 84 .png + 8 .pdf files. Fix: walker treats scope.include as an allow-list — empty Vec preserves backward-compat (all files pass), non-empty requires file path to match at least one pattern (AND with the existing exclude rules). Removed the misleading debug log. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-20 05:15:04 +00:00
altair823	4e8b84c4e0	fix(dogfood): populate schema.v1.repo_breakdown (Task 9 follow-up) Dogfooding (PR #142 1B + multi-root corpus: kebab-docs + httpx + zod + lodash) revealed schema.v1.repo_breakdown is always {} despite the 1A-2 Task 9 having added the code_lang_breakdown sibling. The schema.rs:171 placeholder `BTreeMap::new()` was left in place. Mirror Task 9's code_lang_breakdown query for the repo field — same metadata_json JSON-path pattern. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-20 05:09:19 +00:00
altair823	16dc02cfa2	chore: bump version 0.8.0 → 0.8.1 dogfood-discovered code_lang/repo filter bug (PR #144) fix lands in production. patch bump because: - 1A-1 advertised CLI flags --code-lang / --repo were live but inert (SearchFilters fields propagated but never applied to retriever SQL) - fix restores intended behavior; no new wire surface - user has dogfooded against httpx + zod + lodash and re-validating needs the fixed binary Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> v0.8.1	2026-05-20 03:35:36 +00:00
altair823	74f1b0571b	Merge pull request 'fix(p10-1a-1): apply code_lang + repo filters in lexical SQL and filter_chunks (dogfood)' (#144) from fix/p10-1a-1-code-lang-repo-filter-sql into main	2026-05-20 03:34:53 +00:00
altair823	918ee6c0be	fix(p10-1a-1): apply code_lang + repo filters in lexical SQL and filter_chunks (dogfood-discovered) p10-1A-1 (PR #139) added SearchFilters.code_lang + .repo fields and the CLI --code-lang / --repo flags propagate them correctly into SearchFilters, but neither the lexical retriever's FTS SQL nor the shared filter_chunks helper (used by the vector retriever) ever applied them — so a code-lang-filtered search returned all-doc hits (markdown / pdf / code mixed). Discovered while dogfooding p10-1B with httpx + zod + lodash clones: `kebab search 'AsyncClient' --code-lang python --json` returned markdown hits from httpx/docs/ first. Fix: add IN-list filters on json_extract(d.metadata_json, '$.code_lang') and '$.repo' to both lexical.rs and filters.rs, mirroring the existing media filter pattern. Two regression tests added in each crate covering the new filter behavior. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-20 03:27:01 +00:00
altair823	68ada396f3	Merge pull request 'fix(p10-1b): apply round-1 lang.rs doc + tests/ test case missed in 4503b5b' (#143) from fix/p10-1b-lang-doc-test-staging-miss into main v0.8.0	2026-05-20 02:31:13 +00:00
altair823	23c4ad97b9	fix(p10-1b): apply round-1 lang.rs doc + tests/ test case missed in `4503b5b` The report for PR #142 round-1 fix commit `4503b5b` stated that it included, in lang.rs, (a) a refreshed module_path_for_python doc comment (stating that tests/examples/benches are intentionally not stripped) and (b) an added tests/test_foo.py → tests.test_foo assertion, but the lang.rs changes were never staged in the actual commit and so did not reach main (review loop round 2 trusted only the working-tree state and did not verify the commit). This PR retroactively applies only the missing items (5)+(6). lang.rs +9 lines (1 test + 4 doc + 2 comment + 2 blank). cargo test -p kebab-parse-code --lib → 20/20 pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-20 02:28:53 +00:00
altair823	1f566b8bfa	Merge pull request 'feat(p10-1B): Python + TS/JS AST chunkers: enable tree-sitter-{python,typescript,javascript} code indexing' (#142) from feat/p10-1b-py-ts-js into main	2026-05-20 02:26:24 +00:00
altair823	26562588e3	fix(p10-1b): PR review round 2 — fold TS class-method decorators into unit line range Round 1 push-back on TS/JS class-method decorator handling was based on an inaccurate doc comment in typescript.rs that claimed decorators are method_definition children; tree-sitter-typescript 0.23 actually places them as class_body preceding siblings. Round 2 correctly identified the cross-language inconsistency with Python's decorated_definition arm. Fix: extend unit_start backward walk in typescript.rs to also accept 'decorator' siblings (three-line change + corrected doc comment). javascript.rs is unaffected: tree-sitter-javascript stores the decorator as a named child INSIDE method_definition, so method_definition.start_row already covers the decorator line without any sibling walk. Adds three regression tests: - class_method_decorator_folded_into_method_unit (TS): asserts @Log() is inside the emitted method unit code and line_start == 2. - ts_class_decorator_folded_into_class_unit (TS): class-level @Injectable() folded into the class unit, line_start == 1. - js_class_method_decorator_already_folded_by_grammar (JS): documents that JS already includes the decorator via grammar semantics. verify: per-crate cargo test (20 passed) + clippy clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-20 02:20:22 +00:00
altair823	4503b5b12f	fix(p10-1b): PR review round 1 — 5 actionable items (1) tasks/HOTFIXES.md: add 2026-05-20 entry for path-sanitize gap in module_path_for_python / _tsjs (promised in task spec line 55 but not landed in round 0). Bidirectional cross-link added. (2) crates/kebab-parse-code: dedup filename_from_workspace_path / strip_extension / join_symbol via new pub(crate) module scaffold.rs. Removed 9 byte-identical fn copies across rust/python/typescript/ javascript extractors. Pure refactor — no behavior change. (3) crates/kebab-parse-code/tests/fixtures/sample.py: @staticmethod was semantically inappropriate on a module-level fn (class-method decorator). Changed to @no_type_check; test assertion updated. (5)+(6) crates/kebab-parse-code/src/lang.rs: add tests/test_foo.py case to module_path_for_python test + doc clarifying that tests/ / examples/ / benches/ are intentionally not stripped. (4) PUSH BACK — TS/JS class decorator handling is the design intent of the 1B first pass (typescript.rs:242-244 + HOTFIXES entry 2 already in place). No code change. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-20 02:03:52 +00:00
altair823	44813df052	docs(p10-1b): README/HANDOFF/ARCHITECTURE/SMOKE/INDEX + HOTFIXES; chore: bump version 0.7.0 → 0.8.0 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-20 01:48:06 +00:00
altair823	d6bb6cfd3b	test(p10-1b): per-language chunker snapshots (python/ts/js) Mirrors code_rust_ast_snapshot pattern. In-memory CanonicalDocument build so no kebab-parse-code dep (boundary §6.3 respected). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-20 01:39:17 +00:00
altair823	d53995a6d4	feat(p10-1b): code-js-ast-v1 chunker + activate JavaScript in app dispatch Chunker: duplicate-with-substitution from code-ts-ast-v1 / code-rust-ast-v1. Dispatch: replaces JS bail! arms with JavascriptAstExtractor + CodeJsAstV1Chunker. Integration test javascript_file_ingests_and_searches_as_code_citation asserts citation.lang=javascript, symbol=src/Bar.Bar.baz, code_lang=javascript. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-20 01:16:07 +00:00
altair823	c215034653	feat(p10-1b): tree-sitter-javascript AST extractor (JS + JSX) Single-grammar variant of typescript.rs — JS handles .jsx via the same LanguageFn. No interface/type/enum arms; otherwise identical mapping + workspace-path prefix via module_path_for_tsjs. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-20 01:09:22 +00:00
altair823	31245a4328	fix(p10-1b): TS parser_version code-typescript-v1 → code-ts-v1 (naming consistency) Task H implementer chose code-typescript-v1 but plan + design §3.3 use the short form (chunker is code-ts-ast-v1 / code-js-ast-v1). Aligning parser versions to match: rust=code-rust-v1 / python=code-python-v1 / ts=code-ts-v1 / js=code-js-v1 (Task K). Fixes 2 sites: const PARSER_VERSION + integration test assertion. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-20 01:05:17 +00:00
altair823	acb61b6830	feat(p10-1b): activate TypeScript in ingest_one_code_asset dispatch Replaces TS bail! arms with TypescriptAstExtractor + CodeTsAstV1Chunker. Adds typescript_file_ingests_and_searches_as_code_citation integration test — asserts citation.lang=typescript, symbol=src/Foo.Foo.bar, code_lang=typescript. JS arms remain bail!() (Task L). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-20 00:59:41 +00:00
altair823	20feb3133e	feat(p10-1b): code-ts-ast-v1 chunker (1:1 + oversize split) Duplicate of code-rust-ast-v1 / code-python-ast-v1 with language-agnostic body unchanged. Cross-chunker policy_hash identity asserted vs md-heading-v1. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-20 00:56:41 +00:00
altair823	de63f161ac	feat(p10-1b): tree-sitter-typescript AST extractor (TS + TSX via grammar selection) Adds `kebab_parse_code::typescript::TypescriptAstExtractor` (PARSER_VERSION `code-typescript-v1`), mirroring the Python extractor (P10-1B Task E) and the Rust scaffold (P10-1A-2). One `Block::Code` per top-level AST semantic unit (free fn / class / each method / interface / type alias / enum, recursively per nested class), each carrying `SourceSpan::Code` with the unit's dotted symbol path prefixed by `module_path_for_tsjs`. Grammar selection per `tree-sitter-typescript` 0.23: the workspace path's `.tsx` extension routes to `LANGUAGE_TSX`, everything else to `LANGUAGE_TYPESCRIPT`. The `export_statement` arm unwraps a `declaration` field (`function_declaration` / `class_declaration` / `interface_declaration` / `type_alias_declaration` / `enum_declaration`) using the OUTER statement's line range so `export ` is folded in; for `export default function () {}` and `export default class {}` (where the inner node sits under the `value` field as `function_expression` / `class` with no `name`), the symbol leaf is `default`. Bare value exports / re-exports fall into glue. Glue grouping reuses the Python post-pass: `<module>` only when the entire group is imports + bare re-exports; demoted to `<top-level>` if the file produced any real unit. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-20 00:54:27 +00:00
altair823	1815091247	feat(p10-1b): activate Python in ingest_one_code_asset dispatch Replaces Python bail! arms with PythonAstExtractor + CodePythonAstV1Chunker. Adds python_file_ingests_and_searches_as_code_citation integration test — asserts citation.lang=python, symbol=kebab_eval.metrics.compute_mrr, code_lang=python. TS/JS arms remain bail!() (Tasks J/L). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-20 00:49:01 +00:00
altair823	6a0b340941	feat(p10-1b): code-python-ast-v1 chunker (1:1 + oversize split) Duplicate of code-rust-ast-v1 with language-agnostic body unchanged. Cross-chunker policy_hash identity asserted vs md-heading-v1. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-20 00:46:17 +00:00
altair823	9664e97497	feat(p10-1b): tree-sitter-python AST extractor (PythonAstExtractor) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-20 00:41:35 +00:00
altair823	8bdb3e8090	refactor(p10-1b): generalize ingest_one_code_asset for multi-language dispatch Rust path observably unchanged (verified by existing code_ingest_smoke tests). Python/TS/JS arms bail with TODO; per-lang extractor + chunker land in subsequent tasks. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-20 00:35:53 +00:00
altair823	dcad9ccda2	feat(p10-1b): module_path_for_python / _tsjs helpers (workspace path → module prefix) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-20 00:31:33 +00:00
altair823	ed0f4769b3	feat(p10-1b): route .py/.pyi/.ts/.tsx/.js/.mjs/.cjs/.jsx to MediaType::Code Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-20 00:30:07 +00:00
altair823	0c61758931	build(p10-1b): add tree-sitter-python/-typescript/-javascript workspace deps Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-20 00:28:31 +00:00
altair823	39b766ea59	docs(p10-1b): task spec + implementation plan Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-20 00:26:58 +00:00
altair823	7f287abacb	Merge pull request 'test(eval): normalize elapsed_ms before determinism comparison (flake fix)' (#141) from fix/eval-runner-timing-flake into main	2026-05-20 00:08:40 +00:00
altair823	d715631928	test(eval): normalize elapsed_ms before determinism comparison (flake fix) `runner_lexical_is_deterministic_per_query_payload` had a timing flake: on the first full-suite run it intermittently failed on an `elapsed_ms: 0` vs `elapsed_ms: 1` difference (observed in the full-suite run of PR #140 round 0). Cause: the test compares the entire per_query JSON byte-for-byte, but QueryResult.elapsed_ms is timing-based, so µs-scale wall-clock jitter leaks directly into the comparison. The intent is "byte-identical apart from timing"; the adjacent snapshot test #7 explicitly excludes timing via a projection, but #6 did not. Fix: normalize elapsed_ms to 0 on both runs just before the comparison. This expresses the intent as stated and preserves the determinism check on the other fields. Passes a 50-iteration stress run (previously: intermittent failures). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-20 00:01:41 +00:00
altair823	73e5b359d8	Merge pull request 'feat(p10-1A-2): Rust AST chunker: enable tree-sitter-rust code indexing' (#140) from feat/p10-1a-2-rust-ast-chunker into main v0.7.0	2026-05-19 23:40:15 +00:00


720 Commits
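
The conservative deletion sweep described in commits `27baec82ea` and `2baa846c6b` could be sketched roughly as follows. This is a minimal illustration under assumptions, not the project's actual code: the function name `deleted_candidates` and the injected `exists` probe are hypothetical, introduced here so the logic can be exercised without touching the disk. Only the set difference (stored paths minus scanned paths) and the `try_exists().unwrap_or(true)` idiom are taken from the commit messages.

```rust
use std::collections::BTreeSet;
use std::io;

/// Hypothetical helper: returns the stored paths that the walker did not
/// scan AND that the probe confirms are missing. A probe error is treated
/// as "file present", so an uncertain signal never leads to a purge.
fn deleted_candidates<F>(
    stored: &BTreeSet<String>,
    scanned: &BTreeSet<String>,
    exists: F,
) -> Vec<String>
where
    F: Fn(&str) -> io::Result<bool>,
{
    stored
        .difference(scanned) // stored_paths - scanned_paths
        .filter(|p| !exists(p.as_str()).unwrap_or(true)) // Err => assume present
        .cloned()
        .collect()
}

/// Real filesystem probe built on the standard library's `Path::try_exists`.
#[allow(dead_code)]
fn fs_exists(path: &str) -> io::Result<bool> {
    std::path::Path::new(path).try_exists()
}
```

With a probe that reports one path as missing and fails on another (for example with a permission error), only the confirmed-missing path is returned. A file that is on disk but outside the include scope is also kept, because its probe returns `Ok(true)`.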