kebab

Author	SHA1	Message	Date
altair823	e03d03cb26	test: delete alias-only tests + update affected tests/fixtures Remove the alias-channel tests and the insert_chunk_with_aliases helper from kebab-search/tests/lexical.rs (replaced by a body-recall regression test). Remove aliases: None from Chunk literals (embedding_records_fk/idempotency/inspect). Remove the aliases key from the chunk snapshot fixtures. Update config_migrate to use the ingest.code anchor, and update the corpus_revision/search_lexical comments to state the V013 non-bump explicitly. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-02 21:37:58 +00:00
altair823	2619b7bff7	test(chunk): reflect the aliases:null field in AST snapshot fixtures After the aliases field was added to the Chunk struct (alias infrastructure), the chunk-*-ast-v1 snapshot fixtures were left un-updated and failed with drift. chunk_id, text, policy_hash, and tokenized are all unchanged; serialization gained only the single field "aliases": null (no change to chunk-generation logic, not a regression). Re-baked 10 fixtures (code c/cpp/go/java/js/kotlin/python/rust/ts + long_section) with UPDATE_SNAPSHOTS=1. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-31 09:57:16 +00:00
altair823	53ec9b4dc5	test(chunk): regenerate AST + long-section snapshots for V009 chunk field The Chunk struct update in S3 (adding the tokenized_korean_text: Option<String> field in kebab-core) changed the serde serialization output of every chunk snapshot JSON. Regenerate the baselines of the 10 snapshot fixtures (9 AST chunkers + markdown long-section) in V009 form. The change in each snapshot is the addition of a `"tokenized_korean_text": null` field to every chunk JSON (most fixtures are English code, so lindera falls back to None). No behavior change, only a cascade in the serde representation. Spec: docs/superpowers/specs/2026-05-28-v0.20.x-korean-morphological-tokenizer-spec.md §6.2 Plan: docs/superpowers/plans/2026-05-28-v0.20.x-korean-morphological-tokenizer-plan.md (S3 follow-up via S11 sanity)	2026-05-28 12:27:37 +00:00
altair823	b2a2902e38	feat(p10-1d): code-cpp-ast-v1 chunker + snapshot test Identical chunker body to code-c-ast-v1 (per-language work happens in the CppAstExtractor, Task C). Snapshot fixture covers nested namespace + class + ctor/dtor + method + operator overload + template fn + free fn + top-level main, verifying namespace::Class::method symbol convention per design §3.4. 5 chunks emitted: - <top-level> (includes, namespace opening) - kebab::chunk::MdHeadingV1Chunker (class unit) - kebab::identity (template function) - kebab::global_helper (free function in namespace) - main (top-level main function) Template function symbols emit without <T> parameters per spec convention. Namespace::Class::method pattern verified. All tests pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-21 13:46:12 +00:00
altair823	03cd41c48f	feat(p10-1d): code-c-ast-v1 chunker + snapshot test Mirrors code-go-ast-v1's chunker pattern. Snapshot test against tests/fixtures/sample.c (function + typedef struct + typedef enum + preprocessor) verifies symbol list + lang=c stamping. Chunks produced (4 total): - <top-level> glue: includes, defines, static vars, typedefs (lines 1-18) - parse_record function (lines 20-23) - print_record function (lines 25-27) - main function (lines 29-33) All chunks stamped with lang=c and chunker_version=code-c-ast-v1. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-21 13:41:19 +00:00
altair823	0b7d8af759	feat(p10-3): code-text-paragraph-v1 chunker — paragraph + line-window fallback Blank-line paragraph segmentation (whitespace-only lines as boundaries, blank lines themselves never in any chunk's range). Paragraphs > 80 lines split into 80-line windows with 20-line overlap (stride 60), sharing the input lang and symbol=None per spec §9.3. tier2_shared exposes a new build_chunk_no_symbol helper so Chunk id/hash/token semantics stay identical with Tier 1/2. Extracts build_chunk_from_span as private core so build_chunk and build_chunk_no_symbol share mechanics without drift. 4 unit tests cover multi-paragraph shell (4 paragraphs, blank-line boundaries verified), 200-line oversize line-window split (chunks 1-80 / 61-140 / 121-200), empty file, and lang preservation when input is yaml. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-21 11:22:48 +00:00
altair823	22d4161728	feat(p10-2): manifest-file-v1 chunker (whole-file 1 chunk, symbol <manifest>) Emits 1 Chunk per manifest file (Cargo.toml / pyproject.toml / package.json / tsconfig.json / pom.xml / build.gradle / go.mod). Symbol unified to "<manifest>"; manifest type distinguished by code_lang (toml / json / xml / groovy / go-mod) read from Block::Code.lang. Oversize >200 lines splits via tier2_shared::push_chunks_with_oversize. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-20 13:11:46 +00:00
altair823	51004ac593	feat(p10-2): dockerfile-file-v1 chunker (whole-file 1 chunk, symbol <dockerfile>) Reads entire Dockerfile / Dockerfile.* / *.dockerfile content and emits a single Chunk with symbol "<dockerfile>", code_lang "dockerfile", line range 1..EOF. Oversize >200 lines splits into line-windows sharing the symbol via tier2_shared::push_chunks_with_oversize. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-20 13:09:13 +00:00
altair823	8996e73282	feat(p10-2): k8s-manifest-resource-v1 chunker + tier2_shared helper Splits multi-document YAML by ^---\s*$, requires apiVersion + kind string fields per document, emits 1 chunk per recognized k8s resource. Symbol = <kind>/<namespace>/<name> or <kind>/<name> (cluster-scoped). Invalid YAML returns 0 chunks (handled by p10-3 paragraph fallback). Oversize >200 lines splits into line-windows sharing the same symbol. tier2_shared module hosts the oversize fallback + Chunk-construction helper mirroring code_rust_ast_v1's Chunk shape. Task E (dockerfile) and Task F (manifest) will reuse it. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-20 13:06:47 +00:00
altair823	813bdd1a16	test(p10-1c-jk): code-java-ast-v1 + code-kotlin-ast-v1 chunker snapshots Mirrors code_go_ast_snapshot pattern. In-memory CanonicalDocument (no kebab-parse-code dep — boundary §6.3). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-20 10:57:37 +00:00
altair823	ab288135e9	test(p10-1c-go): code-go-ast-v1 chunker snapshot + full-suite gate Mirrors code_python_ast_snapshot / code_ts_ast_snapshot patterns. In-memory CanonicalDocument (no kebab-parse-code dep; boundary §6.3 respected). verify: - cargo test -p kebab-chunk --test code_go_ast_snapshot → 2/2 - cargo test --workspace --no-fail-fast -j 1 → 0 failures (all green) - cargo clippy --workspace --all-targets -- -D warnings → clean - SMOKE: confirmed chunk.ParseDoc symbol + code_lang_breakdown {"go": 1} Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-20 09:54:17 +00:00
altair823	d6bb6cfd3b	test(p10-1b): per-language chunker snapshots (python/ts/js) Mirrors code_rust_ast_snapshot pattern. In-memory CanonicalDocument build so no kebab-parse-code dep (boundary §6.3 respected). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-20 01:39:17 +00:00
altair823	97e9f558f4	test(p10-1a-2): code-rust-ast-v1 chunker snapshot + full-suite gate Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-19 22:14:57 +00:00

13 Commits
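
The line-window fallback described in commit 0b7d8af759 (80-line windows with a 20-line overlap, so a stride of 60) can be sketched as below. This is a minimal illustration under stated assumptions, not the kebab-chunk implementation: the function name `line_windows`, its signature, and the use of 1-based inclusive line ranges are hypothetical choices made for this example.

```rust
/// Split the inclusive, 1-based line range [start, end] into fixed-size
/// windows that overlap by `overlap` lines.
///
/// Hypothetical helper for illustration only; the real chunker's API is
/// not shown in the commit log.
fn line_windows(start: usize, end: usize, window: usize, overlap: usize) -> Vec<(usize, usize)> {
    // The stride must be positive, otherwise the loop would never advance.
    assert!(window > overlap, "window must be larger than overlap");
    let stride = window - overlap;

    let mut out = Vec::new();
    if start > end {
        // Empty range: nothing to emit.
        return out;
    }

    let mut s = start;
    loop {
        // Clamp the window end to the last line of the range.
        let e = (s + window - 1).min(end);
        out.push((s, e));
        if e == end {
            break;
        }
        s += stride;
    }
    out
}
```

With `window = 80` and `overlap = 20`, calling `line_windows(1, 200, 80, 20)` yields the ranges 1-80, 61-140, and 121-200, which matches the 200-line oversize case listed in that commit's unit tests.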