Commit Graph

13 Commits

Author SHA1 Message Date
31245a4328 fix(p10-1b): TS parser_version code-typescript-v1 → code-ts-v1 (naming consistency)
Task H implementer chose code-typescript-v1 but plan + design §3.3 use the
short form (chunker is code-ts-ast-v1 / code-js-ast-v1). Aligning parser
versions to match: rust=code-rust-v1 / python=code-python-v1 / ts=code-ts-v1
/ js=code-js-v1 (Task K). Fixes 2 sites: const PARSER_VERSION + integration
test assertion.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 01:05:17 +00:00
de63f161ac feat(p10-1b): tree-sitter-typescript AST extractor (TS + TSX via grammar selection)
Adds `kebab_parse_code::typescript::TypescriptAstExtractor` (PARSER_VERSION
`code-typescript-v1`), mirroring the Python extractor (P10-1B Task E) and
the Rust scaffold (P10-1A-2). One `Block::Code` per top-level AST semantic
unit (free fn / class / each method / interface / type alias / enum,
recursively per nested class), each carrying `SourceSpan::Code` with the
unit's dotted symbol path prefixed by `module_path_for_tsjs`.

Grammar selection per `tree-sitter-typescript` 0.23: the workspace path's
`.tsx` extension routes to `LANGUAGE_TSX`, everything else to
`LANGUAGE_TYPESCRIPT`. The `export_statement` arm unwraps a `declaration`
field (`function_declaration` / `class_declaration` / `interface_declaration`
/ `type_alias_declaration` / `enum_declaration`) using the OUTER statement's
line range so `export ` is folded in; for `export default function () {}`
and `export default class {}` (where the inner node sits under the `value`
field as `function_expression` / `class` with no `name`), the symbol leaf
is `default`. Bare value exports / re-exports fall into glue.

Glue grouping reuses the Python post-pass: `<module>` only when the entire
group is imports + bare re-exports; demoted to `<top-level>` if the file
produced any real unit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 00:54:27 +00:00
9664e97497 feat(p10-1b): tree-sitter-python AST extractor (PythonAstExtractor)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 00:41:35 +00:00
dcad9ccda2 feat(p10-1b): module_path_for_python / _tsjs helpers (workspace path → module prefix)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 00:31:33 +00:00
0c61758931 build(p10-1b): add tree-sitter-python/-typescript/-javascript workspace deps
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 00:28:31 +00:00
c780aca904 fix(p10-1a-2): PR review round 2 — README wire fields + SMOKE config completeness + edge-case note + gitignore dedup
PR #140 회차 2 actionable 4건:
- README.md: `citation.kind = "code"` 행에서 wire 필드 구조 정정 — citation 안에는 `lang`, SearchHit top-level 에는 `code_lang`/`repo` (round 1 SMOKE 정정과 동일 클래스)
- docs/SMOKE.md: 격리 config 블록에 `extra_skip_globs = []` 추가 (P10 섹션의 "위 격리 config 블록 참조" 와 정합)
- crates/kebab-parse-code/src/rust.rs: comment-only 파일 → 0 blocks 동작을 module doc 에 한 줄 명시 (pdf-page-v1 의 "empty page produces no chunks" 패턴과 동일)
- .gitignore: `/target/` 제거 — `/target` (no trailing slash) 이 디렉토리 + 파일 + 심링크 모두 매칭하므로 `/target/` (dir 전용) 는 redundant

verify: `cargo check -p kebab-parse-code` clean (주석/문서 외 영향 없음).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 23:35:00 +00:00
b1d5047399 fix(p10-1a-2): PR review round 1 — doc inconsistencies + observable backfill error path
PR #140 회차 1 actionable 7건 반영:
- docs/SMOKE.md: parser_version "code-rust-ast-v1" → "code-rust-v1" (chunker_version 과 혼동); jq path .citation.code_lang → .citation.lang (wire 의 code_lang 은 SearchHit top-level)
- docs/ARCHITECTURE.md: Mermaid pcode→ptypes 잘못된 edge → pcode→core 로 정정 (kebab-parse-code Cargo.toml 실제 dep 와 일치); 디렉토리 트리에서 code-rust-ast-v1 chunker 표기 위치 kebab-parse-code → kebab-chunk 로 정정
- crates/kebab-app/src/app.rs: backfill_repo 의 .ok().flatten() 실패 silent swallow → tracing::warn 로 관측 가능, 비-abort 의도 보존
- crates/kebab-parse-code/src/rust.rs: impl_item arm 의 "function_item 만 unit 생성" 1A scope 한정 주석을 외부에서도 보이도록 arm 상단에 한 줄 추가 (내부 주석은 유지)

verify: kebab-parse-code 7/7 / kebab-app --lib 51/51 / code_ingest_smoke 3/3 green; touched-crate clippy clean (재부팅 전 검증).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 23:24:20 +00:00
df85bafa7f fix(p10-1a-2): module-prefix glue symbols + crate desc + invariant hardening (Task 6 review)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 17:35:52 +00:00
a93b33ffbe fix(p10-1a-2): correct <module> label scope + de-dup leading attribute (Task 6 review)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 17:28:49 +00:00
402a4506a2 feat(p10-1a-2): tree-sitter-rust AST extractor (parser-side)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 17:22:09 +00:00
5c265bb59f build(p10-1a-2): add tree-sitter + tree-sitter-rust workspace deps
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 15:38:19 +00:00
th-kim0823
2a8451c033 fix(p10-1a-1): tighten kebab-parse-code manifest + tests
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-15 16:05:34 +09:00
th-kim0823
ff11f81f7f feat(p10-1a-1): kebab-parse-code crate (lang + repo + skip)
Tasks 5-8: new `kebab-parse-code` crate with three infrastructure modules
for the code ingest framework. Ships lang.rs (extension→language identifier
mapping), repo.rs (.git walk-up via gix 0.70 for RepoMeta), and skip.rs
(BUILTIN_BLACKLIST, is_generated_file, is_oversized). 14 integration tests
across three test files, all passing; clippy -D warnings clean.

Note: gix pinned to 0.70 (not 0.83 as originally suggested) because 0.83
fails to compile against Rust 1.94.1 due to non-exhaustive match patterns
in gix-hash. 0.70 resolves cleanly and has identical head_name/head_id API.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-15 15:57:59 +09:00