From b81574afa9e9cf0c4a6cf3c10bbaa09c5ffe0d68 Mon Sep 17 00:00:00 2001
From: altair823
Date: Thu, 21 May 2026 22:40:04 +0000
Subject: [PATCH] =?UTF-8?q?docs(p10-1d-followup):=20HOTFIXES=20entry=20?=
 =?UTF-8?q?=E2=80=94=20typedef-wrapped=20struct/enum=20in=20C=20falls=20in?=
 =?UTF-8?q?to=20glue?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

PR #156 reviewer nit #2. Documents the tension between the spec body
("struct_specifier (named, top-level) → 1 unit") and the actual
behavior for the C idiom `typedef struct { ... } Foo;`: the inner
struct_specifier is anonymous, so the whole declaration falls into glue.

Next step: dogfood-driven revisit if this proves to be a frequent pain point.

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 tasks/HOTFIXES.md                     | 14 ++++++++++++++
 tasks/p10/p10-1d-c-cpp-ast-chunker.md |  1 +
 2 files changed, 15 insertions(+)

diff --git a/tasks/HOTFIXES.md b/tasks/HOTFIXES.md
index f64d4ca..b6803e7 100644
--- a/tasks/HOTFIXES.md
+++ b/tasks/HOTFIXES.md
@@ -14,6 +14,20 @@
 historical contract that was implemented; this file accumulates the deltas so
 phase 5+ readers can find the live behavior without diffing git history.
 
+## 2026-05-21 — p10-1D: typedef-wrapped struct/enum in C falls into glue
+
+**Origin**: PR #156 (p10-1d) code-reviewer review. Verified during dogfood.
+
+**Symptom**: `typedef struct { ... } Foo;` in a `.c` file does NOT emit a struct-level unit. tree-sitter-c classifies the construct as a top-level `type_definition` with an *anonymous* inner `struct_specifier` (no `name` field), so the extractor's `struct_specifier` arm doesn't fire and the whole declaration falls into `` glue. The named typedef alias `Foo` is therefore not searchable as a symbol.
+
+**Status**: Consistent with spec p10-1d-c-cpp-ast-chunker.md's Risks/notes ("Anonymous union / struct … anonymous → glue"), but the spec's main body line 22 ("struct_specifier (named, top-level) → 1 unit") suggests this idiom WOULD emit. Tension noted, not yet fixed.
+
+**Workaround**: search the struct by its field/function names, or use `--code-lang c` to broaden scope. Typedef-aliased struct names won't surface as `Citation::Code.symbol`.
+
+**Next step**: dogfood real C code for a week+; if this turns out to be a frequent pain point (kernel-style code, libuv, etc.), revisit the extractor to detect `type_definition` → inner `struct_specifier` and emit a synthetic unit named after the typedef alias.
+
+Cross-link: `tasks/p10/p10-1d-c-cpp-ast-chunker.md` Risks/notes section.
+
 ## 2026-05-20 — p10-1B: Rust 1A-2 symbol path is file-scope-only; 1B+ uses workspace path → module prefix
 
 **What changed**: Symbols produced by P10-1A-2's Rust `code-rust-ast-v1` chunker use only file-scope mod-path nesting (e.g. `Foo::double`). From P10-1B onward, Python / TypeScript / JavaScript symbols include a workspace path → module path prefix (e.g. `kebab_eval.metrics.compute_mrr`, `src/Foo.Foo.search`).
diff --git a/tasks/p10/p10-1d-c-cpp-ast-chunker.md b/tasks/p10/p10-1d-c-cpp-ast-chunker.md
index e8b891d..cad3208 100644
--- a/tasks/p10/p10-1d-c-cpp-ast-chunker.md
+++ b/tasks/p10/p10-1d-c-cpp-ast-chunker.md
@@ -113,6 +113,7 @@ crates/kebab-parse-code/Cargo.toml [edit]: new entries for the 2 deps above.
 - **Template specialization** (`template<> class Foo`): only the `class_specifier` name inside tree-sitter-cpp's `template_declaration` is extracted, so only `Foo` goes into the symbol and `` is not included. Consistent with the design's rule of ignoring generics.
 - **fn inside an `extern "C"` block**: handled as an ordinary fn. The outer wrapping block is glue.
 - **Anonymous union / struct** (`struct { int x; }` inside a variable declaration): uncommon, and only named ones become units. Anonymous ones are glue.
+- **typedef-wrapped struct/enum idiom** (`typedef struct { ... } Foo;`): anonymous inner struct → glue. The named typedef alias is not captured. To be reviewed via HOTFIXES after dogfooding. See [HOTFIXES.md 2026-05-21 entry](../HOTFIXES.md).
 - **Macro-heavy code** (Linux kernel, etc.): even when a `#define FOO(x) ...` macro is function-like, the parser does not recognize it as a fn. It is handled as preprocessor glue, so no symbol is captured. Intended behavior (the parser does not perform macro expansion).
 - **`__attribute__((...))`** annotations: tree-sitter-c's attribute node is a sibling next to the declarator. Can be ignored. No effect on function name extraction.
 - **Fixture size**: sample.c is ~30 lines (top-level fn + struct + enum + preprocessor), sample.cpp is ~50 lines (nested namespace + class + method + template + free fn). Separate verification of the oversize fallback is already covered by 1A-2's long_section_snapshot pattern (separate fixture if needed).
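
Reviewer note: the two declaration shapes discussed in the HOTFIXES entry can be sketched as a small C file. This is a hypothetical fixture, not a file from the repository; the names `point`, `Size`, `Color`, `size_area` and `point_sum` are illustrative assumptions. The comments restate the behavior described in the entry and are not verified extractor output.

```c
/* Hypothetical fixture illustrating the shapes from the HOTFIXES entry. */

/* Shape A: named, top-level struct_specifier.
 * The spec body describes this case as emitting 1 unit. */
struct point {
    int x;
    int y;
};

/* Shape B: typedef-wrapped anonymous struct.
 * Per the entry, tree-sitter-c reports a top-level type_definition whose
 * inner struct_specifier has no name field, so the declaration falls into
 * glue and the alias `Size` is not captured as a symbol. */
typedef struct {
    int width;
    int height;
} Size;

/* Shape B for an enum: the same typedef-wrapped form. */
typedef enum {
    COLOR_RED,
    COLOR_GREEN
} Color;

/* The documented workaround is to search by field or function names,
 * for example a function that operates on the typedef'd type. */
int size_area(const Size *s) {
    return s->width * s->height;
}

/* Counterpart for the named struct, shown for comparison. */
int point_sum(const struct point *p) {
    return p->x + p->y;
}
```

A fixture along these lines could serve the dogfood step by making the gap easy to reproduce: searching for `Size` would be expected to miss, while `size_area` or the field names should still lead to the declaration.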