kebab/crates/kebab-parse-code/src/lib.rs
th-kim0823 ff11f81f7f feat(p10-1a-1): kebab-parse-code crate (lang + repo + skip)
Tasks 5-8: new `kebab-parse-code` crate with three infrastructure modules
for the code ingest framework. Ships lang.rs (extension→language identifier
mapping), repo.rs (.git walk-up via gix 0.70 for RepoMeta), and skip.rs
(BUILTIN_BLACKLIST, is_generated_file, is_oversized). 14 integration tests
across three test files, all passing; clippy -D warnings clean.

Note: gix pinned to 0.70 (not 0.83 as originally suggested) because 0.83
fails to compile against Rust 1.94.1 due to non-exhaustive match patterns
in gix-hash. 0.70 resolves cleanly and has identical head_name/head_id API.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-15 15:57:59 +09:00


//! `kebab-parse-code` — language-aware parsing for code corpora.
//!
//! Phase 1A-1 ships infrastructure only:
//!
//! - [`lang::code_lang_for_path`] — extension → language identifier.
//! - [`repo::detect_repo`] — `.git/` walk-up → repo / branch / commit metadata.
//! - [`skip::is_generated_file`] / [`skip::is_oversized`] — pre-ingest skip
//!   helpers consulted by `kebab-source-fs`.
//! - [`skip::BUILTIN_BLACKLIST`] — 6-entry safety-net pattern list.
//!
//! Per-language parser modules (`rust`, `python`, `typescript`, …) land in
//! later phases (1A-2 onwards). The crate boundary follows other
//! `kebab-parse-*` crates per design §8: must NOT depend on store / embed
//! / llm / rag.

pub mod lang;
pub mod repo;
pub mod skip;

pub use lang::code_lang_for_path;
pub use repo::{RepoMeta, detect_repo};
pub use skip::{BUILTIN_BLACKLIST, is_generated_file, is_oversized};
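
The module bodies are not shown in this file, so the following is only a minimal sketch of the two ideas the crate docs name: an extension → language lookup and a `.git` walk-up. Every name here (`lang_for_path_sketch`, `find_repo_root_sketch`), the return types, and the three-entry extension table are assumptions for illustration. The real `code_lang_for_path` and `detect_repo` may have different signatures, and `detect_repo` uses gix rather than the plain `std` filesystem check shown here.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical stand-in for `lang::code_lang_for_path`.
/// Maps a file extension (case-insensitive) to a language identifier.
/// The extension table below is illustrative, not the crate's real table.
pub fn lang_for_path_sketch(path: &Path) -> Option<&'static str> {
    // `extension()` is `None` for names like `Makefile` or `.gitignore`.
    let ext = path.extension()?.to_str()?;
    match ext.to_ascii_lowercase().as_str() {
        "rs" => Some("rust"),
        "py" => Some("python"),
        "ts" | "tsx" => Some("typescript"),
        _ => None,
    }
}

/// Hypothetical stand-in for the walk-up step of `repo::detect_repo`.
/// Returns the nearest directory, starting at `start` and moving towards
/// the filesystem root, that contains a `.git` entry. Branch and commit
/// lookup (done through gix in the real crate) is out of scope here.
pub fn find_repo_root_sketch(start: &Path) -> Option<PathBuf> {
    start
        .ancestors()
        .find(|dir| dir.join(".git").exists())
        .map(Path::to_path_buf)
}
```

Under these assumptions, a caller such as `kebab-source-fs` would first resolve the repository root once per source directory and then classify each file by extension, skipping files for which the lookup returns `None`.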