Files
kebab/crates/kebab-chunk/src/lib.rs
altair823 d53995a6d4 feat(p10-1b): code-js-ast-v1 chunker + activate JavaScript in app dispatch
Chunker: duplicate-with-substitution from code-ts-ast-v1 / code-rust-ast-v1.
Dispatch: replaces JS bail! arms with JavascriptAstExtractor + CodeJsAstV1Chunker.
Integration test javascript_file_ingests_and_searches_as_code_citation asserts
citation.lang=javascript, symbol=src/Bar.Bar.baz, code_lang=javascript.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 01:16:07 +00:00

31 lines
1.1 KiB
Rust

//! `kb-chunk` — chunkers that emit [`kebab_core::Chunk`] batches.
//!
//! Per design §3.5 (Chunk), §4.2 (chunk_id recipe), §7.2 (`Chunker`
//! trait), §0 Q3/§14 (chunking priority).
//!
//! Public surface:
//!
//! * [`MdHeadingV1Chunker`] — heading-aware chunker for Markdown
//! `CanonicalDocument`s, emitting `chunker_version = "md-heading-v1"`.
//!
//! Behavior contract is enumerated on [`MdHeadingV1Chunker`].
//!
//! This crate must NOT depend on any parser implementation
//! (`kb-parse-md`, `kb-parse-pdf`, …), the document/vector store, the
//! embedder, the retriever, the LLM, the RAG layer, or the UI layers.
//! It consumes `CanonicalDocument` purely through `kb-core` types.
mod code_js_ast_v1;
mod code_python_ast_v1;
mod code_rust_ast_v1;
mod code_ts_ast_v1;
mod md_heading_v1;
mod pdf_page_v1;
pub use code_js_ast_v1::CodeJsAstV1Chunker;
pub use code_python_ast_v1::CodePythonAstV1Chunker;
pub use code_rust_ast_v1::CodeRustAstV1Chunker;
pub use code_ts_ast_v1::CodeTsAstV1Chunker;
pub use md_heading_v1::MdHeadingV1Chunker;
pub use pdf_page_v1::PdfPageV1Chunker;