diff --git a/crates/kebab-rag/src/pipeline.rs b/crates/kebab-rag/src/pipeline.rs
index d197e04..f6ee676 100644
--- a/crates/kebab-rag/src/pipeline.rs
+++ b/crates/kebab-rag/src/pipeline.rs
@@ -10,7 +10,9 @@
 //! 3. Pack context — fetch full chunk text via `DocumentStore` and pack
 //! until the `max_context_tokens` budget is exhausted (estimated at
 //! ~4 chars / token, matching the kb-chunk convention).
-//! 4. Render the `rag-v1` prompt (system + user) verbatim per design.
+//! 4. Render the configured `prompt_template_version` prompt (system +
+//! user) verbatim per design — `rag-v1` legacy or `rag-v2` (default,
+//! fb-40) selected via `system_prompt_for`.
 //! 5. Generate via `LanguageModel::generate_stream`. The token loop runs
 //! on the calling thread; `opts.stream_sink` (if any) emits
 //! `StreamEvent::RetrievalDone` once after retrieve+stale-stamp,