Implement the §3.8 multi-turn behaviour from spec PR #59. The RAG facade accepts prior turns and prepends them to the prompt, applies retrieval query expansion, and fills conversation_id / turn_index on Answer.

New (kebab-core):
- Add conversation_id (Option<String>) / turn_index (Option<u32>) fields to Answer. serde skip_serializing_if leaves the single-shot wire output unchanged (no impact on existing external wrappers).
- Turn struct (question + answer + citations + created_at).
- RefusalReason::LlmStreamAborted variant.

New (kebab-rag):
- Three new AskOpts fields: history (Vec<Turn>) / conversation_id / turn_index.
- AskOpts::single_shot(mode) helper.
- RagPipeline::ask_with_history(query, history, conversation_id, turn_index, opts): calls ask with the combined opts.
- expand_query_with_history: extends SearchQuery.text by concatenating the first 200 chars of the answer in history.last() (the "cheap concat" of spec §3.8; LLM-based standalone-question rewriting is P+).
- serialize_history + remaining_history_budget_chars: the spec's priority enforcement. system + question are mandatory; within the char budget left after the retrieved chunks, the newest turns are included first and the oldest are dropped.
- ask body: when history is non-empty, an [이전 대화] ("previous conversation") block is prepended above the user prompt. All three Answer construction sites (normal / NoChunks / ScoreGate refuse) fill conversation_id / turn_index.

New (kebab-store-sqlite):
- refusal_reason_label maps LlmStreamAborted → 'llm_stream_aborted'.

Existing callers updated (single-shot behaviour unchanged):
- kebab-cli main.rs Cmd::Ask: AskOpts explicitly sets history=Vec::new(), conversation_id=None, turn_index=None (CLI multi-turn will be filled in by --session/--repl in p9-fb-18).
- kebab-tui src/ask.rs spawn site: same (multi-turn UI is p9-fb-16).
- kebab-eval runner.rs golden eval: same (single-shot per query).
- kebab-app tests/ask_smoke.rs / kebab-tui tests/ask.rs / kebab-rag tests/pipeline.rs / kebab-eval metrics.rs: Answer literals updated.

Test:
- 9 new lib unit tests (expand_query 4 / serialize_history 3 / remaining_budget 2).
- The 12 existing tests still PASS, 0 regressions.

Plan update:
- p9-fb-15 status planned → in_progress. After merge, a one-line commit flips it to completed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
| phase | component | task_id | title | status | depends_on | unblocks | contract_source | contract_sections | source_feedback |
|---|---|---|---|---|---|---|---|---|---|
| P9 | kebab-rag + kebab-app | p9-fb-15 | RAG multi-turn — history-aware prompt + token budget | in_progress | | | ../../docs/superpowers/specs/2026-04-27-kebab-final-form-design.md | | p9-dogfooding-feedback.md item 13 |
p9-fb-15 — RAG multi-turn core
Goal
kebab-rag accepts the conversation history (Vec<Turn>) and builds the prompt from it. The prompt is fit within the token budget by adjusting retrieval k and applying the history truncation policy.
Allowed dependencies
- Existing kebab-rag deps.
- tiktoken-rs or an LLM family-specific tokenizer (Gemma tokenization). For now, use a char-based ÷4 approximation (cheap, no extra dependency).
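The char-based ÷4 approximation mentioned above could look roughly like the following. This is a minimal sketch, not the crate's actual code: the function name `approx_tokens` is hypothetical, and counting Unicode scalar values (rather than bytes) and rounding up are assumptions.

```rust
/// Rough token estimate: one token per four characters, rounded up.
/// Assumption: counts Unicode scalar values (chars), not bytes, so
/// Korean text is not penalised for its multi-byte UTF-8 encoding.
pub fn approx_tokens(text: &str) -> usize {
    let chars = text.chars().count();
    (chars + 3) / 4
}
```

Rounding up keeps the estimate conservative for short strings, which matters when many small history turns are summed against the budget.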
Public surface
pub struct Turn {
    pub question: String,
    pub answer: String,
    pub citations: Vec<Citation>,
    pub ts: OffsetDateTime,
}

pub fn ask_with_history(
    cfg: &Config,
    new_question: &str,
    history: &[Turn],
    stream: Sender<RagEvent>,
) -> anyhow::Result<Answer>;
kebab-app also adds ask_with_config_and_history(cfg, q, history, stream).
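As an illustration of how a caller might consume the stream side of this surface, here is a hedged sketch. It assumes the `Sender` is `std::sync::mpsc::Sender` (the spec does not say which channel type is used), and it replaces the `Answer` and error payloads with plain `String` stand-ins. `collect_stream` is a hypothetical helper, not part of kebab-rag or kebab-app.

```rust
use std::sync::mpsc::Receiver;

/// Hypothetical stand-in: the real RagEvent would carry an Answer and an
/// error type instead of plain strings.
#[derive(Debug)]
pub enum RagEvent {
    Token(String),
    Done(String),
    Error(String),
}

/// Drains events until Done or Error and returns the streamed text.
/// A channel that closes without Done is treated as an error.
pub fn collect_stream(rx: Receiver<RagEvent>) -> Result<String, String> {
    let mut streamed = String::new();
    for event in rx {
        match event {
            // Tokens arrive incrementally; append them in order.
            RagEvent::Token(s) => streamed.push_str(&s),
            // Done carries the final answer; here we return what was streamed.
            RagEvent::Done(_final_answer) => return Ok(streamed),
            RagEvent::Error(e) => return Err(e),
        }
    }
    Err("stream closed before Done".to_string())
}
```

A UI would typically render each `Token` as it arrives rather than buffering; buffering is used here only to keep the sketch small.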
Behavior contract
- prompt structure: system_prompt + history_serialized + retrieved_chunks + new_question. The format (roles or plain text) is covered by a prompt_template_version bump (rag-v1 → rag-v2).
- token budget: cfg.rag.max_context_tokens.
  - Priority: system + new_question are always included.
  - Next: retrieved chunks (start from k=cfg.search.default_k and reduce k when the budget is exceeded).
  - Last: history. Include turns newest-first for as long as budget remains; when it runs short, drop the oldest turns. Falling back to 0 turns is allowed.
- retrieval query: new_question + " " + last_turn.answer.first_N_chars(200) concat (cheap query expansion). LLM-based standalone question rewriting is P+.
- streaming: RagEvent::Token(s) / RagEvent::Done(answer) / RagEvent::Error(e).
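The priority order and the cheap query expansion can be sketched as follows. This is an illustrative sketch under stated assumptions, not the kebab-rag implementation: `Turn` is trimmed to two fields, `expand_query`, `fit_to_budget`, and `approx_tokens` are hypothetical names, chunks are assumed to arrive ranked best-first, history is assumed to be ordered oldest-first, and token cost uses the char ÷4 approximation.

```rust
/// Hypothetical, trimmed-down turn; the spec's Turn also has citations and a timestamp.
#[derive(Debug, Clone, PartialEq)]
pub struct Turn {
    pub question: String,
    pub answer: String,
}

/// Char-based ÷4 token estimate, rounded up (assumption).
fn approx_tokens(text: &str) -> usize {
    (text.chars().count() + 3) / 4
}

/// Cheap query expansion: append the first `n` chars of the last answer.
pub fn expand_query(new_question: &str, history: &[Turn], n: usize) -> String {
    match history.last() {
        Some(last) => {
            let head: String = last.answer.chars().take(n).collect();
            format!("{new_question} {head}")
        }
        None => new_question.to_string(),
    }
}

/// Returns (chunks kept, history turns kept) under the priority order:
/// system + question always, then chunks (shrinking k), then newest-first history.
pub fn fit_to_budget<'a>(
    max_tokens: usize,
    system_prompt: &str,
    new_question: &str,
    chunks: &'a [String], // ranked best-first, length = default k
    history: &'a [Turn],  // oldest first
) -> (&'a [String], &'a [Turn]) {
    // System prompt and question are mandatory, so they are charged first.
    let mut remaining = max_tokens
        .saturating_sub(approx_tokens(system_prompt) + approx_tokens(new_question));

    // Keep the longest prefix of ranked chunks that fits (i.e. reduce k).
    let mut k = 0;
    for chunk in chunks {
        let cost = approx_tokens(chunk);
        if cost > remaining {
            break;
        }
        remaining -= cost;
        k += 1;
    }

    // Walk history newest-first; everything older than the first misfit is dropped.
    let mut start = history.len();
    for turn in history.iter().rev() {
        let cost = approx_tokens(&turn.question) + approx_tokens(&turn.answer);
        if cost > remaining {
            break;
        }
        remaining -= cost;
        start -= 1;
    }
    (&chunks[..k], &history[start..])
}
```

Stopping at the first turn that does not fit keeps the retained history contiguous, so the prompt never shows an older turn while skipping a newer one. Whether the real pipeline skips or stops in that case is not stated in the spec.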
Test plan
| kind | description |
|---|---|
| unit | history of 5 turns → oldest turns are dropped first when the token budget is exceeded |
| unit | priority of retrieved_chunks vs history |
| integration | fake history (Q1/A1) + new Q2 → prompt contains Q1/A1 (snapshot) |
DoD
- cargo test -p kebab-rag -p kebab-app passes
- prompt_template_version bump (rag-v2)
- HOTFIXES: none (new)
- frozen design §7 RAG section updated (multi-turn policy)
Out of scope
- LLM-based question rewriting (P+)
- conversation persistence (p9-fb-17)
- UI (p9-fb-16, p9-fb-18)