Implement the §3.8 multi-turn behaviour from Spec PR #59. The RAG facade takes prior turns and prepends them to the prompt, applies retrieval query expansion, and fills conversation_id / turn_index on Answer.

New (kebab-core):
- Add conversation_id (Option<String>) and turn_index (Option<u32>) fields to Answer. serde skip_serializing_if leaves the single-shot wire output unchanged, so existing external wrappers are unaffected.
- Turn struct (question + answer + citations + created_at).
- RefusalReason::LlmStreamAborted variant.

New (kebab-rag):
- Three new AskOpts fields: history (Vec<Turn>), conversation_id, turn_index.
- AskOpts::single_shot(mode) helper.
- RagPipeline::ask_with_history(query, history, conversation_id, turn_index, opts): calls ask with the combined opts.
- expand_query_with_history: concatenates the first 200 characters of history.last()'s answer to extend SearchQuery.text (the "cheap concat" of spec §3.8; LLM-based standalone-question rewriting is P+).
- serialize_history + remaining_history_budget_chars: the spec's priority enforcement. System prompt and question are mandatory; within the char budget left after the retrieved chunks, the newest turns are kept first and the oldest are dropped.
- ask body: when history is non-empty, a [이전 대화] ("previous conversation") block is prepended above the user prompt. All three Answer construction sites (normal / NoChunks / ScoreGate refuse) fill conversation_id / turn_index.

New (kebab-store-sqlite):
- refusal_reason_label maps LlmStreamAborted → 'llm_stream_aborted'.

Existing callers updated (single-shot behaviour unchanged):
- kebab-cli main.rs Cmd::Ask: AskOpts explicitly sets history=Vec::new(), conversation_id=None, turn_index=None (CLI multi-turn is filled in by --session/--repl from p9-fb-18).
- kebab-tui src/ask.rs spawn site: same (multi-turn UI is p9-fb-16).
- kebab-eval runner.rs golden eval: same (single-shot per query).
- kebab-app tests/ask_smoke.rs / kebab-tui tests/ask.rs / kebab-rag tests/pipeline.rs / kebab-eval metrics.rs: Answer literals updated.

Test:
- 9 new lib unit tests (expand_query 4 / serialize_history 3 / remaining_budget 2).
- Existing 12 still PASS, 0 regressions.

Plan update:
- p9-fb-15 status planned → in_progress. After merge, flip to completed in a one-line commit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
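The "cheap concat" query expansion described in the commit message can be sketched roughly as follows. This is only an illustration under stated assumptions, not the kebab-rag implementation: the `Turn` here is a trimmed stand-in with two fields, the function works on plain strings instead of `SearchQuery`, and the order of concatenation (query first, then the answer prefix) and the single-space separator are guesses.

```rust
/// Hypothetical, trimmed stand-in for kebab-core's `Turn`
/// (the real struct also carries citations and created_at).
pub struct Turn {
    pub question: String,
    pub answer: String,
}

/// How many characters of the previous answer are borrowed (from the commit message).
const ANSWER_PREFIX_CHARS: usize = 200;

/// Extend the retrieval query with the start of the newest turn's answer.
/// With empty history the query is returned unchanged.
/// The "query + space + prefix" layout is an assumption of this sketch.
pub fn expand_query_with_history(query: &str, history: &[Turn]) -> String {
    match history.last() {
        None => query.to_string(),
        Some(last) => {
            // `chars()` iterates Unicode scalar values, so multi-byte text
            // (e.g. Korean) is never cut in the middle of a character.
            let prefix: String = last.answer.chars().take(ANSWER_PREFIX_CHARS).collect();
            if prefix.trim().is_empty() {
                query.to_string()
            } else {
                format!("{query} {prefix}")
            }
        }
    }
}
```

Counting characters rather than bytes matters here because slicing a `String` at byte 200 can panic when the index falls inside a multi-byte character.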
93 lines
3.0 KiB
Rust
//! Answer + RAG types (§3.8).

use serde::{Deserialize, Serialize};
use time::OffsetDateTime;

use crate::citation::Citation;
use crate::search::SearchMode;
use crate::versions::PromptTemplateVersion;

#[derive(Clone, Debug, PartialEq, Serialize, Deserialize)]
pub struct Answer {
    pub answer: String,
    pub citations: Vec<AnswerCitation>,
    pub grounded: bool,
    pub refusal_reason: Option<RefusalReason>,
    pub model: ModelRef,
    pub embedding: Option<ModelRef>,
    pub prompt_template_version: PromptTemplateVersion,
    pub retrieval: AnswerRetrievalSummary,
    pub usage: TokenUsage,
    #[serde(with = "time::serde::rfc3339")]
    pub created_at: OffsetDateTime,
    /// p9-fb-15: shared by every turn of the same conversation. `None`
    /// for CLI single-shot (no history) and for the first TUI turn.
    /// Either a blake3 hash or a user-supplied value
    /// (`kebab ask --session <id>`, p9-fb-18).
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub conversation_id: Option<String>,
    /// p9-fb-15: 0-based position within the conversation; the first
    /// turn is 0. `None` means single-shot.
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub turn_index: Option<u32>,
}

#[derive(Clone, Debug, PartialEq, Serialize, Deserialize)]
pub struct AnswerCitation {
    pub marker: Option<String>,
    pub citation: Citation,
}

/// p9-fb-15: one turn of history as it enters the prompt. The RAG facade
/// takes a `Vec<Turn>` and builds the prompt from system + history +
/// retrieval + new question. If the history does not fit in the token
/// budget, turns are dropped oldest first (newest turns are preserved).
#[derive(Clone, Debug, PartialEq, Serialize, Deserialize)]
pub struct Turn {
    pub question: String,
    pub answer: String,
    pub citations: Vec<AnswerCitation>,
    #[serde(with = "time::serde::rfc3339")]
    pub created_at: OffsetDateTime,
}

#[derive(Clone, Copy, Debug, Eq, Hash, PartialEq, Serialize, Deserialize)]
#[serde(rename_all = "snake_case")]
pub enum RefusalReason {
    ScoreGate,
    LlmSelfJudge,
    NoIndex,
    NoChunks,
    /// p9-fb-15: the ask was cancelled partway through the LLM token
    /// stream. A partial answer may be present (up to what the user
    /// saw). RAG retrieval itself succeeded; only the model generation
    /// stage was interrupted.
    LlmStreamAborted,
}

#[derive(Clone, Debug, PartialEq, Serialize, Deserialize)]
pub struct ModelRef {
    pub id: String,
    pub provider: String,
    pub dimensions: Option<usize>,
}

#[derive(Clone, Debug, PartialEq, Serialize, Deserialize)]
pub struct AnswerRetrievalSummary {
    pub trace_id: TraceId,
    pub mode: SearchMode,
    pub k: usize,
    pub score_gate: f32,
    pub top_score: f32,
    pub chunks_returned: u32,
    pub chunks_used: u32,
}

#[derive(Clone, Debug, PartialEq, Serialize, Deserialize)]
pub struct TokenUsage {
    pub prompt_tokens: u32,
    pub completion_tokens: u32,
    pub latency_ms: u32,
}

#[derive(Clone, Debug, Eq, Hash, PartialEq, Serialize, Deserialize)]
pub struct TraceId(pub String);
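The `Turn` doc comment describes dropping the oldest turns first when history does not fit the budget. A minimal sketch of that policy is shown below, assuming a character budget as in the commit message. The function names mirror `serialize_history` and `remaining_history_budget_chars`, but the signatures, the `Q:` / `A:` rendering, and the trimmed `Turn` are all hypothetical and may differ from the real kebab-rag code.

```rust
/// Hypothetical, trimmed stand-in for `Turn` (no citations or timestamp).
pub struct Turn {
    pub question: String,
    pub answer: String,
}

/// Characters left for history after the mandatory parts (system prompt,
/// question) and the retrieved chunks are accounted for. Saturates at 0.
pub fn remaining_history_budget_chars(
    total: usize,
    system: usize,
    question: usize,
    chunks: usize,
) -> usize {
    total
        .saturating_sub(system)
        .saturating_sub(question)
        .saturating_sub(chunks)
}

/// Assumed rendering of one turn; the real prompt format is not shown in the source.
fn render_turn(turn: &Turn) -> String {
    format!("Q: {}\nA: {}\n", turn.question, turn.answer)
}

/// Keep as many of the newest turns as fit in `budget_chars`, then emit
/// them in chronological order. Older turns are dropped.
pub fn serialize_history(history: &[Turn], budget_chars: usize) -> String {
    let mut kept: Vec<String> = Vec::new();
    let mut used = 0usize;
    // Walk from newest to oldest so the newest turns claim the budget first.
    for turn in history.iter().rev() {
        let rendered = render_turn(turn);
        let len = rendered.chars().count();
        if used + len > budget_chars {
            break; // this turn and everything older is dropped
        }
        used += len;
        kept.push(rendered);
    }
    kept.reverse(); // restore oldest-to-newest order for the prompt
    kept.concat()
}
```

Stopping at the first turn that does not fit, instead of skipping it and trying older ones, keeps the retained history contiguous; whether the real implementation makes the same choice is not stated in the source.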