feat(kebab-app + kebab-cli): p9-fb-18 CLI ask --session multi-turn

Dogfooding item 14: add `kebab ask --session <id>` so that multi-turn
conversations also work from the CLI. Built on top of p9-fb-17's
ChatSessionRepo: the first call auto-creates the session, and later
calls receive the prior turns as history for follow-ups. External AI
integrations (Claude Code skill / MCP) can hold stateful conversations
through the same facade.

## Key changes

- **`App::ask_with_session(session_id, query, opts) -> Result<Answer>`**:
  loads the session header, reads prior history via `list_turns`,
  builds the retriever stack (same lexical / vector / hybrid
  branching), calls `RagPipeline::ask_with_history`, auto-creates the
  `chat_sessions` row on the first call (title =
  first_question_title), then appends a new `chat_turns` row.
- **`App::first_question_title(question)`** helper: `trim() + nfc()
  + 40 chars cap`, falling back to `"untitled"`. Reuses the
  unicode-normalization workspace dep.
- **`App::blake3_truncate(input)`** helper: takes the first 16 bytes
  of `blake3(session_id || ":" || turn_index)` as a u128, which
  `format!("{:032x}", ..)` renders as the 32-hex `turn_id`.
- **`ask_with_session_with_config`** facade: the CLI entry point.
- **CLI `--session <id>` flag**: a `session: Option<String>` field on
  `Cmd::Ask`. The handler calls `ask_with_config` (the existing
  single-shot path) when it is None and
  `ask_with_session_with_config` when it is Some(id).
- **Error policy**: if session create / turn append fails, log a warn
  and return the answer unchanged, so the user does not lose the
  compute already spent on the answer. A conservative shape in which
  a persistence failure never blocks the answer response.
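
The session flow and the error policy above can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not the code in this commit: `MemoryStore`, `answer_with_history`, and the `fail_writes` switch are hypothetical stand-ins for the SQLite-backed repo and the RAG pipeline, and session auto-creation is folded into `append_turn` for brevity.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory stand-in for the SQLite-backed session store.
#[derive(Default)]
struct MemoryStore {
    /// session_id -> (question, answer) turns, in order.
    sessions: HashMap<String, Vec<(String, String)>>,
    /// When true, every write fails (simulates a persistence outage).
    fail_writes: bool,
}

impl MemoryStore {
    /// Prior turns for a session; an unknown session yields an empty history.
    fn list_turns(&self, session_id: &str) -> Vec<(String, String)> {
        self.sessions.get(session_id).cloned().unwrap_or_default()
    }

    /// Append a turn, creating the session entry on first use.
    fn append_turn(&mut self, session_id: &str, q: &str, a: &str) -> Result<(), String> {
        if self.fail_writes {
            return Err("simulated write failure".to_string());
        }
        self.sessions
            .entry(session_id.to_string())
            .or_default()
            .push((q.to_string(), a.to_string()));
        Ok(())
    }
}

/// Stand-in for the pipeline: the answer only reports how much history it saw.
fn answer_with_history(query: &str, history: &[(String, String)]) -> String {
    format!("answer to '{query}' with {} prior turn(s)", history.len())
}

/// Returns (turn_index, answer). A failed write is logged, never propagated.
fn ask_with_session(store: &mut MemoryStore, session_id: &str, query: &str) -> (usize, String) {
    let history = store.list_turns(session_id);
    let turn_index = history.len();
    let answer = answer_with_history(query, &history);
    if let Err(e) = store.append_turn(session_id, query, &answer) {
        eprintln!("warn: append_turn failed for {session_id}#{turn_index}: {e}");
    }
    (turn_index, answer)
}
```

In this sketch, two calls with the same `session_id` return turn indices 0 and 1. A call made while `fail_writes` is set still returns its answer, but the turn is not stored, so the next call would see the same history length again.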

## Tests

- `App::first_question_title` 3 unit (trim + cap, empty → untitled,
  korean NFD → NFC)
- `App::blake3_truncate` 1 unit (deterministic + distinct across
  varying session/index)
- Full workspace `cargo test --workspace --no-fail-fast -j 1` exits 0
- `cargo clippy --workspace --all-targets -- -D warnings` clean

## Docs

- README `kebab ask` row: `--session` note + chat_sessions
  auto-creation + `kebab reset --data-only` wipe note
- README **외부 AI 통합** (External AI integration) section: adds one
  sentence saying the Claude Code skill can do multi-turn via
  `--session`
- HANDOFF entry
- spec status planned → in_progress

## Out of scope (spec deviation)

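- `--repl` (stdin loop): specified in the spec, but deferred because
  of the stdin fixture burden. To be decided as a separate follow-up
  task, or after feedback on the `--session` user experience.
- session list / show / delete management commands (Out of scope in
  the spec).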

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 06:20:16 +00:00
parent e8165bcbd6
commit 4f96b1b01d
6 changed files with 270 additions and 7 deletions


@@ -300,6 +300,165 @@ impl App {
        pipeline.ask(query, opts)
    }

    /// p9-fb-18: ask under a persistent chat session. Loads the
    /// session's prior turns (if any), runs the query through
    /// `RagPipeline::ask_with_history`, then appends the new turn
    /// + (auto-)creates the session row on first use.
    ///
    /// `session_id` is caller-supplied. If the session doesn't
    /// exist yet, a new `chat_sessions` row is created with title
    /// derived from the first question (≤40 chars, trimmed and
    /// NFC-normalized). Subsequent calls with the same
    /// `session_id` extend the conversation.
    ///
    /// The returned `Answer` carries `conversation_id = Some(
    /// session_id)` and `turn_index = Some(n)` per p9-fb-15. The
    /// new `chat_turns` row is committed before this method
    /// returns; on persistence error, the answer is still returned
    /// (don't lose the user's compute) but the error is logged so
    /// the operator notices.
    pub fn ask_with_session(
        &self,
        session_id: &str,
        query: &str,
        opts: AskOpts,
    ) -> Result<Answer> {
        use kebab_core::traits::{ChatSessionRepo, ChatSessionRow, ChatTurnRow};
        use std::time::{SystemTime, UNIX_EPOCH};

        // Load (or create) the session header.
        let now_unix = SystemTime::now()
            .duration_since(UNIX_EPOCH)
            .map(|d| d.as_secs() as i64)
            .unwrap_or(0);
        let existing = self.sqlite.get_session(session_id)?;
        let prior_turns = match &existing {
            Some(_) => self.sqlite.list_turns(session_id)?,
            None => Vec::new(),
        };
        let next_index = u32::try_from(prior_turns.len()).unwrap_or(u32::MAX);

        // Build history Vec<Turn> from the persisted rows. Citations
        // are decoded best-effort — a corrupted citations_json
        // becomes an empty Vec rather than a panic (history is
        // advisory, not authoritative).
        let history: Vec<kebab_core::Turn> = prior_turns
            .iter()
            .map(|row| kebab_core::Turn {
                question: row.question.clone(),
                answer: row.answer.clone(),
                citations: serde_json::from_str(&row.citations_json).unwrap_or_default(),
                created_at: time::OffsetDateTime::from_unix_timestamp(row.created_at)
                    .unwrap_or(time::OffsetDateTime::UNIX_EPOCH),
            })
            .collect();
        // Build the retriever stack the same way `ask` does.
        let retriever: Arc<dyn Retriever> = match opts.mode {
            SearchMode::Lexical => Arc::new(LexicalRetriever::with_settings(
                self.sqlite.clone(),
                lexical_index_version(&self.config),
                self.config.search.snippet_chars,
            )),
            SearchMode::Vector => {
                let (emb, vec_store) = self.require_embeddings()?;
                let vec_iv = vector_index_version(emb.as_ref());
                let vec_dyn: Arc<dyn VectorStore + Send + Sync> = vec_store;
                let emb_dyn: Arc<dyn Embedder> = emb;
                Arc::new(VectorRetriever::with_settings(
                    vec_dyn,
                    emb_dyn,
                    self.sqlite.clone(),
                    vec_iv,
                    self.config.search.snippet_chars,
                ))
            }
            SearchMode::Hybrid => {
                let lex = Arc::new(LexicalRetriever::with_settings(
                    self.sqlite.clone(),
                    lexical_index_version(&self.config),
                    self.config.search.snippet_chars,
                )) as Arc<dyn Retriever>;
                let (emb, vec_store) = self.require_embeddings()?;
                let vec_iv = vector_index_version(emb.as_ref());
                let vec_dyn: Arc<dyn VectorStore + Send + Sync> = vec_store;
                let emb_dyn: Arc<dyn Embedder> = emb;
                let vec_retr = Arc::new(VectorRetriever::with_settings(
                    vec_dyn,
                    emb_dyn,
                    self.sqlite.clone(),
                    vec_iv,
                    self.config.search.snippet_chars,
                )) as Arc<dyn Retriever>;
                Arc::new(HybridRetriever::new(&self.config, lex, vec_retr))
            }
        };

        let llm = self.llm()?;
        let pipeline =
            RagPipeline::new(self.config.clone(), retriever, llm, self.sqlite.clone());
        let answer = pipeline.ask_with_history(
            query,
            history,
            session_id.to_string(),
            next_index,
            opts,
        )?;
        // Auto-create the session header on first use. Title from
        // the first question (≤40 chars after trim).
        if existing.is_none() {
            let title = first_question_title(query);
            let session_row = ChatSessionRow {
                session_id: session_id.to_string(),
                created_at: now_unix,
                updated_at: now_unix,
                title: Some(title),
                config_snapshot_json: serde_json::json!({
                    "prompt_template_version": self.config.rag.prompt_template_version,
                    "llm.model": self.config.models.llm.model,
                    "max_context_tokens": self.config.rag.max_context_tokens,
                })
                .to_string(),
            };
            if let Err(e) = self.sqlite.create_session(&session_row) {
                tracing::warn!(
                    target: "kebab-app",
                    error = %e,
                    session_id = %session_id,
                    "ask_with_session: create_session failed; continuing — turn append will surface a more useful error"
                );
            }
        }

        // Append the new turn. Failure is logged but does NOT mask
        // the answer — the user still gets their response, the
        // operator sees the persistence error in the warn log.
        let turn_id = format!(
            "{:032x}",
            blake3_truncate(&format!("{session_id}:{next_index}")),
        );
        let turn_row = ChatTurnRow {
            turn_id,
            session_id: session_id.to_string(),
            turn_index: next_index,
            question: query.to_string(),
            answer: answer.answer.clone(),
            citations_json: serde_json::to_string(&answer.citations)
                .unwrap_or_else(|_| "[]".to_string()),
            created_at: now_unix,
        };
        if let Err(e) = self.sqlite.append_turn(&turn_row) {
            tracing::warn!(
                target: "kebab-app",
                error = %e,
                session_id = %session_id,
                turn_index = next_index,
                "ask_with_session: append_turn failed; answer returned regardless"
            );
        }
        Ok(answer)
    }

    /// Returns `true` when the workspace has embeddings turned off
    /// (`provider = "none"` or `dimensions = 0`). Lexical-only mode.
    pub(crate) fn embeddings_disabled(&self) -> bool {
@@ -446,3 +605,73 @@ fn vector_index_version(embedder: &dyn Embedder) -> IndexVersion {
embedder.dimensions(),
))
}

/// p9-fb-18: derive a chat-session title from the first question.
/// Trim, NFC, take first ~40 chars. Always non-empty (falls back
/// to `"untitled"`) — same defensive shape as kebab-normalize's
/// derive_title.
fn first_question_title(question: &str) -> String {
    use unicode_normalization::UnicodeNormalization;
    let nfc: String = question.trim().nfc().collect();
    let truncated: String = nfc.chars().take(40).collect();
    if truncated.is_empty() {
        "untitled".to_string()
    } else {
        truncated
    }
}

/// p9-fb-18: 32-hex `turn_id` derived from session_id + turn_index.
/// blake3 hash truncated to first 16 bytes; format as 32-char lowercase
/// hex so it slots into the `chat_turns.turn_id` column without
/// collision concerns under any realistic per-session turn count.
fn blake3_truncate(input: &str) -> u128 {
    let hash = blake3::hash(input.as_bytes());
    let bytes = hash.as_bytes();
    let mut buf = [0u8; 16];
    buf.copy_from_slice(&bytes[..16]);
    u128::from_be_bytes(buf)
}

#[cfg(test)]
mod tests {
    use super::*;

    /// p9-fb-18: title trims, NFC-normalizes, caps at 40 chars.
    #[test]
    fn first_question_title_trims_and_caps() {
        assert_eq!(first_question_title(" hello "), "hello");
        let long = "a".repeat(100);
        assert_eq!(first_question_title(&long).chars().count(), 40);
    }

    /// p9-fb-18: empty / whitespace-only question falls back to
    /// `"untitled"` (never returns empty).
    #[test]
    fn first_question_title_falls_back_to_untitled() {
        assert_eq!(first_question_title(""), "untitled");
        assert_eq!(first_question_title(" "), "untitled");
        assert_eq!(first_question_title("\t\n"), "untitled");
    }

    /// p9-fb-18: korean NFD → NFC.
    #[test]
    fn first_question_title_nfc_normalizes_korean() {
        let nfd = "\u{1100}\u{1161}".to_string(); // 가 (NFD)
        let title = first_question_title(&nfd);
        assert_eq!(title, "\u{AC00}", "expected NFC composed form");
    }

    /// p9-fb-18: blake3_truncate is deterministic and differs across
    /// distinct inputs.
    #[test]
    fn blake3_truncate_deterministic_and_distinct() {
        let a = blake3_truncate("session-x:0");
        let b = blake3_truncate("session-x:0");
        let c = blake3_truncate("session-x:1");
        let d = blake3_truncate("session-y:0");
        assert_eq!(a, b, "same input → same hash");
        assert_ne!(a, c, "different turn_index → different hash");
        assert_ne!(a, d, "different session_id → different hash");
    }
}
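
The `turn_id` formatting in the hunk above (first 16 digest bytes, big-endian `u128`, `{:032x}`) can be checked in isolation. The sketch below is an illustration only: it assumes a plain 32-byte array in place of the blake3 output, and `truncate_to_u128` and `to_turn_id` are hypothetical names that do not appear in the commit.

```rust
/// Take the first 16 bytes of a 32-byte digest as a big-endian u128.
/// The trailing 16 bytes are ignored.
fn truncate_to_u128(digest: &[u8; 32]) -> u128 {
    let mut buf = [0u8; 16];
    buf.copy_from_slice(&digest[..16]);
    u128::from_be_bytes(buf)
}

/// Render the value as lowercase hex, zero-padded on the left to 32 chars.
fn to_turn_id(value: u128) -> String {
    format!("{value:032x}")
}
```

Because the width is fixed at 32 and a `u128` never needs more than 32 hex digits, every id has the same length: a value of 1 renders as 31 zeros followed by `1`, and a digest whose first byte is `0xab` yields an id starting with `ab`.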


@@ -1499,6 +1499,23 @@ pub fn ask_with_config(
    App::open_with_config(config)?.ask(query, opts)
}

/// p9-fb-18: ask under a persistent chat session. Loads prior turns
/// from `chat_sessions[session_id]`, runs the query as a follow-up
/// (via `RagPipeline::ask_with_history`), and appends the new turn
/// — auto-creating the session header on first use. Returns an
/// `Answer` with `conversation_id = Some(session_id)` and
/// `turn_index` set to the new (post-append) index. CLI `kebab
/// ask --session <id>` entry point (p9-fb-18).
#[doc(hidden)]
pub fn ask_with_session_with_config(
    config: kebab_config::Config,
    session_id: &str,
    query: &str,
    opts: AskOpts,
) -> anyhow::Result<Answer> {
    App::open_with_config(config)?.ask_with_session(session_id, query, opts)
}

/// Run the doctor checks against the explicit config path the user
/// requested via `--config` (or the XDG default if `None`). The
/// `config_loaded` check reports the actual path probed and the