chore(release): v0.3.13 — vision generate timeout 120s → 300s

Large vision models such as gemma4:26b (25B MoE + 550M vision encoder)
take 60-180s to cold-start, so the first call frequently fails under the
default 120s timeout. On the vision path only, the timeout is now
Math.max(timeoutMs, 300_000); text-only requests are unaffected.
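
The timeout rule described above can be sketched in isolation as follows. This is a minimal illustration, not the provider's actual code: `effectiveTimeoutMs`, `withAbortTimeout`, and `VISION_MIN_TIMEOUT_MS` are hypothetical names introduced here, and only the `Math.max(timeoutMs, 300_000)` rule is taken from the commit.

```typescript
// Hypothetical constant: the 300s floor applied to vision requests.
const VISION_MIN_TIMEOUT_MS = 300_000;

// Vision requests get at least 300s; text-only requests keep the configured value.
// A configured timeout above 300s is never shortened.
function effectiveTimeoutMs(timeoutMs: number, useVision: boolean): number {
  return useVision ? Math.max(timeoutMs, VISION_MIN_TIMEOUT_MS) : timeoutMs;
}

// Hypothetical wrapper: aborts `run` via AbortController once `ms` has elapsed,
// and always clears the timer so it cannot fire after the call settles.
async function withAbortTimeout<T>(
  ms: number,
  run: (signal: AbortSignal) => Promise<T>,
): Promise<T> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), ms);
  try {
    return await run(controller.signal);
  } finally {
    clearTimeout(timer);
  }
}
```

Using `Math.max` rather than a fixed 300s means a user who has already configured a longer timeout keeps it, while the default 120s is raised only for vision calls.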

Verified that gemma4:26b supports both the Text and Image modalities
(blog.google/gemma-4, ollama.com/library/gemma4:26b).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
th-kim0823
2026-05-12 13:39:11 +09:00
parent 713553a038
commit bd71bba2da
4 changed files with 35 additions and 4 deletions

@@ -62,8 +62,12 @@ export class LocalOllamaProvider implements InferenceProvider {
? buildVisionPrompt(input.text, input.todayKst, input.dueDateCandidates.map((c) => c.iso ?? c.matchedToken ?? ''), input.vocab ?? [])
: buildPrompt(input.text, input.todayKst, input.dueDateCandidates, input.vocab ?? []);
+// v0.3.13: vision models have a very slow cold start (model load + image encoding),
+// so the first call often fails under the default 120s timeout. For large vision
+// models such as gemma4:26b (MoE 25B), the first generate takes 60-180s.
+// Extend the timeout to 5 minutes (300s).
+const effectiveTimeout = useVision ? Math.max(this.timeoutMs, 300_000) : this.timeoutMs;
this.abortController = new AbortController();
-const timer = setTimeout(() => this.abortController?.abort(), this.timeoutMs);
+const timer = setTimeout(() => this.abortController?.abort(), effectiveTimeout);
try {
const body: Record<string, unknown> = {
model,