2026-05-10 03:10:38 +00:00
29 changed files with 1792 additions and 83 deletions
--- a/docs/superpowers/plans/2026-05-10-v031-cut-f-vision.md
+++ b/docs/superpowers/plans/2026-05-10-v031-cut-f-vision.md
@@ -0,0 +1,841 @@
+# v0.3.1 Cut F: Multimodal Vision AI Implementation Plan
+
+> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
+
+**Goal:** F24: use Ollama vision models (gemma3 family by default). A prompt that combines the image with raw_text produces title/summary/tags automatically. Scope: capability detection (at app launch plus manual refresh), the InferenceProvider extension, AiWorker integration, and a dropdown in the Configure UI.
+
+**Architecture:** The pure function `isVisionCapable(model)` decides whether a model is vision-capable from its family, families, and name. `refreshVisionCache(deps)` calls `/api/tags` and caches the result in settings. AiWorker takes the vision path only when both `note.media.length > 0` and `visionModel` hold (5MB cap per image, then base64 encoding). The Configure UI shows a cache-backed dropdown with a manual refresh button.
+
+**Tech Stack:** undici/fetch (Ollama API), Node fs/promises (image to base64 conversion), Electron IPC, React 19 + zustand 5, vitest 4 + RTL.
+
+**Prerequisite documents:**
+
+- `docs/superpowers/specs/2026-05-09-v031-cut-f-design.md`: source spec (with the Cut F corrections applied: unit count 679, the actual SettingsService API, no 'skipped' enum, no fallback)
+- `docs/superpowers/specs/2026-04-25-dogfood-feedback.md`: F24
+- `docs/superpowers/strategy/v028plus-roadmap.md`: where Cut F sits in the roadmap
+
+---
+
+## File Structure
+
+**Create:**
+
+- `src/main/services/VisionDetect.ts`: `isVisionCapable(model)` (pure) + `refreshVisionCache(deps)` (async, Ollama /api/tags)
+- `src/main/ai/visionPrompt.ts`: `buildVisionPrompt(text, todayKst, dueCandidates, vocab)` (pure)
+- `src/renderer/inbox/components/settings/VisionSection.tsx`: lives inside the AI provider section or as a separate sub-section. Dropdown + "다시 감지" (re-detect) button
+- `tests/unit/VisionDetect.test.ts`: isVisionCapable 5 + refreshVisionCache 4
+- `tests/unit/visionPrompt.test.ts`: buildVisionPrompt 2 (text present / empty-text placeholder)
+- `tests/unit/AiWorker.vision.test.ts`: vision path 3 (vision body / text-only / 5MB cap)
+- `tests/unit/vision-ipc.test.ts`: IPC handlers 3-5 (created in Task 6)
+- `tests/unit/VisionSection.test.tsx`: UI 1 (dropdown + re-detect)
+
+**Modify:**
+
+- `src/main/services/SettingsService.ts`: zod schema gains vision_model / vision_capable_cache / vision_cache_at, plus 4 methods
+- `src/main/ai/InferenceProvider.ts`: `GenerateInput.images?: Array<{ base64: string; mime: string }>` + `generate(input, opts?: { visionModel?: string | null })`
+- `src/main/ai/LocalOllamaProvider.ts`: `images` field in the `generate` body (vision path) + model branching
+- `src/main/ai/AiWorker.ts`: vision path on `note.media + visionModel`, with 5MB cap and base64 encoding. Constructor gains a `settings: SettingsService` dependency (and `mediaStore: MediaStore` if it is not already present)
+- `src/main/ipc/settingsApi.ts`: 3 IPC channels: `settings:get-vision-models` / `settings:set-vision-model` / `settings:refresh-vision-cache`
+- `src/preload/index.ts`: 3 bridges
+- `src/shared/types.ts`: 3 vision_* fields on the `getSettings()` return type + 3 InboxApi methods
+- `src/main/index.ts`: `void refreshVisionCache(...)` inside whenReady + inject settings into the AiWorker constructor
+- `src/renderer/inbox/components/settings/AiProviderSection.tsx` or SettingsPage: mount VisionSection
+- `tests/unit/SettingsService.test.ts`: round-trip of the 4 vision methods
+- `tests/unit/LocalOllamaProvider.test.ts`: regression for the vision body branch
+- `tests/unit/AiWorker.test.ts`: add a settings stub to the existing mocks (constructor change)
+- `package.json`: version 0.3.0 → 0.3.1
+- `docs/superpowers/specs/2026-04-25-dogfood-feedback.md`: mark F24 promoted
+
+---
+
+## Unit test target
+
+679 (v0.3.0) → about 701 (+22), 0 typecheck errors.
+
+---
+
+## Task 1: VisionDetect — `isVisionCapable` + `refreshVisionCache`
+
+**Files:**
+
+- Create: `src/main/services/VisionDetect.ts`
+- Create: `tests/unit/VisionDetect.test.ts`
+
+`isVisionCapable(model)` is a pure function that decides from family/families/name hints. `refreshVisionCache(deps)` is async: it calls `/api/tags`, extracts the capable models, and stores them in the settings cache. fetch is injectable for tests.
+
+- [ ] **Step 1: failing test** — `tests/unit/VisionDetect.test.ts`:
+
+```ts
+import { describe, it, expect, vi } from 'vitest';
+import { isVisionCapable, refreshVisionCache } from '../../src/main/services/VisionDetect.js';
+
+describe('isVisionCapable', () => {
+  it('family=gemma3 → true', () => {
+    expect(isVisionCapable({ name: 'gemma3:12b', details: { family: 'gemma3' } })).toBe(true);
+  });
+  it('families=[llava] → true', () => {
+    expect(isVisionCapable({ name: 'llava-13b', details: { families: ['llava'] } })).toBe(true);
+  });
+  it('name hint "vision" → true', () => {
+    expect(isVisionCapable({ name: 'custom-vision-7b' })).toBe(true);
+  });
+  it('text-only family=gemma → false', () => {
+    expect(isVisionCapable({ name: 'gemma4:e4b', details: { family: 'gemma' } })).toBe(false);
+  });
+  it('no hints + unknown family → false', () => {
+    expect(isVisionCapable({ name: 'mistral:7b', details: { family: 'mistral' } })).toBe(false);
+  });
+});
+
+describe('refreshVisionCache', () => {
+  it('happy path — capable 추출 + settings cache 저장', async () => {
+    const settings = {
+      isAiEnabled: vi.fn(async () => true),
+      setVisionCapableCache: vi.fn(async () => {})
+    };
+    const fetchImpl = vi.fn(async () => ({
+      ok: true,
+      status: 200,
+      json: async () => ({
+        models: [
+          { name: 'gemma4:e4b', details: { family: 'gemma' } },
+          { name: 'gemma3:12b-vision', details: { family: 'gemma3' } },
+          { name: 'llava:13b', details: { families: ['llava'] } }
+        ]
+      })
+    })) as unknown as typeof fetch;
+    const r = await refreshVisionCache({
+      settings: settings as never,
+      endpoint: 'http://localhost:11434',
+      fetchImpl
+    });
+    expect(r).toEqual({ ok: true, models: ['gemma3:12b-vision', 'llava:13b'] });
+    expect(settings.setVisionCapableCache).toHaveBeenCalledWith(['gemma3:12b-vision', 'llava:13b'], expect.any(Date));
+  });
+
+  it('ai_disabled → 스킵', async () => {
+    const settings = {
+      isAiEnabled: vi.fn(async () => false),
+      setVisionCapableCache: vi.fn(async () => {})
+    };
+    const r = await refreshVisionCache({ settings: settings as never, endpoint: 'http://x' });
+    expect(r).toEqual({ ok: false, reason: 'ai_disabled' });
+    expect(settings.setVisionCapableCache).not.toHaveBeenCalled();
+  });
+
+  it('http error → ok:false', async () => {
+    const settings = {
+      isAiEnabled: vi.fn(async () => true),
+      setVisionCapableCache: vi.fn(async () => {})
+    };
+    const fetchImpl = vi.fn(async () => ({
+      ok: false,
+      status: 500,
+      json: async () => ({})
+    })) as unknown as typeof fetch;
+    const r = await refreshVisionCache({ settings: settings as never, endpoint: 'http://x', fetchImpl });
+    expect(r).toMatchObject({ ok: false });
+    expect(settings.setVisionCapableCache).not.toHaveBeenCalled();
+  });
+
+  it('unreachable → ok:false', async () => {
+    const settings = {
+      isAiEnabled: vi.fn(async () => true),
+      setVisionCapableCache: vi.fn(async () => {})
+    };
+    const fetchImpl = vi.fn(async () => { throw new Error('ECONNREFUSED'); }) as unknown as typeof fetch;
+    const r = await refreshVisionCache({ settings: settings as never, endpoint: 'http://x', fetchImpl });
+    expect(r).toMatchObject({ ok: false });
+  });
+});
+```
+
+- [ ] **Step 2: implementation** — `src/main/services/VisionDetect.ts`:
+
+```ts
+import type { SettingsService } from './SettingsService.js';
+
+const VISION_FAMILIES = new Set(['gemma3', 'llava', 'llama3.2-vision', 'minicpm-v', 'pixtral']);
+const VISION_NAME_HINTS = ['vision', 'vl', 'multimodal', 'gemma3'];
+
+export interface OllamaModel {
+  name: string;
+  details?: { family?: string; families?: string[] };
+}
+
+export function isVisionCapable(model: OllamaModel): boolean {
+  if (model.details?.family && VISION_FAMILIES.has(model.details.family)) return true;
+  if (model.details?.families?.some((f) => VISION_FAMILIES.has(f))) return true;
+  const lower = model.name.toLowerCase();
+  return VISION_NAME_HINTS.some((h) => lower.includes(h));
+}
+
+export interface RefreshDeps {
+  settings: SettingsService;
+  endpoint: string;
+  now?: () => Date;
+  fetchImpl?: typeof fetch;
+}
+
+export async function refreshVisionCache(
+  deps: RefreshDeps
+): Promise<{ ok: true; models: string[] } | { ok: false; reason: string }> {
+  if (!(await deps.settings.isAiEnabled())) {
+    return { ok: false, reason: 'ai_disabled' };
+  }
+  const fetchFn = deps.fetchImpl ?? fetch;
+  let body: { models?: OllamaModel[] };
+  try {
+    const r = await fetchFn(`${deps.endpoint}/api/tags`);
+    if (!r.ok) return { ok: false, reason: `tags http ${r.status}` };
+    body = (await r.json()) as { models?: OllamaModel[] };
+  } catch (e) {
+    return { ok: false, reason: `unreachable: ${(e as Error).message}` };
+  }
+  const capable = (body.models ?? []).filter(isVisionCapable).map((m) => m.name);
+  const now = deps.now ? deps.now() : new Date();
+  await deps.settings.setVisionCapableCache(capable, now);
+  return { ok: true, models: capable };
+}
+```
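+
+The name-hint check above uses substring matching, so a short hint such as `vl` can also match unrelated model names. As one possible complement (not part of this plan), the sketch below asks Ollama's `/api/show` endpoint for a model's `capabilities` list. It assumes an Ollama build whose `/api/show` response includes a `capabilities` array and accepts a `model` request field; when the field is missing the helpers return `null` so a caller could fall back to `isVisionCapable`. The names `hasVisionCapability`, `showReportsVision`, and `FetchLike` are hypothetical.
+
+```ts
+// Hypothetical helpers, not part of the plan.
+type FetchLike = (
+  url: string,
+  init?: { method?: string; headers?: Record<string, string>; body?: string }
+) => Promise<{ ok: boolean; json(): Promise<unknown> }>;
+
+// Pure part: true/false when the body carries a capabilities array, null when unknown.
+export function hasVisionCapability(showBody: unknown): boolean | null {
+  if (typeof showBody !== 'object' || showBody === null) return null;
+  const caps = (showBody as { capabilities?: unknown }).capabilities;
+  if (!Array.isArray(caps)) return null; // assumption: older servers may omit the field
+  return caps.includes('vision');
+}
+
+// Async part: POST /api/show and interpret the response. Never throws.
+export async function showReportsVision(
+  endpoint: string,
+  modelName: string,
+  fetchImpl: FetchLike = fetch as unknown as FetchLike
+): Promise<boolean | null> {
+  try {
+    const r = await fetchImpl(`${endpoint}/api/show`, {
+      method: 'POST',
+      headers: { 'content-type': 'application/json' },
+      body: JSON.stringify({ model: modelName })
+    });
+    if (!r.ok) return null;
+    return hasVisionCapability(await r.json());
+  } catch {
+    return null; // server unreachable: capability unknown
+  }
+}
+```
+
+Returning `null` rather than `false` for the unknown case keeps the heuristic in `isVisionCapable` usable as a fallback instead of silently hiding models from the dropdown.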
+
+- [ ] **Step 3: PASS + commit**
+
+```bash
+npm run typecheck
+npx vitest run tests/unit/VisionDetect.test.ts
+git add src/main/services/VisionDetect.ts tests/unit/VisionDetect.test.ts
+git commit -m "feat(v031): VisionDetect: isVisionCapable + refreshVisionCache (injectable fetch)"
+```
+
+---
+
+## Task 2: SettingsService (vision_model / vision_capable_cache + 4 methods)
+
+**Files:**
+
+- Modify: `src/main/services/SettingsService.ts`
+- Modify: `tests/unit/SettingsService.test.ts`
+
+Extend the zod schema and add 4 methods (same pattern as the Cut E sync_* settings).
+
+- [ ] **Step 1: extend the zod schema** in `src/main/services/SettingsService.ts`:
+
+```ts
+const SettingsSchema = z.object({
+  ollama: OllamaSettingsSchema.optional(),
+  ai_enabled: z.boolean().optional(),
+  onboarding_completed: z.boolean().optional(),
+  sync_repo_url: z.string().nullable().optional(),
+  sync_auto_enabled: z.boolean().optional(),
+  sync_interval_min: z.number().int().min(5).optional(),
+  // v0.3.1 Cut F: vision model (image analysis). null or absent = disabled.
+  vision_model: z.string().nullable().optional(),
+  vision_capable_cache: z.array(z.string()).optional(),
+  vision_cache_at: z.string().optional()
+}).strict();
+```
+
+- [ ] **Step 2: add the 4 methods** (after `setSyncIntervalMin`):
+
+```ts
+async getVisionModel(): Promise<string | null> {
+  const s = await this.load();
+  return s.vision_model ?? null;
+}
+
+async setVisionModel(value: string | null): Promise<void> {
+  const current = await this.load();
+  const next: Settings = { ...current, vision_model: value };
+  await this.persist(next);
+}
+
+async getVisionCapableCache(): Promise<{ models: string[]; at: string | null }> {
+  const s = await this.load();
+  return { models: s.vision_capable_cache ?? [], at: s.vision_cache_at ?? null };
+}
+
+async setVisionCapableCache(models: string[], now: Date): Promise<void> {
+  const current = await this.load();
+  const next: Settings = { ...current, vision_capable_cache: models, vision_cache_at: now.toISOString() };
+  await this.persist(next);
+}
+```
+
+- [ ] **Step 3: failing test**: append after the last describe (Cut E sync) in `tests/unit/SettingsService.test.ts`:
+
+```ts
+describe('v0.3.1 Cut F — vision settings', () => {
+  it('getVisionModel() 기본 null', async () => {
+    expect(await svc.getVisionModel()).toBeNull();
+  });
+
+  it('setVisionModel / getVisionModel round-trip + null clear', async () => {
+    await svc.setVisionModel('gemma3:12b-vision');
+    expect(await svc.getVisionModel()).toBe('gemma3:12b-vision');
+    await svc.setVisionModel(null);
+    expect(await svc.getVisionModel()).toBeNull();
+  });
+
+  it('getVisionCapableCache() 기본 빈 배열 + null at', async () => {
+    expect(await svc.getVisionCapableCache()).toEqual({ models: [], at: null });
+  });
+
+  it('setVisionCapableCache 저장 + at ISO', async () => {
+    const at = new Date('2026-05-10T05:00:00Z');
+    await svc.setVisionCapableCache(['gemma3:12b', 'llava:13b'], at);
+    const r = await svc.getVisionCapableCache();
+    expect(r.models).toEqual(['gemma3:12b', 'llava:13b']);
+    expect(r.at).toBe('2026-05-10T05:00:00.000Z');
+  });
+});
+```
+
+- [ ] **Step 4: PASS + commit**
+
+```bash
+npm run typecheck
+npx vitest run tests/unit/SettingsService.test.ts
+git add src/main/services/SettingsService.ts tests/unit/SettingsService.test.ts
+git commit -m "feat(v031): SettingsService.{getVisionModel,setVisionModel,getVisionCapableCache,setVisionCapableCache}"
+```
+
+---
+
+## Task 3: visionPrompt + InferenceProvider interface extension
+
+**Files:**
+
+- Create: `src/main/ai/visionPrompt.ts`
+- Modify: `src/main/ai/InferenceProvider.ts`
+- Create: `tests/unit/visionPrompt.test.ts`
+
+`buildVisionPrompt(text, todayKst, dueCandidates, vocab)` is pure and covers the image + raw_text scenario. Empty text is handled too, via the "(이미지만 있음)" ("image only") placeholder.
+
+- [ ] **Step 1: failing test** — `tests/unit/visionPrompt.test.ts`:
+
+```ts
+import { describe, it, expect } from 'vitest';
+import { buildVisionPrompt } from '../../src/main/ai/visionPrompt.js';
+
+describe('buildVisionPrompt', () => {
+  it('text + 이미지 시 메모 본문 포함', () => {
+    const r = buildVisionPrompt('회의 메모', '2026-05-10', ['2026-05-10'], ['회의']);
+    expect(r).toContain('회의 메모');
+    expect(r).toContain('2026-05-10');
+    expect(r).toContain('회의');
+  });
+
+  it('빈 text → "(이미지만 있음)" placeholder', () => {
+    const r = buildVisionPrompt('', '2026-05-10', [], []);
+    expect(r).toContain('(이미지만 있음)');
+  });
+});
+```
+
+- [ ] **Step 2: implementation** — `src/main/ai/visionPrompt.ts`:
+
+```ts
+/**
+ * v0.3.1 Cut F: multimodal vision prompt. Asks the model to analyze the image together
+ * with raw_text and to reply with title/summary/tags/due_date JSON. Handles empty raw_text.
+ */
+export function buildVisionPrompt(
+  text: string,
+  todayKst: string,
+  dueCandidates: string[],
+  vocab: string[]
+): string {
+  return `다음 메모와 첨부 이미지를 종합 분석해 한국어로 요약하세요.
+
+메모 본문 (비어 있을 수 있음):
+${text || '(이미지만 있음)'}
+
+이미지 분석 시 주요 시각적 정보 (텍스트, 사람, 장면) 도 포함해 요약하세요.
+출력 JSON: { "title": "...", "summary": "...", "tags": [...], "due_date": "..." }
+오늘: ${todayKst}
+가능한 due 후보: ${dueCandidates.join(', ')}
+빈출 태그: ${vocab.slice(0, 20).join(', ')}`;
+}
+```
+
+- [ ] **Step 3: extend the InferenceProvider interface** in `src/main/ai/InferenceProvider.ts`:
+
+```ts
+export interface GenerateInput {
+  text: string;
+  todayKst: string;
+  dueDateCandidates: string[];
+  vocab?: string[];
+  // v0.3.1 Cut F: multimodal vision (optional). LocalOllamaProvider handles it together with visionModel.
+  images?: Array<{ base64: string; mime: string }>;
+}
+
+export interface GenerateOptions {
+  visionModel?: string | null;
+}
+
+export interface InferenceProvider {
+  generate(input: GenerateInput, opts?: GenerateOptions): Promise<AiResponse>;
+  // ... existing abort / generateRaw
+}
+```
+
+(Existing callers pass no `opts`, so they stay compatible and the vision path stays off.)
+
+- [ ] **Step 4: PASS + commit**
+
+```bash
+npm run typecheck
+npx vitest run tests/unit/visionPrompt.test.ts
+git add src/main/ai/visionPrompt.ts src/main/ai/InferenceProvider.ts tests/unit/visionPrompt.test.ts
+git commit -m "feat(v031): buildVisionPrompt + GenerateInput.images + GenerateOptions.visionModel"
+```
+
+---
+
+## Task 4: LocalOllamaProvider — vision path
+
+**Files:**
+
+- Modify: `src/main/ai/LocalOllamaProvider.ts`
+- Modify: `tests/unit/LocalOllamaProvider.test.ts`
+
+`generate(input, opts)` builds the vision body when both `opts.visionModel` and `input.images` are present (model = visionModel, prompt = buildVisionPrompt, body.images = array of base64 strings). Otherwise it uses the existing text-only path.
+
+- [ ] **Step 1: failing test**: inside a suitable describe in the existing `LocalOllamaProvider.test.ts`:
+
+```ts
+describe('vision path (v0.3.1 Cut F)', () => {
+  it('opts.visionModel + input.images 둘 다 있으면 vision body', async () => {
+    let captured: { model?: string; prompt?: string; images?: string[] } = {};
+    const undici = await import('undici');
+    const requestSpy = vi.spyOn(undici, 'request').mockImplementation(async (_url, init) => {
+      captured = JSON.parse(init?.body as string);
+      return {
+        statusCode: 200,
+        body: { json: async () => ({ response: '{"title":"t","summary":"s","tags":[],"due_date":null}' }) }
+      } as never;
+    });
+    const provider = new LocalOllamaProvider({ endpoint: 'http://x', model: 'gemma4:e4b' });
+    await provider.generate(
+      { text: 'hi', todayKst: '2026-05-10', dueDateCandidates: [], images: [{ base64: 'AAAA', mime: 'image/png' }] },
+      { visionModel: 'gemma3:12b-vision' }
+    );
+    expect(captured.model).toBe('gemma3:12b-vision');
+    expect(captured.prompt).toContain('이미지');
+    expect(captured.images).toEqual(['AAAA']);
+    requestSpy.mockRestore();
+  });
+
+  it('visionModel 있어도 images 없으면 text-only path', async () => {
+    let captured: { model?: string; images?: unknown } = {};
+    const undici = await import('undici');
+    const requestSpy = vi.spyOn(undici, 'request').mockImplementation(async (_url, init) => {
+      captured = JSON.parse(init?.body as string);
+      return {
+        statusCode: 200,
+        body: { json: async () => ({ response: '{"title":"t","summary":"s","tags":[],"due_date":null}' }) }
+      } as never;
+    });
+    const provider = new LocalOllamaProvider({ endpoint: 'http://x', model: 'gemma4:e4b' });
+    await provider.generate(
+      { text: 'hi', todayKst: '2026-05-10', dueDateCandidates: [] },
+      { visionModel: 'gemma3:12b-vision' }
+    );
+    expect(captured.model).toBe('gemma4:e4b');
+    expect(captured.images).toBeUndefined();
+    requestSpy.mockRestore();
+  });
+
+  it('opts 미전달 → 기존 text-only (회귀)', async () => {
+    let captured: { model?: string; images?: unknown } = {};
+    const undici = await import('undici');
+    const requestSpy = vi.spyOn(undici, 'request').mockImplementation(async (_url, init) => {
+      captured = JSON.parse(init?.body as string);
+      return {
+        statusCode: 200,
+        body: { json: async () => ({ response: '{"title":"t","summary":"s","tags":[],"due_date":null}' }) }
+      } as never;
+    });
+    const provider = new LocalOllamaProvider({ endpoint: 'http://x', model: 'gemma4:e4b' });
+    await provider.generate({ text: 'hi', todayKst: '2026-05-10', dueDateCandidates: [] });
+    expect(captured.model).toBe('gemma4:e4b');
+    expect(captured.images).toBeUndefined();
+    requestSpy.mockRestore();
+  });
+});
+```
+
+(Follow the mock pattern already used in LocalOllamaProvider.test.ts. Reuse the test file's existing imports and vi.mock as they are.)
+
+- [ ] **Step 2: implementation**: branch the `LocalOllamaProvider.generate` body:
+
+```ts
+import { buildVisionPrompt } from './visionPrompt.js';
+// ...
+
+async generate(input: GenerateInput, opts?: GenerateOptions): Promise<AiResponse> {
+  const useVision = !!opts?.visionModel && (input.images?.length ?? 0) > 0;
+  const model = useVision ? opts!.visionModel! : this.model;
+  const prompt = useVision
+    ? buildVisionPrompt(input.text, input.todayKst, input.dueDateCandidates, input.vocab ?? [])
+    : buildPrompt(input.text, input.todayKst, input.dueDateCandidates, input.vocab ?? []);
+
+  this.abortController = new AbortController();
+  const timer = setTimeout(() => this.abortController?.abort(), this.timeoutMs);
+  try {
+    const body: Record<string, unknown> = {
+      model,
+      prompt,
+      format: 'json',
+      stream: false,
+      options: { temperature: this.temperature, num_predict: this.numPredict }
+    };
+    if (useVision) {
+      body.images = input.images!.map((i) => i.base64);
+    }
+    const res = await request(`${this.endpoint}/api/generate`, {
+      method: 'POST',
+      headers: { 'content-type': 'application/json' },
+      body: JSON.stringify(body),
+      signal: this.abortController.signal
+    });
+    // ... existing parse
+  } finally {
+    // ...
+  }
+}
+```
+
+- [ ] **Step 3: PASS + commit**
+
+```bash
+npm run typecheck
+npx vitest run tests/unit/LocalOllamaProvider.test.ts
+git add src/main/ai/LocalOllamaProvider.ts tests/unit/LocalOllamaProvider.test.ts
+git commit -m "feat(v031): LocalOllamaProvider vision path (visionModel + images → body.images base64)"
+```
+
+---
+
+## Task 5: AiWorker (vision integration + 5MB cap + settings dependency)
+
+**Files:**
+
+- Modify: `src/main/ai/AiWorker.ts`
+- Modify: `tests/unit/AiWorker.test.ts`
+- Create: `tests/unit/AiWorker.vision.test.ts`
+
+Under the `note.media + visionModel` condition, AiWorker base64-encodes the images (5MB cap per image) and passes images + visionModel to provider.generate. The constructor gains a `settings: SettingsService` dependency.
+
+- [ ] **Step 1: change the AiWorker constructor**: add a settings parameter. Update the instance creation in `src/main/index.ts` too.
+
+- [ ] **Step 2: update AiWorker.processJob**:
+
+```ts
+import { readFile } from 'node:fs/promises';
+// inside the class, right before the generate call:
+const visionModel = await this.settings.getVisionModel();
+let images: Array<{ base64: string; mime: string }> | undefined;
+if (visionModel && note.media.length > 0) {
+  images = await Promise.all(
+    note.media.map(async (m) => {
+      const buf = await readFile(this.mediaStore.absolutePath(m.relPath));
+      if (buf.byteLength > 5 * 1024 * 1024) {
+        throw new Error(`image ${m.relPath} exceeds 5MB cap`);
+      }
+      return { base64: buf.toString('base64'), mime: m.mime };
+    })
+  );
+}
+const res = await this.holder.get().generate(
+  { text: note.rawText, images, todayKst: todayIso, dueDateCandidates: candidates, vocab },
+  { visionModel: visionModel ?? undefined }
+);
+```
+
+`mediaStore: MediaStore` is also a new AiWorker constructor parameter (add it if it is not there yet; injected from main).
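+
+The snippet in Step 2 reads each file fully and only then checks its size, so an oversized image is already in memory when it is rejected. A minimal sketch of an alternative order, assuming a `stat` call before the read is acceptable here; `encodeImageWithCap`, `base64Length`, and `IMAGE_CAP_BYTES` are hypothetical names, not part of the plan. `base64Length` also makes the encoding overhead explicit.
+
+```ts
+import { stat, readFile } from 'node:fs/promises';
+
+export const IMAGE_CAP_BYTES = 5 * 1024 * 1024;
+
+// Length of the padded base64 string for n raw bytes: 4 characters per 3-byte group.
+export function base64Length(rawBytes: number): number {
+  return 4 * Math.ceil(rawBytes / 3);
+}
+
+// Hypothetical helper: rejects by file size before reading the contents.
+export async function encodeImageWithCap(
+  absPath: string,
+  capBytes: number = IMAGE_CAP_BYTES
+): Promise<string> {
+  const { size } = await stat(absPath);
+  if (size > capBytes) {
+    throw new Error(`image ${absPath} exceeds ${capBytes} byte cap`);
+  }
+  const buf = await readFile(absPath);
+  return buf.toString('base64');
+}
+```
+
+An image exactly at the 5MB cap encodes to 6,990,508 base64 characters, about one third more than the raw bytes, which is the figure to keep in mind when several images are attached to one note.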
+
+- [ ] **Step 3: failing test** — `tests/unit/AiWorker.vision.test.ts`:
+
+```ts
+import { describe, it, expect, beforeEach, afterEach, vi } from 'vitest';
+import { writeFile, mkdtemp, mkdir, rm } from 'node:fs/promises';
+import { tmpdir } from 'node:os';
+import { join } from 'node:path';
+import Database from 'better-sqlite3';
+import { runMigrations } from '../../src/main/db/migrations/index.js';
+import { NoteRepository } from '../../src/main/repository/NoteRepository.js';
+import { AiWorker } from '../../src/main/ai/AiWorker.js';
+import { MediaStore } from '../../src/main/services/MediaStore.js';
+
+describe('AiWorker — vision path (v0.3.1 Cut F)', () => {
+  let db: Database.Database;
+  let repo: NoteRepository;
+  let workDir: string;
+  let mediaStore: MediaStore;
+
+  beforeEach(async () => {
+    db = new Database(':memory:');
+    db.pragma('foreign_keys = ON');
+    runMigrations(db);
+    repo = new NoteRepository(db);
+    workDir = await mkdtemp(join(tmpdir(), 'inkling-vision-'));
+    mediaStore = new MediaStore(workDir);
+  });
+
+  afterEach(async () => {
+    db.close();
+    await rm(workDir, { recursive: true, force: true });
+  });
+
+  it('visionModel + media 있음 → provider.generate 가 images + opts 받음', async () => {
+    const { id } = repo.create({ rawText: '이미지 메모' });
+    const mediaPath = join(workDir, 'media', id, '1.png');
+    await mkdir(join(workDir, 'media', id), { recursive: true });
+    await writeFile(mediaPath, Buffer.from([0x89, 0x50, 0x4e, 0x47]));  // 4 bytes PNG-ish
+    repo.insertMedia([{ noteId: id, kind: 'image', relPath: `media/${id}/1.png`, mime: 'image/png', bytes: 4 }]);
+
+    const generate = vi.fn(async () => ({ title: 't', summary: 's', tags: [], dueDate: null }));
+    const provider = { name: 'fake', generate, abort: () => {} };
+    const settings = {
+      getVisionModel: vi.fn(async () => 'gemma3:12b-vision'),
+      isAiEnabled: vi.fn(async () => true)
+    } as unknown as never;
+    const worker = new AiWorker(/* ...deps with settings + mediaStore + repo + holder = { get: () => provider } */);
+    await worker['processJob']({ noteId: id, attempts: 0, nextRunAt: '' });
+
+    expect(generate).toHaveBeenCalledWith(
+      expect.objectContaining({ images: expect.any(Array) }),
+      expect.objectContaining({ visionModel: 'gemma3:12b-vision' })
+    );
+    const callArg = generate.mock.calls[0]![0] as { images: Array<{ base64: string; mime: string }> };
+    expect(callArg.images).toHaveLength(1);
+    expect(callArg.images[0]!.mime).toBe('image/png');
+  });
+
+  it('visionModel 없으면 text-only (회귀)', async () => {
+    const { id } = repo.create({ rawText: 'just text' });
+    const generate = vi.fn(async () => ({ title: 't', summary: 's', tags: [], dueDate: null }));
+    const provider = { name: 'fake', generate, abort: () => {} };
+    const settings = {
+      getVisionModel: vi.fn(async () => null),
+      isAiEnabled: vi.fn(async () => true)
+    } as unknown as never;
+    const worker = new AiWorker(/* ... */);
+    await worker['processJob']({ noteId: id, attempts: 0, nextRunAt: '' });
+    expect(generate).toHaveBeenCalledWith(
+      expect.not.objectContaining({ images: expect.anything() }),
+      expect.any(Object)
+    );
+  });
+
+  it('5MB 초과 이미지 → throw → ai_status=failed', async () => {
+    const { id } = repo.create({ rawText: 'big image' });
+    const mediaPath = join(workDir, 'media', id, '1.png');
+    await mkdir(join(workDir, 'media', id), { recursive: true });
+    await writeFile(mediaPath, Buffer.alloc(6 * 1024 * 1024));  // 6 MB
+    repo.insertMedia([{ noteId: id, kind: 'image', relPath: `media/${id}/1.png`, mime: 'image/png', bytes: 6 * 1024 * 1024 }]);
+
+    const generate = vi.fn(async () => ({ title: 't', summary: 's', tags: [], dueDate: null }));
+    const provider = { name: 'fake', generate, abort: () => {} };
+    const settings = {
+      getVisionModel: vi.fn(async () => 'gemma3:12b-vision'),
+      isAiEnabled: vi.fn(async () => true)
+    } as unknown as never;
+    const worker = new AiWorker(/* ... */);
+    await worker['processJob']({ noteId: id, attempts: 0, nextRunAt: '' });
+    // Exceeding the 5MB cap throws before the provider call → AiWorker's attempts-increment branch → ai_status='failed'
+    expect(generate).not.toHaveBeenCalled();
+    const note = repo.findById(id);
+    expect(['failed', 'pending']).toContain(note!.aiStatus);  // 'failed' once attempts are exhausted; may stay 'pending' after a first-attempt throw (implementation-dependent)
+  });
+});
+```
+
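+(NOTE: for the exact AiWorker constructor arguments, build the full deps stub following the setup pattern of the existing tests. The code above is an outline; the implementer fills in the exact deps structure by referring to the setup in the existing `AiWorker.test.ts`.)
+
+- [ ] **Step 4: update the existing AiWorker.test.ts mocks**: the constructor now takes `settings` / `mediaStore`. Add stubs at every worker construction site in the existing tests.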
+
+- [ ] **Step 5: PASS + commit**
+
+```bash
+npm run typecheck
+npx vitest run tests/unit/AiWorker.test.ts tests/unit/AiWorker.vision.test.ts
+git add src/main/ai/AiWorker.ts \
+        src/main/index.ts \
+        tests/unit/AiWorker.test.ts \
+        tests/unit/AiWorker.vision.test.ts
+git commit -m "feat(v031): AiWorker vision integration — note.media + visionModel + 5MB cap"
+```
+
+---
+
+## Task 6: types + IPC + preload
+
+**Files:**
+
+- Modify: `src/shared/types.ts` (vision_model / vision_capable_cache / vision_cache_at on the `getSettings()` return type + 3 InboxApi methods)
+- Modify: `src/main/ipc/settingsApi.ts` (3 IPC handlers)
+- Modify: `src/preload/index.ts` (3 bridges)
+- Create: `tests/unit/vision-ipc.test.ts`
+
+3 channels:
+- `settings:get-vision-models` → `{ models: string[]; at: string | null; selected: string | null }` (cached result + current selection)
+- `settings:set-vision-model` (value: string | null) → `{ ok: true }`
+- `settings:refresh-vision-cache` → `{ ok: true; models: string[] } | { ok: false; reason: string }` (calls refreshVisionCache)
+
+The detailed pattern is the same as the Cut E sync IPC.
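+
+The three channels listed above could be wired roughly as follows. This is a sketch under assumptions: `IpcLike` stands in for Electron's `ipcMain`, `VisionSettings` for the relevant slice of SettingsService, and `registerVisionIpc` is a hypothetical name; the real handlers should follow whatever shape the Cut E sync IPC code uses.
+
+```ts
+// Minimal structural types so the sketch stands alone (assumptions, see lead-in).
+interface IpcLike {
+  handle(channel: string, listener: (event: unknown, ...args: any[]) => unknown): void;
+}
+interface VisionSettings {
+  getVisionModel(): Promise<string | null>;
+  setVisionModel(value: string | null): Promise<void>;
+  getVisionCapableCache(): Promise<{ models: string[]; at: string | null }>;
+}
+type RefreshResult = { ok: true; models: string[] } | { ok: false; reason: string };
+
+// `refresh` is expected to wrap refreshVisionCache with its deps already bound.
+export function registerVisionIpc(
+  ipc: IpcLike,
+  settings: VisionSettings,
+  refresh: () => Promise<RefreshResult>
+): void {
+  ipc.handle('settings:get-vision-models', async () => {
+    const [cache, selected] = await Promise.all([
+      settings.getVisionCapableCache(),
+      settings.getVisionModel()
+    ]);
+    return { models: cache.models, at: cache.at, selected };
+  });
+  ipc.handle('settings:set-vision-model', async (_event, value: string | null) => {
+    await settings.setVisionModel(value);
+    return { ok: true as const };
+  });
+  ipc.handle('settings:refresh-vision-cache', () => refresh());
+}
+```
+
+Passing `ipc` and `refresh` in as parameters keeps the handlers testable with plain stubs, without loading Electron in the unit test.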
+
+- [ ] **Step 1: types** + **Step 2: failing test** + **Step 3: handlers** + **Step 4: preload bridges**: follow the Cut E sync-ipc pattern as is
+
+- [ ] **Step 5: PASS + commit**
+
+```bash
+git add src/shared/types.ts src/main/ipc/settingsApi.ts src/preload/index.ts tests/unit/vision-ipc.test.ts
+git commit -m "feat(v031): vision IPC + preload (get-vision-models / set / refresh)"
+```
+
+---
+
+## Task 7: VisionSection UI + integration into the AI provider section
+
+**Files:**
+
+- Create: `src/renderer/inbox/components/settings/VisionSection.tsx`
+- Modify: `src/renderer/inbox/components/settings/AiProviderSection.tsx` or SettingsPage (mount point)
+- Create: `tests/unit/VisionSection.test.tsx`
+
+Dropdown (cache-backed) + re-detect button + last detection time. Changing the dropdown calls `setVisionModel`. Re-detect calls the `refreshVisionCache` IPC and then refreshes the dropdown.
+
+```tsx
+// core structure (Cut E SyncSection pattern)
+const [models, setModels] = useState<string[]>([]);
+const [at, setAt] = useState<string | null>(null);
+const [selected, setSelected] = useState<string | null>(null);
+const [busy, setBusy] = useState<'select' | 'refresh' | null>(null);
+
+useEffect(() => {
+  void (async () => {
+    const r = await inboxApi.getVisionModels();
+    setModels(r.models);
+    setAt(r.at);
+    setSelected(r.selected);
+  })();
+}, []);
+
+async function onSelect(value: string) {
+  const next = value === '' ? null : value;
+  setBusy('select');
+  try {
+    await inboxApi.setVisionModel(next);
+    setSelected(next);
+  } finally {
+    setBusy(null); // clear the busy flag even if the IPC call rejects
+  }
+}
+
+async function onRefresh() {
+  setBusy('refresh');
+  try {
+    const r = await inboxApi.refreshVisionCache();
+    if (r.ok) {
+      const cache = await inboxApi.getVisionModels();
+      setModels(cache.models);
+      setAt(cache.at);
+    }
+  } finally {
+    setBusy(null); // otherwise the button would stay disabled after a failure
+  }
+}
+```
+
+UI:
+
+```tsx
+<select value={selected ?? ''} onChange={(e) => void onSelect(e.target.value)} aria-label="이미지 분석 모델">
+  <option value="">(비활성)</option>
+  {models.map((m) => <option key={m} value={m}>{m}</option>)}
+</select>
+<button onClick={() => void onRefresh()} disabled={busy === 'refresh'}>
+  {busy === 'refresh' ? '감지 중…' : '다시 감지'}
+</button>
+{at !== null && <span>마지막 감지: {new Date(at).toLocaleString('ko-KR')}</span>}
+```
+
+- [ ] **Step 1-5: component + test + mount + commit**
+
+```bash
+git add src/renderer/inbox/components/settings/VisionSection.tsx \
+        src/renderer/inbox/components/settings/AiProviderSection.tsx \
+        tests/unit/VisionSection.test.tsx
+git commit -m "feat(v031): VisionSection: dropdown + re-detect + last detection time"
+```
+
+---
+
+## Task 8: main process (automatic refreshVisionCache call + AiWorker settings injection)
+
+**Files:**
+
+- Modify: `src/main/index.ts`
+
+Inside `whenReady` (after the Ollama provider is ready), call `void refreshVisionCache(...)` fire-and-forget. Inject settings + mediaStore into the AiWorker constructor.
+
+- [ ] **Step 1: imports + call** in `src/main/index.ts`:
+
+```ts
+import { refreshVisionCache } from './services/VisionDetect.js';
+
+// inside whenReady, right after or right before AiWorker.start()
+const ollama = providerHolder.get();
+void refreshVisionCache({
+  settings: settingsSvc,
+  endpoint: (ollama as LocalOllamaProvider).endpoint,  // or read it from the ollama settings in SettingsService
+}).catch(() => {});
+```
+
+(If the LocalOllamaProvider endpoint is private, read it from settings or add a getter to the provider.)
+
+- [ ] **Step 2: update the AiWorker constructor arguments**
+
+- [ ] **Step 3: typecheck + PASS + commit**
+
+```bash
+npm run typecheck
+npx vitest run
+git add src/main/index.ts
+git commit -m "feat(v031): main: refreshVisionCache on whenReady + inject settings/mediaStore into AiWorker"
+```
+
+---
+
+## Task 9: dogfood promoted + version bump + release commit
+
+- [ ] Mark F24 as promoted (`docs/superpowers/specs/2026-04-25-dogfood-feedback.md`):
+
+```markdown
+## F24. 멀티모달 vision (✅ promoted v0.3.1 Cut F)
+
+**상태:** ✅ promoted v0.3.1 Cut F — Ollama vision 모델 (gemma3 family default) 활용. capability detection (app launch + manual refresh) + Configure UI dropdown + AiWorker vision integration (5MB cap + base64 변환). 자동 fallback (caption → text) deferred v0.3.2+.
+```
+
+- [ ] package.json: 0.3.0 → 0.3.1 + package-lock.json
+- [ ] full unit + typecheck
+
+```bash
+git add docs/superpowers/specs/2026-04-25-dogfood-feedback.md package.json package-lock.json
+git commit -m "chore(release): v0.3.1 — Cut F (멀티모달 vision AI)"
+```
+
+---
+
+## Self-Review Checklist (implementer: check once after all tasks are done)
+
+- [ ] **Spec coverage**: §3 Capability Detection (Task 1) / §3-2 SettingsService (Task 2) / §3-3 main wiring (Task 8) / §3-4 UI (Task 7) / §4 Provider (Tasks 3-4) / §5 AiWorker (Task 5) / §6 image-only fallback (no 'skipped' enum; reuses the existing 'failed' branch)
+- [ ] **Single write path enforced (Cut C/D/E policy)**: this cut adds no new data path. There is no mutation of `notes_fts` / `note_revisions` / `note_tags` (vision results go through the existing `updateAiResult` path, which is already verified). The 4-path invariant of the regression checks still holds.
+- [ ] **Type consistency**: `GenerateInput.images` ↔ `GenerateOptions.visionModel` ↔ the AiWorker call ↔ the LocalOllamaProvider body all use the same shape
+- [ ] **Unit count**: VisionDetect 9 (5+4) + SettingsService 4 + visionPrompt 2 + LocalOllamaProvider 3 + AiWorker 3 + IPC 3-5 + UI 1 = about 25-27 new tests, which meets the +22 target
+
+---
+
+## Risk
+
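+- **Korean accuracy of the vision model**: if the gemma3 family turns out to be weak at Korean, update the recommendation to another family (memory policy). Needs dogfood verification
+- **Ollama ignores the vision `images` field (model misclassified)**: this is a capability-detection false positive. The user works around it by picking another model in the dropdown. No automatic fallback is implemented (YAGNI)
+- **base64 memory blow-up**: a 5MB cap applies per image. With multiple images, memory accumulates up to N×5MB of raw bytes (about 4/3 of that once base64-encoded); the image array becomes eligible for GC as soon as the vision call returns. Negligible at this cut's dogfood scale (< 3 images per note)
+- **Capability detection fails silently**: if the network call fails on first launch, the app proceeds with an empty cache. The user can trigger detection directly by clicking "다시 감지" (re-detect) on the settings page
+- **AiWorker constructor change**: every existing test needs its mock updated (typecheck catches this; a missed mock turns typecheck red)
+- **Automatically OFF when F23 is OFF (ai_enabled=false)**: `refreshVisionCache` checks ai_enabled and takes the ai_disabled branch. Entering the AiWorker vision path already assumes ai_enabled=true, so the vision path is never reached when F23 is OFF (trivially)
+- **e2e**: same as Cut C/D/E: not run in this cut, verified after merging to main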
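Reviewer note on the base64 risk above: the 5MB cap bounds the raw file size, while the payload sent to Ollama is the base64 string, which is 4 characters for every 3 input bytes. The sketch below only works through that arithmetic; `base64Length` and `worstCasePayloadChars` are hypothetical helper names and are not part of this change.

```typescript
// Hypothetical helpers (not in the diff): estimate the base64 footprint of capped images.
const IMAGE_CAP_BYTES = 5 * 1024 * 1024; // the same per-image 5MB cap the plan describes

// Length of padded base64 output for `rawBytes` input bytes:
// every group of up to 3 bytes becomes exactly 4 characters.
function base64Length(rawBytes: number): number {
  return 4 * Math.ceil(rawBytes / 3);
}

// Upper bound on encoded characters held at once when `imageCount`
// images all sit exactly at the cap.
function worstCasePayloadChars(imageCount: number): number {
  return imageCount * base64Length(IMAGE_CAP_BYTES);
}

console.log(base64Length(IMAGE_CAP_BYTES));
console.log(worstCasePayloadChars(3));
```

At the cap, one image encodes to roughly 6.7 MiB of characters, so the "N×5MB" figure in the risk list slightly understates the peak for the encoded array.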
--- a/docs/superpowers/specs/2026-04-25-dogfood-feedback.md
+++ b/docs/superpowers/specs/2026-04-25-dogfood-feedback.md
@@ -1787,9 +1787,9 @@ app.on('activate', () => {

 ---

-## F24. Multimodal AI analysis of images (🌱 raw, v0.2.8/v0.3 candidate, capability gated)
+## F24. Multimodal AI analysis of images (✅ promoted v0.3.1 Cut F)

-**Status:** 🌱 raw. Uses an Ollama vision model (llava / llama3.2-vision / gemma3-multimodal, etc.). In the user's words: "doing it only when it's possible should be enough", so capability detection + opt-in are explicit.
+**Status:** ✅ promoted in v0.3.1 Cut F. Uses an Ollama vision model (gemma3 family by default): capability detection (app launch + manual refresh), a Configure UI dropdown, and AiWorker vision integration (5MB cap + base64 conversion). Automatic fallback (caption → text) and the 'skipped' enum are deferred to v0.3.2+. Unit tests: 679 → 710. Dogfood: verify the accuracy of vision results and of Korean tokens.

 **Found:** 2026-05-09, during my own dogfooding after the v0.2.7 release. Strongly related to F22 (image rendering) and F23 (Ollama-less mode).

--- a/docs/superpowers/specs/2026-05-09-v031-cut-f-design.md
+++ b/docs/superpowers/specs/2026-05-09-v031-cut-f-design.md
@@ -12,7 +12,7 @@

 ## 1. Cut identity

-Uses an Ollama vision model (gemma3 family by default): a combined image + raw_text prompt, or image-only analysis, automatically generates title/summary/tags. The F22 prerequisite (Cut A) is already complete.
+Uses an Ollama vision model (gemma family; gemma3 / gemma4 are treated as capable by default): a combined image + raw_text prompt, or image-only analysis, automatically generates title/summary/tags. The F22 prerequisite (Cut A) is already complete.

 ---

@@ -20,7 +20,7 @@ Ollama vision 모델 (gemma3 family default) 활용 — 이미지 + raw_text 결

 | Item | Decision |
 |---|---|
-| **F24 default model** | gemma3 family (strong at both Korean and images; same family as the `gemma4:e4b` text model from my own notes) |
+| **F24 default model** | gemma family: both gemma3 and gemma4 count as vision-capable hints (strong at both Korean and images; same family as the `gemma4:e4b` text model from my own notes) |
 | **Prompt mode** | Single vision-model call (the vision model also handles the text). Two-step fallback (automatic) when the model lacks the capability |
 | **capability detection** | Once at app launch + a manual refresh button on the settings page |
 | **Automatically OFF when F23 is OFF** | `ai_enabled=false` → vision is automatically OFF too (trivially) |
@@ -56,36 +56,59 @@ function isVisionCapable(model: { name: string; details?: { family?: string; fam
 }
 ```

-### 3-2. Settings storage
+### 3-2. Settings storage (actual SettingsService API)
+
+Extend the zod schema (same strict pattern as the existing ai_enabled / sync_*):

 ```ts
-interface SettingsSchema {
-  // ... existing fields
-  vision_model?: string;       // model chosen explicitly by the user (empty = disabled)
-  vision_capable_cache?: string[];  // cache of the results detected at launch
-  vision_cache_at?: string;    // ISO timestamp
-}
+const SettingsSchema = z.object({
+  // ... existing ollama / ai_enabled / onboarding_completed / sync_*
+  vision_model: z.string().nullable().optional(),
+  vision_capable_cache: z.array(z.string()).optional(),
+  vision_cache_at: z.string().optional()
+}).strict();
+```
+
+New SettingsService methods (individual setters/getters; no generalized `get/set`):
+
+```ts
+async getVisionModel(): Promise<string | null>;
+async setVisionModel(value: string | null): Promise<void>;
+async getVisionCapableCache(): Promise<{ models: string[]; at: string | null }>;
+async setVisionCapableCache(models: string[], now: Date): Promise<void>;
 ```

 ### 3-3. AppLaunchDetect

-```ts
-// inside whenReady in src/main/index.ts (after settings are initialized)
-async function refreshVisionCache(): Promise<void> {
-  if (!settingsService.get('ai_enabled', true)) return;
-  try {
-    const tags = await fetch(`${endpoint}/api/tags`).then(r => r.json());
-    const capable = tags.models.filter(isVisionCapable).map((m: any) => m.name);
-    settingsService.set('vision_capable_cache', capable);
-    settingsService.set('vision_cache_at', new Date().toISOString());
-  } catch {
-    // network failure: stay silent and keep the cache
-  }
-}
+New file `src/main/services/VisionDetect.ts`: a pure function plus an injected external fetch (testable):

-void refreshVisionCache();
+```ts
+export async function refreshVisionCache(deps: {
+  settings: SettingsService;
+  endpoint: string;
+  now?: () => Date;
+  fetchImpl?: typeof fetch;
+}): Promise<{ ok: true; models: string[] } | { ok: false; reason: string }> {
+  if (!(await deps.settings.isAiEnabled())) {
+    return { ok: false, reason: 'ai_disabled' };
+  }
+  const fetchFn = deps.fetchImpl ?? fetch;
+  let body: { models?: Array<{ name: string; details?: { family?: string; families?: string[] } }> };
+  try {
+    const r = await fetchFn(`${deps.endpoint}/api/tags`);
+    if (!r.ok) return { ok: false, reason: `tags http ${r.status}` };
+    body = await r.json();
+  } catch (e) {
+    return { ok: false, reason: `unreachable: ${(e as Error).message}` };
+  }
+  const capable = (body.models ?? []).filter(isVisionCapable).map((m) => m.name);
+  await deps.settings.setVisionCapableCache(capable, deps.now ? deps.now() : new Date());
+  return { ok: true, models: capable };
+}
 ```

+Called fire-and-forget inside the main process's `whenReady`. Failures are silent (the cache is kept). The `settings:refresh-vision-cache` IPC calls the same function (the manual "다시 감지" (re-detect) button).
+
 ### 3-4. Settings page UI (extends the AI provider section)

 ```
@@ -166,44 +189,53 @@ ${text || '(이미지만 있음)'}

 ---

-## 5. AiWorker integration
+## 5. AiWorker integration (corrected to the actual API)

-If CaptureService attached an image at capture time, it is stored in notes.media and a pending_jobs row is INSERTed. When AiWorker processes the job:
+The existing `AiWorker.processJob` receives a `Note` hydrated via `repo.findById(noteId)`. `note.media` is already populated from the join, so no separate `listMediaByNote` call is needed. `MediaStore.absolutePath(relPath)` yields the on-disk path.

 ```ts
-// src/main/ai/AiWorker.ts
-async processJob(noteId: string): Promise<void> {
-  const note = this.repo.getById(noteId);
-  const media = this.repo.listMediaByNote(noteId);
-  const visionModel = this.settings.get('vision_model');
+// src/main/ai/AiWorker.ts processJob flow
+const note = this.repo.findById(job.noteId);
+if (!note || ...) return;
+const visionModel = await this.settings.getVisionModel();

-  let images: Array<{ base64: string; mime: string }> | undefined;
-  if (visionModel && media.length > 0) {
-    images = await Promise.all(media.map(async (m) => ({
-      base64: (await fs.readFile(this.mediaStore.absolutePath(m.relPath))).toString('base64'),
-      mime: m.mime
-    })));
-  }
-
-  const provider = this.providerHolder.get();
-  const response = await provider.generate({ text: note.rawText, images, ... }, { visionModel });
-  // ... apply the result as before
+let images: Array<{ base64: string; mime: string }> | undefined;
+if (visionModel && note.media.length > 0) {
+  images = await Promise.all(
+    note.media.map(async (m) => {
+      const buf = await readFile(this.mediaStore.absolutePath(m.relPath));
+      // 5MB cap per image (guards against base64 memory blow-up)
+      if (buf.byteLength > 5 * 1024 * 1024) {
+        throw new Error(`image ${m.relPath} exceeds 5MB cap`);
+      }
+      return { base64: buf.toString('base64'), mime: m.mime };
+    })
+  );
 }
+
+const res = await this.holder.get().generate({
+  text: note.rawText,
+  images,
+  todayKst,
+  dueDateCandidates: candidates,
+  vocab
+}, { visionModel });
 ```

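-The vision path is taken only when `media.length > 0 && visionModel` are both true. Otherwise the existing text-only path is used.
+The vision path is taken only when `visionModel && note.media.length > 0` are both true. Otherwise the existing text-only path is kept (preserving compatibility). Exceeding the 5MB image cap throws, which reuses AiWorker's existing attempts counter and the ai_status='failed' branch.
+
+Adds a `settings: SettingsService` dependency to AiWorker as a new parameter on the existing constructor.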

 ---

-## 6. Image-only capture
+## 6. Image-only capture (correction: no new enum introduced)

-Empty `raw_text` with media attachments only:
+The case of an empty `raw_text` with media attachments only:

- Existing behavior: notes INSERT (raw_text=''), AiWorker calls with an empty prompt → ai_status='failed' or a meaningless response
- vision enabled: AiWorker sends the vision prompt + images → a meaningful title/summary/tags response
- vision disabled (visionModel empty): the note is only stored, using a new ai_status='disabled' enum (similar in meaning to Cut B's ai_enabled false, but a partial disable, i.e. a "cannot process because it is image-only" state)
+- **vision enabled** (`visionModel` set + media present): AiWorker's vision path → a meaningful title/summary/tags response
+- **vision disabled** (`visionModel` is null): the existing text-only flow is unchanged. The prompt is empty, and if the AI response is meaningless the note falls into the ai_status='failed' branch (retryable). Whether to introduce a 'skipped' enum is re-evaluated after measuring the frequency during dogfooding.

-Recommendation: for vision disabled + image-only capture, use a new `ai_status='skipped'` enum (distinct from Cut B's 'disabled'). Title fallback = "(N images)" or the first image's file name.
+**No new 'skipped' enum (YAGNI)**: the m008 migration (CHECK relax via table recreate) is a burden, and image-only capture is not this cut's main use case. The workaround of retrying after enabling vision, or reprocessing after adding raw_text, is sufficient. The policy is reviewed in a separate cut after dogfooding.

 ---

@@ -219,7 +251,7 @@ async processJob(noteId: string): Promise<void> {
 | `AiWorker.processJob` vision integration | base64 conversion only when both media and visionModel are present |
 | Image-only capture | raw_text='' + media → a normal vision result or the 'skipped' branch |

-**Target**: unit tests 555 → about 575 (+20), typecheck 0.
+**Target**: unit tests 679 → about 701 (+22: isVisionCapable 5 + refreshVisionCache 4 + SettingsService vision 4 + LocalOllamaProvider vision path 3 + buildVisionPrompt 2 + AiWorker vision integration 3 + UI dropdown 1), typecheck 0.

 ---

@@ -231,7 +263,7 @@ async processJob(noteId: string): Promise<void> {
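 | base64 memory load of images | Average < 1MB per media item. With multiple images, N×base64 = N times the memory. Apply a cap (max size 5MB per image) |
 | Fallback when capability detection fails | No cache → show the vision dropdown as empty + point the user to "다시 감지" (re-detect) |
 | Korean accuracy of the vision model | Verify by dogfooding. If gemma3 is weak at Korean, update the recommendation to another family (update the memory policy) |
-| Ollama ignores the vision images field (the model does not support multimodal input) | Automatic two-step fallback: extract a caption with the vision model → synthesize with the text model (when the capability is lacking) |
+| Ollama ignores the vision images field (the model does not support multimodal input) | **Not implemented in this cut (YAGNI)**: the automatic two-step fallback (caption extraction → text-model synthesis) is considered for v0.3.2+. Dogfooding prioritizes capability-detection accuracy |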

 ---

--- a/package-lock.json
+++ b/package-lock.json
@@ -1,12 +1,12 @@
 {
  "name": "inkling",
-  "version": "0.3.0",
+  "version": "0.3.1",
  "lockfileVersion": 3,
  "requires": true,
  "packages": {
    "": {
      "name": "inkling",
-      "version": "0.3.0",
+      "version": "0.3.1",
      "dependencies": {
        "better-sqlite3": "12.9.0",
        "electron-log": "5.2.0",
@@ -3232,7 +3232,7 @@
      }
    },
    "node_modules/@tokenizer/token": {
       "version": "0.3.0",
      "resolved": "https://registry.npmjs.org/@tokenizer/token/-/token-0.3.0.tgz",
      "integrity": "sha512-OvjF+z51L3ov0OyAU0duzsYuvO01PH7x4t6DJx+guahgTnBHkhJdG7soQeTSFLWN3efnHyibZ4Z8l2EuWwJN3A==",
      "dev": true,
--- a/package.json
+++ b/package.json
@@ -1,6 +1,6 @@
 {
  "name": "inkling",
-  "version": "0.3.0",
+  "version": "0.3.1",
  "private": true,
  "description": "Inkling — local-first 한 줄 보관 도구",
  "author": "altair823 <dlsrks0734@gmail.com>",
--- a/src/main/ai/AiWorker.ts
+++ b/src/main/ai/AiWorker.ts
@@ -1,6 +1,9 @@
+import { readFile } from 'node:fs/promises';
 import type { NoteRepository } from '../repository/NoteRepository.js';
 import type { Note } from '@shared/types';
 import type { AiFailedReason } from '../services/telemetryEvents.js';
+import type { SettingsService } from '../services/SettingsService.js';
+import type { MediaStore } from '../services/MediaStore.js';
 import { ProviderHolder } from './ProviderHolder.js';
 import { parseAllCandidates } from '../services/dueDateParser.js';
 import { ZodError } from 'zod';
@@ -41,6 +44,10 @@ export interface AiWorkerOptions {
  };
  now?: () => Date;
  telemetry?: AiTelemetryEmitter;
+  /** v0.3.1 Cut F: vision support. Vision stays disabled when this is not passed. */
+  settings?: Pick<SettingsService, 'getVisionModel'>;
+  /** v0.3.1 Cut F: resolves attached images to absolute paths. Vision is enabled when passed together with settings. */
+  mediaStore?: Pick<MediaStore, 'absolutePath'>;
 }

 interface Job { noteId: string; attempts: number; }
@@ -56,6 +63,8 @@ export class AiWorker {
  private logger: NonNullable<AiWorkerOptions['logger']>;
  private now: () => Date;
  private telemetry?: AiTelemetryEmitter;
+  private settings?: Pick<SettingsService, 'getVisionModel'>;
+  private mediaStore?: Pick<MediaStore, 'absolutePath'>;

  constructor(
    private repo: NoteRepository,
@@ -68,6 +77,8 @@ export class AiWorker {
    this.logger = opts.logger ?? { info: () => {}, warn: () => {}, error: () => {} };
    this.now = opts.now ?? (() => new Date());
    this.telemetry = opts.telemetry;
+    this.settings = opts.settings;
+    this.mediaStore = opts.mediaStore;
  }

  async enqueue(noteId: string): Promise<void> {
@@ -128,12 +139,27 @@ export class AiWorker {
        const todayIso = kstTodayIso(nowDate);
        const candidates = parseAllCandidates(note.rawText, todayDate);
        const vocab = this.repo.getTopUsedTags(VOCAB_TOP_N);
-        const res = await this.holder.get().generate({
-          text: note.rawText,
-          todayKst: todayIso,
-          dueDateCandidates: candidates,
-          vocab
-        });
+        // v0.3.1 Cut F: vision path. visionModel + note.media → base64 images.
+        // Final review fix: fast-fail on note.media[].bytes (avoids the readFile/base64 cost).
+        // Exceeding the 5MB cap throws → AiWorker's 'other' branch → reaches markAiFailed.
+        const visionModel = this.settings ? await this.settings.getVisionModel() : null;
+        let images: Array<{ base64: string; mime: string }> | undefined;
+        if (visionModel && note.media.length > 0 && this.mediaStore) {
+          const oversize = note.media.find((m) => m.bytes > 5 * 1024 * 1024);
+          if (oversize) {
+            throw new Error(`image ${oversize.relPath} exceeds 5MB cap (${oversize.bytes} bytes)`);
+          }
+          images = await Promise.all(
+            note.media.map(async (m) => {
+              const buf = await readFile(this.mediaStore!.absolutePath(m.relPath));
+              return { base64: buf.toString('base64'), mime: m.mime };
+            })
+          );
+        }
+        const res = await this.holder.get().generate(
+          { text: note.rawText, images, todayKst: todayIso, dueDateCandidates: candidates, vocab },
+          { visionModel: visionModel ?? undefined }
+        );
        // AI primary: AI's dueDate is final (no rule merge)
        this.repo.updateAiResult(job.noteId, {
          title: res.title,
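Reviewer note on the AiWorker hunk above: the guard and the size check can be read as two small pure decisions. The sketch below restates them under assumed shapes; `MediaMeta`, `findOversize`, and `shouldUseVision` are hypothetical names for illustration, and the real media type lives in `@shared/types`.

```typescript
// Hypothetical shape for illustration; only the fields the hunk reads are modeled.
interface MediaMeta {
  relPath: string;
  mime: string;
  bytes: number;
}

const IMAGE_CAP_BYTES = 5 * 1024 * 1024;

// Returns the first attachment strictly over the cap, or undefined when all fit.
// Uses recorded metadata only, so nothing is read from disk before failing.
function findOversize(media: MediaMeta[], cap: number = IMAGE_CAP_BYTES): MediaMeta | undefined {
  return media.find((m) => m.bytes > cap);
}

// Mirrors the three-way guard in the hunk: a selected model, at least one
// attachment, and an injected media store are all required.
function shouldUseVision(
  visionModel: string | null,
  mediaCount: number,
  hasMediaStore: boolean
): boolean {
  return !!visionModel && mediaCount > 0 && hasMediaStore;
}
```

Because the comparison is `>`, an image of exactly 5MB passes; only larger files cause the throw that ends in the failed branch.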
--- a/src/main/ai/InferenceProvider.ts
+++ b/src/main/ai/InferenceProvider.ts
@@ -6,13 +6,20 @@ export interface GenerateInput {
  todayKst: string; // ISO YYYY-MM-DD in KST
  dueDateCandidates: ParseResult[];
  vocab?: string[]; // v0.2.3 #3: top-N kebab-case tags. Treated as an empty array when not passed.
+  // v0.3.1 Cut F: attached images. Text-only processing when not passed.
+  images?: Array<{ base64: string; mime: string }>;
+}
+
+export interface GenerateOptions {
+  /** v0.3.1 Cut F: model to use for vision only. Falls back to the default model when null or not passed. */
+  visionModel?: string | null;
 }

 export interface HealthResult { ok: boolean; model?: string; reason?: string; }

 export interface InferenceProvider {
  readonly name: string;
-  generate(input: GenerateInput): Promise<AiResponse>;
+  generate(input: GenerateInput, opts?: GenerateOptions): Promise<AiResponse>;
  healthCheck(): Promise<HealthResult>;
  /** v0.2.3.1: force-aborts an in-flight generate from outside. Used by ProviderHolder.replace. */
  abort?: () => void;
--- a/src/main/ai/LocalOllamaProvider.ts
+++ b/src/main/ai/LocalOllamaProvider.ts
@@ -1,7 +1,8 @@
 import { request } from 'undici';
 import { parseAiResponse, type AiResponse } from './schema.js';
 import { buildPrompt } from './prompt.js';
-import type { GenerateInput, HealthResult, InferenceProvider } from './InferenceProvider.js';
+import { buildVisionPrompt } from './visionPrompt.js';
+import type { GenerateInput, GenerateOptions, HealthResult, InferenceProvider } from './InferenceProvider.js';
 import { DEFAULT_OLLAMA_ENDPOINT, DEFAULT_OLLAMA_MODEL } from '../../shared/constants.js';

 export interface LocalOllamaOptions {
@@ -30,29 +31,39 @@ export class LocalOllamaProvider implements InferenceProvider {
    this.name = `local-ollama/${this.model}`;
  }

-  async generate(input: GenerateInput): Promise<AiResponse> {
+  async generate(input: GenerateInput, opts?: GenerateOptions): Promise<AiResponse> {
+    const useVision = !!opts?.visionModel && (input.images?.length ?? 0) > 0;
+    const model = useVision ? opts!.visionModel! : this.model;
+    const prompt = useVision
+      ? buildVisionPrompt(input.text, input.todayKst, input.dueDateCandidates.map((c) => c.iso ?? c.matchedToken ?? ''), input.vocab ?? [])
+      : buildPrompt(input.text, input.todayKst, input.dueDateCandidates, input.vocab ?? []);
+
    this.abortController = new AbortController();
    const timer = setTimeout(() => this.abortController?.abort(), this.timeoutMs);
    try {
+      const body: Record<string, unknown> = {
+        model,
+        prompt,
+        format: 'json',
+        stream: false,
+        options: { temperature: this.temperature, num_predict: this.numPredict }
+      };
+      if (useVision) {
+        body.images = input.images!.map((i) => i.base64);
+      }
      const res = await request(`${this.endpoint}/api/generate`, {
        method: 'POST',
        headers: { 'content-type': 'application/json' },
-        body: JSON.stringify({
-          model: this.model,
-          prompt: buildPrompt(input.text, input.todayKst, input.dueDateCandidates, input.vocab ?? []),
-          format: 'json',
-          stream: false,
-          options: { temperature: this.temperature, num_predict: this.numPredict }
-        }),
+        body: JSON.stringify(body),
        signal: this.abortController.signal
      });
      if (res.statusCode < 200 || res.statusCode >= 300) {
        throw new Error(`ollama http ${res.statusCode}`);
      }
-      const body = (await res.body.json()) as { response?: string };
-      if (!body.response) throw new Error('missing response field');
+      const responseBody = (await res.body.json()) as { response?: string };
+      if (!responseBody.response) throw new Error('missing response field');
      let parsed: unknown;
-      try { parsed = JSON.parse(body.response); }
+      try { parsed = JSON.parse(responseBody.response); }
      catch (err) { throw new Error(`invalid json in response: ${String(err)}`); }
      return parseAiResponse(parsed);
    } finally {
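Reviewer note on the LocalOllamaProvider hunk above: the model switch and the optional `images` field can be isolated into one pure function. This is a minimal sketch, not the provider's code: `buildGenerateBody` and `GenerateBody` are hypothetical names, and the `options` block and prompt-builder choice from the hunk are left out for brevity.

```typescript
interface ImageInput {
  base64: string;
  mime: string;
}

// Subset of the request body the hunk sends to /api/generate.
interface GenerateBody {
  model: string;
  prompt: string;
  format: 'json';
  stream: false;
  images?: string[];
}

// Hypothetical helper: picks the vision model only when a model is selected
// AND at least one image is attached; otherwise keeps the default model.
function buildGenerateBody(
  defaultModel: string,
  visionModel: string | null | undefined,
  prompt: string,
  images: ImageInput[] = []
): GenerateBody {
  const useVision = !!visionModel && images.length > 0;
  const body: GenerateBody = {
    model: useVision ? (visionModel as string) : defaultModel,
    prompt,
    format: 'json',
    stream: false
  };
  // Only the bare base64 strings are sent; the mime type is dropped, as in the hunk.
  if (useVision) body.images = images.map((i) => i.base64);
  return body;
}
```

A text-only note therefore produces the same body shape as before this change, which is what keeps the existing text path compatible.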
--- a/src/main/ai/visionPrompt.ts
+++ b/src/main/ai/visionPrompt.ts
@@ -0,0 +1,17 @@
+export function buildVisionPrompt(
+  text: string,
+  todayKst: string,
+  dueCandidates: string[],
+  vocab: string[]
+): string {
+  return `다음 메모와 첨부 이미지를 종합 분석해 한국어로 요약하세요.
+
+메모 본문 (비어 있을 수 있음):
+${text || '(이미지만 있음)'}
+
+이미지 분석 시 주요 시각적 정보 (텍스트, 사람, 장면) 도 포함해 요약하세요.
+출력 JSON: { "title": "...", "summary": "...", "tags": [...], "due_date": "..." }
+오늘: ${todayKst}
+가능한 due 후보: ${dueCandidates.join(', ')}
+빈출 태그: ${vocab.slice(0, 20).join(', ')}`;
+}
--- a/src/main/index.ts
+++ b/src/main/index.ts
@@ -17,6 +17,7 @@ import { HealthChecker } from './services/HealthChecker.js';
 import { LocalOllamaProvider } from './ai/LocalOllamaProvider.js';
 import { ProviderHolder } from './ai/ProviderHolder.js';
 import { AiWorker } from './ai/AiWorker.js';
+import { refreshVisionCache } from './services/VisionDetect.js';
 import { registerCaptureApi } from './ipc/captureApi.js';
 import { registerInboxApi, pushNoteUpdated, pushOllamaStatus } from './ipc/inboxApi.js';
 import { registerSettingsApi, navigateInbox } from './ipc/settingsApi.js';
@@ -122,6 +123,11 @@ app.whenReady().then(async () => {

  const provider = new LocalOllamaProvider({ endpoint: resolvedEndpoint, model: resolvedModel });
  const providerHolder = new ProviderHolder(provider);
+
+  // v0.3.1 Cut F: refresh the vision capability cache at app launch (fire-and-forget).
+  // Failures are silent (the cache is kept). The user can trigger "다시 감지" (re-detect) manually from the settings page.
+  void refreshVisionCache({ settings: settingsSvc, endpoint: resolvedEndpoint }).catch(() => {});
+
  const health = new HealthChecker(providerHolder, {
    // v0.2.9 Cut B Task 14: skip health polling while AI is disabled (no impact on environments without Ollama).
    isAiEnabled: () => settingsSvc.isAiEnabled(),
@@ -149,7 +155,10 @@ app.whenReady().then(async () => {
      refreshTray({ todayCount: repo.countToday() });
    },
    logger,
-    telemetry
+    telemetry,
+    // v0.3.1 Cut F: vision support
+    settings: settingsSvc,
+    mediaStore: store
  });

  const notify = new NotificationService({
--- a/src/main/ipc/settingsApi.ts
+++ b/src/main/ipc/settingsApi.ts
@@ -13,6 +13,8 @@ import type { SettingsService } from '../services/SettingsService.js';
 import type { SyncTimer } from '../services/SyncTimer.js';
 import { collectAutostartState } from '../services/AutostartDiagnostic.js';
 import { getInboxWindow as getInboxWindowSingleton } from '../windows/inboxWindow.js';
+import { refreshVisionCache } from '../services/VisionDetect.js';
+import { DEFAULT_OLLAMA_ENDPOINT } from '../../shared/constants.js';

 /**
 * 외부 (트레이 / second-instance / 기타 main 프로세스 호출자) 에서 inbox 창에 view 전환을
@@ -378,4 +380,29 @@ export function registerSettingsApi(deps?: SettingsIpcDeps): void {
    }
    return { lastAt: last.lastAt, lastResult: last.lastResult, nextAt };
  });
+
+  // v0.3.1 Cut F — vision IPC
+
+  ipcMain.handle('settings:get-vision-models', async () => {
+    const cache = await deps.settings.getVisionCapableCache();
+    const selected = await deps.settings.getVisionModel();
+    return { models: cache.models, at: cache.at, selected };
+  });
+
+  ipcMain.handle('settings:set-vision-model', async (_e, value: string | null) => {
+    const sanitized = typeof value === 'string' && value.trim().length > 0 ? value.trim() : null;
+    await deps.settings.setVisionModel(sanitized);
+    return { ok: true as const };
+  });
+
+  ipcMain.handle('settings:refresh-vision-cache', async () => {
+    // Cut F final review fix: use the same fallback chain as resolvedEndpoint in index.ts
+    // (settings → env → default), so that the manual "다시 감지" (re-detect) also works in dev
+    // environments where settings.ollama is unset and only env / default exist.
+    const all = await deps.settings.getAll();
+    const endpoint = all.ollama?.endpoint
+                  ?? process.env.INKLING_OLLAMA_ENDPOINT
+                  ?? DEFAULT_OLLAMA_ENDPOINT;
+    return refreshVisionCache({ settings: deps.settings, endpoint });
+  });
 }
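Reviewer note on the settingsApi hunk above: both the endpoint fallback chain and the model-name sanitizing are small enough to check in isolation. The sketch below assumes nothing about the surrounding IPC code; `resolveEndpoint` and `sanitizeVisionModel` are hypothetical names, and the default URL in the usage lines is a placeholder rather than the value of `DEFAULT_OLLAMA_ENDPOINT`.

```typescript
// Hypothetical helper: settings → env → default, using the same `??` chain as the hunk.
function resolveEndpoint(
  settingsEndpoint: string | null | undefined,
  env: Record<string, string | undefined>,
  fallback: string
): string {
  return settingsEndpoint ?? env.INKLING_OLLAMA_ENDPOINT ?? fallback;
}

// Hypothetical helper: trims the renderer-supplied value and maps
// anything empty or non-string to null ("no vision model selected").
function sanitizeVisionModel(value: unknown): string | null {
  return typeof value === 'string' && value.trim().length > 0 ? value.trim() : null;
}

// Placeholder default for illustration only.
const PLACEHOLDER_DEFAULT = 'http://localhost:11434';
console.log(resolveEndpoint(undefined, {}, PLACEHOLDER_DEFAULT));
console.log(sanitizeVisionModel('  gemma3:4b  '));
```

One property worth keeping in mind: `??` only falls through on `null` or `undefined`, so an empty-string endpoint stored in settings would win over the environment variable and the default.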
--- a/src/main/services/SettingsService.ts
+++ b/src/main/services/SettingsService.ts
@@ -17,7 +17,11 @@ const SettingsSchema = z.object({
  // v0.3.0 Cut E: bidirectional git sync settings. All optional; sync is disabled when unconfigured.
  sync_repo_url: z.string().nullable().optional(),
  sync_auto_enabled: z.boolean().optional(),
-  sync_interval_min: z.number().int().min(5).optional()
+  sync_interval_min: z.number().int().min(5).optional(),
+  // v0.3.1 Cut F
+  vision_model: z.string().nullable().optional(),
+  vision_capable_cache: z.array(z.string()).optional(),
+  vision_cache_at: z.string().optional()
 }).strict();

 export type Settings = z.infer<typeof SettingsSchema>;
@@ -127,6 +131,30 @@ export class SettingsService {
    await this.persist(next);
  }

+  /** v0.3.1 Cut F: the selected vision model. null = none selected. */
+  async getVisionModel(): Promise<string | null> {
+    const s = await this.load();
+    return s.vision_model ?? null;
+  }
+
+  async setVisionModel(value: string | null): Promise<void> {
+    const current = await this.load();
+    const next: Settings = { ...current, vision_model: value };
+    await this.persist(next);
+  }
+
+  /** v0.3.1 Cut F: cache of the /api/tags lookup result. Defaults to an empty array + null timestamp. */
+  async getVisionCapableCache(): Promise<{ models: string[]; at: string | null }> {
+    const s = await this.load();
+    return { models: s.vision_capable_cache ?? [], at: s.vision_cache_at ?? null };
+  }
+
+  async setVisionCapableCache(models: string[], now: Date): Promise<void> {
+    const current = await this.load();
+    const next: Settings = { ...current, vision_capable_cache: models, vision_cache_at: now.toISOString() };
+    await this.persist(next);
+  }
+
  private async persist(next: Settings): Promise<void> {
    await mkdir(dirname(this.filePath), { recursive: true });
    const tmpPath = this.filePath + '.tmp';
--- a/src/main/services/VisionDetect.ts
+++ b/src/main/services/VisionDetect.ts
@@ -0,0 +1,47 @@
+import type { SettingsService } from './SettingsService.js';
+
+// v0.3.1 Cut F final fix: corrected the default to the gemma series. My dogfood environment is gemma4:e4b
+// (text). The vision variant is gemma3 (vision-capable today) or gemma4 (once released).
+// Both families are included in the hints so capability detection stays future-proof.
+const VISION_FAMILIES = new Set(['gemma3', 'gemma4', 'llava', 'llama3.2-vision', 'minicpm-v', 'pixtral']);
+const VISION_NAME_HINTS = ['vision', 'vl', 'multimodal', 'gemma3', 'gemma4'];
+
+export interface OllamaModel {
+  name: string;
+  details?: { family?: string; families?: string[] };
+}
+
+export function isVisionCapable(model: OllamaModel): boolean {
+  if (model.details?.family && VISION_FAMILIES.has(model.details.family)) return true;
+  if (model.details?.families?.some((f) => VISION_FAMILIES.has(f))) return true;
+  const lower = model.name.toLowerCase();
+  return VISION_NAME_HINTS.some((h) => lower.includes(h));
+}
+
+export interface RefreshDeps {
+  settings: SettingsService;
+  endpoint: string;
+  now?: () => Date;
+  fetchImpl?: typeof fetch;
+}
+
+export async function refreshVisionCache(
+  deps: RefreshDeps
+): Promise<{ ok: true; models: string[] } | { ok: false; reason: string }> {
+  if (!(await deps.settings.isAiEnabled())) {
+    return { ok: false, reason: 'ai_disabled' };
+  }
+  const fetchFn = deps.fetchImpl ?? fetch;
+  let body: { models?: OllamaModel[] };
+  try {
+    const r = await fetchFn(`${deps.endpoint}/api/tags`);
+    if (!r.ok) return { ok: false, reason: `tags http ${r.status}` };
+    body = (await r.json()) as { models?: OllamaModel[] };
+  } catch (e) {
+    return { ok: false, reason: `unreachable: ${(e as Error).message}` };
+  }
+  const capable = (body.models ?? []).filter(isVisionCapable).map((m) => m.name);
+  const now = deps.now ? deps.now() : new Date();
+  await deps.settings.setVisionCapableCache(capable, now);
+  return { ok: true, models: capable };
+}
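Reviewer note on the VisionDetect.ts file above: because the name check is a substring match, it is worth seeing which inputs it accepts. The block restates `isVisionCapable` from the diff so it runs standalone; the sample model names are made up for illustration and are not captured `/api/tags` responses.

```typescript
// Restated from the VisionDetect.ts hunk so the example is self-contained.
const VISION_FAMILIES = new Set(['gemma3', 'gemma4', 'llava', 'llama3.2-vision', 'minicpm-v', 'pixtral']);
const VISION_NAME_HINTS = ['vision', 'vl', 'multimodal', 'gemma3', 'gemma4'];

interface OllamaModel {
  name: string;
  details?: { family?: string; families?: string[] };
}

function isVisionCapable(model: OllamaModel): boolean {
  if (model.details?.family && VISION_FAMILIES.has(model.details.family)) return true;
  if (model.details?.families?.some((f) => VISION_FAMILIES.has(f))) return true;
  const lower = model.name.toLowerCase();
  return VISION_NAME_HINTS.some((h) => lower.includes(h));
}

// Illustrative inputs only.
const samples: OllamaModel[] = [
  { name: 'plain-text-model', details: { family: 'llama' } }, // no family or name hint
  { name: 'custom-build', details: { families: ['llama', 'llava'] } }, // matched via families
  { name: 'gemma4:e4b' }, // matched via the 'gemma4' name hint
  { name: 'my-devlog-model' } // 'devlog' contains 'vl': a substring false positive
];
const verdicts = samples.map((m) => isVisionCapable(m));
console.log(verdicts);
```

The last two samples show the trade-off the source comment accepts: the hints are deliberately broad, so a text-only model can appear in the dropdown and the user is expected to pick a different one.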
--- a/src/preload/index.ts
+++ b/src/preload/index.ts
@@ -97,6 +97,10 @@ const api: InklingApi = {
    getSyncStatus: () => ipcRenderer.invoke('sync:get-status'),
    setSyncAutoEnabled: (value: boolean) => ipcRenderer.invoke('settings:set-sync-auto-enabled', value),
    setSyncIntervalMin: (value: number) => ipcRenderer.invoke('settings:set-sync-interval-min', value),
+    // v0.3.1 Cut F: vision capability + model selection
+    getVisionModels: () => ipcRenderer.invoke('settings:get-vision-models'),
+    setVisionModel: (value: string | null) => ipcRenderer.invoke('settings:set-vision-model', value),
+    refreshVisionCache: () => ipcRenderer.invoke('settings:refresh-vision-cache'),
  }
 };

--- a/src/renderer/inbox/components/settings/AiProviderSection.tsx
+++ b/src/renderer/inbox/components/settings/AiProviderSection.tsx
@@ -1,6 +1,7 @@
 import React, { useEffect, useState } from 'react';
 import { z } from 'zod';
 import { inboxApi } from '../../api.js';
+import { VisionSection } from './VisionSection.js';

 const endpointSchema = z.string().url();

@@ -192,6 +193,7 @@ export function AiProviderSection(): React.ReactElement {
      {recheckResult && (
        <div style={{ fontSize: 12, marginTop: 8 }}>{recheckResult}</div>
      )}
+      <VisionSection />
    </div>
  );
 }
--- a/src/renderer/inbox/components/settings/VisionSection.tsx
+++ b/src/renderer/inbox/components/settings/VisionSection.tsx
@@ -0,0 +1,81 @@
+import React, { useEffect, useState } from 'react';
+import { inboxApi } from '../../api.js';
+
+export function VisionSection(): React.ReactElement {
+  const [models, setModels] = useState<string[]>([]);
+  const [at, setAt] = useState<string | null>(null);
+  const [selected, setSelected] = useState<string | null>(null);
+  const [busy, setBusy] = useState<'select' | 'refresh' | null>(null);
+  const [feedback, setFeedback] = useState<string | null>(null);
+
+  async function load() {
+    const r = await inboxApi.getVisionModels();
+    setModels(r.models);
+    setAt(r.at);
+    setSelected(r.selected);
+  }
+
+  useEffect(() => {
+    void load();
+  }, []);
+
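+  async function onSelect(value: string) {
+    const next = value === '' ? null : value;
+    setBusy('select');
+    setFeedback(null);
+    // try/finally: clear busy even if the IPC call rejects, so the controls never stay disabled
+    try { await inboxApi.setVisionModel(next); setSelected(next); }
+    finally { setBusy(null); }
+  }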
+
+  async function onRefresh() {
+    setBusy('refresh');
+    setFeedback(null);
+    try {
+      const r = await inboxApi.refreshVisionCache();
+      if (r.ok) await load();
+      setFeedback(r.ok ? `감지 완료 (${r.models.length}개)` : `감지 실패: ${r.reason}`);
+    } finally {
+      // clear busy even if the IPC call or the reload rejects
+      setBusy(null);
+    }
+  }
+
+  return (
+    <section style={{ marginTop: 16 }}>
+      <h4 style={{ fontSize: 13, marginBottom: 6 }}>이미지 분석 모델 (선택사항)</h4>
+      <div style={{ display: 'flex', gap: 6, alignItems: 'center', marginBottom: 6 }}>
+        <select
+          aria-label="이미지 분석 모델"
+          value={selected ?? ''}
+          onChange={(e) => { void onSelect(e.target.value); }}
+          disabled={busy !== null}
+          style={{ flex: 1, fontSize: 12, padding: '4px 8px', border: '1px solid #ccc', borderRadius: 4 }}
+        >
+          <option value="">(비활성)</option>
+          {models.map((m) => <option key={m} value={m}>{m}</option>)}
+        </select>
+        <button
+          onClick={() => { void onRefresh(); }}
+          disabled={busy !== null}
+          style={{ background: '#0a4b80', color: '#fff', border: 'none', cursor: 'pointer', fontSize: 12, padding: '4px 10px', borderRadius: 4 }}
+        >
+          {busy === 'refresh' ? '감지 중…' : '다시 감지'}
+        </button>
+      </div>
+      {at !== null && (
+        <div style={{ fontSize: 11, color: '#888' }}>
+          마지막 감지: {new Date(at).toLocaleString('ko-KR')}
+        </div>
+      )}
+      {feedback !== null && (
+        <div style={{ fontSize: 11, color: '#444', marginTop: 4 }}>{feedback}</div>
+      )}
+      {models.length === 0 && (
+        <div style={{ fontSize: 11, color: '#aaa', marginTop: 4 }}>
+          감지된 모델 없음. Ollama 에 vision 모델을 설치하고 "다시 감지" 클릭.
+        </div>
+      )}
+    </section>
+  );
+}
--- a/src/shared/types.ts
+++ b/src/shared/types.ts
@@ -197,6 +197,10 @@ export interface InboxApi {
    sync_repo_url?: string | null;
    sync_auto_enabled?: boolean;
    sync_interval_min?: number;
+    // v0.3.1 Cut F
+    vision_model?: string | null;
+    vision_capable_cache?: string[];
+    vision_cache_at?: string;
  }>;
  setAiEnabled(enabled: boolean): Promise<{ ok: true }>;
  setOnboardingCompleted(completed: boolean): Promise<{ ok: true }>;
@@ -218,6 +222,10 @@ export interface InboxApi {
  getSyncStatus(): Promise<SyncStatusSnapshot>;
  setSyncAutoEnabled(enabled: boolean): Promise<{ ok: true }>;
  setSyncIntervalMin(value: number): Promise<{ ok: true } | { ok: false; reason: string }>;
+  // v0.3.1 Cut F: vision capability detection + model selection.
+  getVisionModels(): Promise<{ models: string[]; at: string | null; selected: string | null }>;
+  setVisionModel(value: string | null): Promise<{ ok: true }>;
+  refreshVisionCache(): Promise<{ ok: true; models: string[] } | { ok: false; reason: string }>;
 }

 export interface InklingApi {
--- a/tests/unit/AiProviderSection.test.tsx
+++ b/tests/unit/AiProviderSection.test.tsx
@@ -11,7 +11,11 @@ vi.mock('../../src/renderer/inbox/api.js', () => ({
    getSettings: vi.fn(async () => ({ ai_enabled: true })),
    setAiEnabled: vi.fn(async () => ({ ok: true })),
    getDisabledCount: vi.fn(async () => 0),
-    enqueueDisabled: vi.fn(async () => ({ count: 0 }))
+    enqueueDisabled: vi.fn(async () => ({ count: 0 })),
+    // v0.3.1 Cut F: VisionSection is mounted inside AiProviderSection and calls these.
+    getVisionModels: vi.fn(async () => ({ models: [], at: null, selected: null })),
+    setVisionModel: vi.fn(async () => ({ ok: true as const })),
+    refreshVisionCache: vi.fn(async () => ({ ok: true as const, models: [] }))
  }
 }));

--- a/tests/unit/AiWorker.test.ts
+++ b/tests/unit/AiWorker.test.ts
@@ -449,9 +449,10 @@ describe('AiWorker — vocab fetch + per-tag hit/miss (v0.2.3 #3 T7)', () => {
    });
    await w.enqueue(id);
    await w.drain();
-    expect(generateMock).toHaveBeenCalledWith(expect.objectContaining({
-      vocab: expect.arrayContaining(['design'])
-    }));
+    expect(generateMock).toHaveBeenCalledWith(
+      expect.objectContaining({ vocab: expect.arrayContaining(['design']) }),
+      expect.anything()
+    );
  });

  it('emits tag_vocab_hit for vocab tags + tag_vocab_miss for new tags', async () => {
--- a/tests/unit/AiWorker.vision.test.ts
+++ b/tests/unit/AiWorker.vision.test.ts
@@ -0,0 +1,125 @@
+import { describe, it, expect, beforeEach, afterEach, vi } from 'vitest';
+import { writeFile, mkdtemp, mkdir, rm } from 'node:fs/promises';
+import { tmpdir } from 'node:os';
+import { join } from 'node:path';
+import Database from 'better-sqlite3';
+import { runMigrations } from '@main/db/migrations/index.js';
+import { NoteRepository } from '@main/repository/NoteRepository.js';
+import { AiWorker } from '@main/ai/AiWorker.js';
+import { ProviderHolder } from '@main/ai/ProviderHolder.js';
+import { MediaStore } from '@main/services/MediaStore.js';
+import type { AiResponse } from '@main/ai/schema.js';
+import type { InferenceProvider } from '@main/ai/InferenceProvider.js';
+
+describe('AiWorker — vision path (v0.3.1 Cut F)', () => {
+  let db: Database.Database;
+  let repo: NoteRepository;
+  let workDir: string;
+  let mediaStore: MediaStore;
+
+  beforeEach(async () => {
+    db = new Database(':memory:');
+    db.pragma('foreign_keys = ON');
+    runMigrations(db);
+    repo = new NoteRepository(db);
+    workDir = await mkdtemp(join(tmpdir(), 'inkling-vision-'));
+    mediaStore = new MediaStore(workDir);
+  });
+
+  afterEach(async () => {
+    db.close();
+    await rm(workDir, { recursive: true, force: true });
+  });
+
+  function makeWorker(
+    generate: (input: Parameters<InferenceProvider['generate']>[0], opts?: Parameters<InferenceProvider['generate']>[1]) => Promise<AiResponse>,
+    getVisionModel: () => Promise<string | null>
+  ): AiWorker {
+    const provider: InferenceProvider = {
+      name: 'fake',
+      generate,
+      abort: () => {},
+      healthCheck: vi.fn(async () => ({ ok: true }))
+    };
+    const holder = new ProviderHolder(provider);
+    const settings = { getVisionModel };
+    const logger = { info: vi.fn(), warn: vi.fn(), error: vi.fn() };
+    return new AiWorker(repo, holder, {
+      backoffsMs: [0, 0, 0],
+      logger,
+      settings,
+      mediaStore,
+      now: () => new Date('2026-05-10T05:00:00Z')
+    });
+  }
+
+  it('visionModel + media present → provider.generate receives images + opts', async () => {
+    const { id } = repo.create({ rawText: '이미지 메모' });
+    await mkdir(join(workDir, 'media', id), { recursive: true });
+    await writeFile(join(workDir, 'media', id, '1.png'), Buffer.from([0x89, 0x50, 0x4e, 0x47]));
+    repo.insertMedia([{ noteId: id, kind: 'image', relPath: `media/${id}/1.png`, mime: 'image/png', bytes: 4 }]);
+
+    const calls: Array<Parameters<InferenceProvider['generate']>> = [];
+    const generate = vi.fn(async (
+      input: Parameters<InferenceProvider['generate']>[0],
+      opts?: Parameters<InferenceProvider['generate']>[1]
+    ): Promise<AiResponse> => {
+      calls.push([input, opts]);
+      return { title: 't', summary: 'a\nb\nc', tags: [], dueDate: null };
+    });
+    const getVisionModel = vi.fn(async (): Promise<string | null> => 'gemma3:12b-vision');
+    const worker = makeWorker(generate, getVisionModel);
+    await worker.enqueue(id);
+    await worker.drain();
+
+    expect(calls.length).toBeGreaterThan(0);
+    const [callInput, callOpts] = calls[0]!;
+    expect(callInput.images).toHaveLength(1);
+    expect(callInput.images![0]!.mime).toBe('image/png');
+    expect(callOpts?.visionModel).toBe('gemma3:12b-vision');
+  });
+
+  it('visionModel null → text-only (images undefined)', async () => {
+    const { id } = repo.create({ rawText: 'just text' });
+    const calls: Array<Parameters<InferenceProvider['generate']>> = [];
+    const generate = vi.fn(async (
+      input: Parameters<InferenceProvider['generate']>[0],
+      opts?: Parameters<InferenceProvider['generate']>[1]
+    ): Promise<AiResponse> => {
+      calls.push([input, opts]);
+      return { title: 't', summary: 'a\nb\nc', tags: [], dueDate: null };
+    });
+    const getVisionModel = vi.fn(async (): Promise<string | null> => null);
+    const worker = makeWorker(generate, getVisionModel);
+    await worker.enqueue(id);
+    await worker.drain();
+
+    expect(calls.length).toBeGreaterThan(0);
+    expect(calls[0]![0].images).toBeUndefined();
+  });
+
+  it('image over 5MB → throw → AiWorker fail branch (generate not called)', async () => {
+    const { id } = repo.create({ rawText: 'big image' });
+    await mkdir(join(workDir, 'media', id), { recursive: true });
+    await writeFile(join(workDir, 'media', id, '1.png'), Buffer.alloc(6 * 1024 * 1024));
+    repo.insertMedia([{ noteId: id, kind: 'image', relPath: `media/${id}/1.png`, mime: 'image/png', bytes: 6 * 1024 * 1024 }]);
+
+    const calls: Array<Parameters<InferenceProvider['generate']>> = [];
+    const generate = vi.fn(async (
+      input: Parameters<InferenceProvider['generate']>[0],
+      opts?: Parameters<InferenceProvider['generate']>[1]
+    ): Promise<AiResponse> => {
+      calls.push([input, opts]);
+      return { title: 't', summary: 'a\nb\nc', tags: [], dueDate: null };
+    });
+    const getVisionModel = vi.fn(async (): Promise<string | null> => 'gemma3:12b-vision');
+    const worker = makeWorker(generate, getVisionModel);
+    await worker.enqueue(id);
+    await worker.drain();
+
+    expect(calls.length).toBe(0);
+    // Handled by AiWorker's catch branch; the note still exists in the DB.
+    const note = repo.findById(id);
+    expect(note).toBeTruthy();
+  });
+});
--- a/tests/unit/App.test.tsx
+++ b/tests/unit/App.test.tsx
@@ -62,7 +62,11 @@ vi.mock('../../src/renderer/inbox/api.js', () => ({
    setSyncAutoEnabled: vi.fn(async () => ({ ok: true as const })),
    setSyncIntervalMin: vi.fn(async () => ({ ok: true as const })),
    configureSync: vi.fn(async () => ({ ok: true as const })),
-    testSyncConnection: vi.fn(async () => ({ ok: true as const }))
+    testSyncConnection: vi.fn(async () => ({ ok: true as const })),
+    // v0.3.1 Cut F: VisionSection is mounted inside AiProviderSection and calls these.
+    getVisionModels: vi.fn(async () => ({ models: [], at: null, selected: null })),
+    setVisionModel: vi.fn(async () => ({ ok: true as const })),
+    refreshVisionCache: vi.fn(async () => ({ ok: true as const, models: [] }))
  }
 }));

--- a/tests/unit/ImportService.test.ts
+++ b/tests/unit/ImportService.test.ts
@@ -34,6 +34,12 @@ function buildExportNote(overrides: Partial<ExportNote> = {}): ExportNote {
    aiGeneratedAt: '2026-04-25T14:23:34.000Z',
    userIntent: null,
    intentPromptedAt: null,
+    // v0.3.0 Cut E: 5 frontmatter round-trip fields (Cut B status + Cut C dueDate).
+    status: 'active',
+    statusChangedAt: null,
+    moveReason: null,
+    dueDate: null,
+    dueDateEditedByUser: false,
    tags: [{ name: 'pr', source: 'ai' }],
    media: [],
    ...overrides
--- a/tests/unit/LocalOllamaProvider.test.ts
+++ b/tests/unit/LocalOllamaProvider.test.ts
@@ -109,4 +109,58 @@ describe('LocalOllamaProvider', () => {
    const provider = new LocalOllamaProvider({ model: 'gemma4:26b' });
    expect(provider.name).toBe('local-ollama/gemma4:26b');
  });
+
+  describe('vision path (v0.3.1 Cut F)', () => {
+    it('visionModel + images → body.images + model=visionModel + buildVisionPrompt', async () => {
+      let capturedBody: string = '';
+      mock.get('http://x').intercept({ path: '/api/generate', method: 'POST' }).reply((opts) => {
+        capturedBody = opts.body as string;
+        return { statusCode: 200, data: JSON.stringify({
+          response: JSON.stringify({ title: '비전테스트', summary: 'a\nb\nc', tags: [], due_date: null })
+        }) };
+      });
+      const provider = new LocalOllamaProvider({ endpoint: 'http://x', model: 'gemma4:e4b' });
+      await provider.generate(
+        { text: 'hi', todayKst: '2026-05-10', dueDateCandidates: [], images: [{ base64: 'AAAA', mime: 'image/png' }] },
+        { visionModel: 'gemma3:12b-vision' }
+      );
+      const parsed = JSON.parse(capturedBody) as { model: string; prompt: string; images?: string[] };
+      expect(parsed.model).toBe('gemma3:12b-vision');
+      expect(parsed.prompt).toContain('이미지');
+      expect(parsed.images).toEqual(['AAAA']);
+    });
+
+    it('visionModel set but no images → text-only (model = this.model, no body.images)', async () => {
+      let capturedBody: string = '';
+      mock.get('http://x').intercept({ path: '/api/generate', method: 'POST' }).reply((opts) => {
+        capturedBody = opts.body as string;
+        return { statusCode: 200, data: JSON.stringify({
+          response: JSON.stringify({ title: '텍스트전용', summary: 'a\nb\nc', tags: [], due_date: null })
+        }) };
+      });
+      const provider = new LocalOllamaProvider({ endpoint: 'http://x', model: 'gemma4:e4b' });
+      await provider.generate(
+        { text: 'hi', todayKst: '2026-05-10', dueDateCandidates: [] },
+        { visionModel: 'gemma3:12b-vision' }
+      );
+      const parsed = JSON.parse(capturedBody) as { model: string; images?: string[] };
+      expect(parsed.model).toBe('gemma4:e4b');
+      expect(parsed.images).toBeUndefined();
+    });
+
+    it('opts omitted → existing text-only path (regression)', async () => {
+      let capturedBody: string = '';
+      mock.get('http://x').intercept({ path: '/api/generate', method: 'POST' }).reply((opts) => {
+        capturedBody = opts.body as string;
+        return { statusCode: 200, data: JSON.stringify({
+          response: JSON.stringify({ title: '기본텍스트', summary: 'a\nb\nc', tags: [], due_date: null })
+        }) };
+      });
+      const provider = new LocalOllamaProvider({ endpoint: 'http://x', model: 'gemma4:e4b' });
+      await provider.generate({ text: 'hi', todayKst: '2026-05-10', dueDateCandidates: [] });
+      const parsed = JSON.parse(capturedBody) as { model: string; images?: string[] };
+      expect(parsed.model).toBe('gemma4:e4b');
+      expect(parsed.images).toBeUndefined();
+    });
+  });
 });
--- a/tests/unit/SettingsPage.test.tsx
+++ b/tests/unit/SettingsPage.test.tsx
@@ -52,7 +52,11 @@ vi.mock('../../src/renderer/inbox/api.js', () => ({
    setSyncAutoEnabled: vi.fn(async () => ({ ok: true as const })),
    setSyncIntervalMin: vi.fn(async () => ({ ok: true as const })),
    configureSync: vi.fn(async () => ({ ok: true as const })),
-    testSyncConnection: vi.fn(async () => ({ ok: true as const }))
+    testSyncConnection: vi.fn(async () => ({ ok: true as const })),
+    // v0.3.1 Cut F: VisionSection is mounted inside AiProviderSection and calls these.
+    getVisionModels: vi.fn(async () => ({ models: [], at: null, selected: null })),
+    setVisionModel: vi.fn(async () => ({ ok: true as const })),
+    refreshVisionCache: vi.fn(async () => ({ ok: true as const, models: [] }))
  }
 }));

--- a/tests/unit/SettingsService.test.ts
+++ b/tests/unit/SettingsService.test.ts
@@ -90,4 +90,31 @@ describe('SettingsService', () => {
      await expect(svc.setSyncIntervalMin(10.5)).rejects.toThrow();
    });
  });
+
+  describe('v0.3.1 Cut F — vision settings', () => {
+    it('getVisionModel() defaults to null', async () => {
+      expect(await svc.getVisionModel()).toBeNull();
+    });
+
+    it('setVisionModel() / getVisionModel() round-trip including null clear', async () => {
+      await svc.setVisionModel('llava:13b');
+      expect(await svc.getVisionModel()).toBe('llava:13b');
+      await svc.setVisionModel(null);
+      expect(await svc.getVisionModel()).toBeNull();
+    });
+
+    it('getVisionCapableCache() defaults to empty models + null at', async () => {
+      const cache = await svc.getVisionCapableCache();
+      expect(cache.models).toEqual([]);
+      expect(cache.at).toBeNull();
+    });
+
+    it('setVisionCapableCache() persists models + ISO timestamp', async () => {
+      const now = new Date('2026-05-09T12:00:00.000Z');
+      await svc.setVisionCapableCache(['llava:13b', 'llava:7b'], now);
+      const cache = await svc.getVisionCapableCache();
+      expect(cache.models).toEqual(['llava:13b', 'llava:7b']);
+      expect(cache.at).toBe('2026-05-09T12:00:00.000Z');
+    });
+  });
 });
--- a/tests/unit/VisionDetect.test.ts
+++ b/tests/unit/VisionDetect.test.ts
@@ -0,0 +1,121 @@
+import { describe, it, expect, vi } from 'vitest';
+import { isVisionCapable, refreshVisionCache } from '@main/services/VisionDetect.js';
+import type { OllamaModel } from '@main/services/VisionDetect.js';
+
+// ---------------------------------------------------------------------------
+// isVisionCapable
+// ---------------------------------------------------------------------------
+describe('isVisionCapable', () => {
+  it('returns true when details.family is in VISION_FAMILIES', () => {
+    const model: OllamaModel = { name: 'some-model', details: { family: 'llava' } };
+    expect(isVisionCapable(model)).toBe(true);
+  });
+
+  it('returns true when details.families contains a vision family', () => {
+    const model: OllamaModel = { name: 'some-model', details: { families: ['text', 'minicpm-v'] } };
+    expect(isVisionCapable(model)).toBe(true);
+  });
+
+  it('returns true when name contains a vision hint (case-insensitive)', () => {
+    const model: OllamaModel = { name: 'My-Vision-Model:latest' };
+    expect(isVisionCapable(model)).toBe(true);
+  });
+
+  it('returns true when name contains "vl" hint', () => {
+    const model: OllamaModel = { name: 'qwen2-vl:7b' };
+    expect(isVisionCapable(model)).toBe(true);
+  });
+
+  it('returns false for a plain text model with no vision signals', () => {
+    const model: OllamaModel = { name: 'gemma2:9b', details: { family: 'gemma', families: ['gemma'] } };
+    expect(isVisionCapable(model)).toBe(false);
+  });
+
+  // v0.3.1 Cut F final fix: corrected the gemma family default. gemma4 also counts as a vision-capable hint.
+  it('returns true for gemma4 family (future-proof)', () => {
+    const model: OllamaModel = { name: 'gemma4-vision:e4b', details: { family: 'gemma4' } };
+    expect(isVisionCapable(model)).toBe(true);
+  });
+
+  it('returns true for gemma4 in name hints (no family)', () => {
+    const model: OllamaModel = { name: 'custom-gemma4:latest' };
+    expect(isVisionCapable(model)).toBe(true);
+  });
+});
+
+// ---------------------------------------------------------------------------
+// refreshVisionCache
+// ---------------------------------------------------------------------------
+describe('refreshVisionCache', () => {
+  // Stub exposing only the settings members that refreshVisionCache touches.
+  function makeSettings(overrides: Partial<{
+    isAiEnabled: boolean;
+  }> = {}) {
+    // setVisionCapableCache is a spy so tests can assert on the persisted cache.
+    const settings = {
+      isAiEnabled: vi.fn().mockResolvedValue(overrides.isAiEnabled ?? true),
+      setVisionCapableCache: vi.fn().mockImplementation(async () => undefined),
+    };
+    return settings;
+  }
+
+  it('returns ok:false with reason "ai_disabled" when AI is off', async () => {
+    const settings = makeSettings({ isAiEnabled: false });
+    const result = await refreshVisionCache({
+      settings: settings as never,
+      endpoint: 'http://localhost:11434',
+    });
+    expect(result).toEqual({ ok: false, reason: 'ai_disabled' });
+    expect(settings.setVisionCapableCache).not.toHaveBeenCalled();
+  });
+
+  it('returns ok:false with http reason on non-ok response', async () => {
+    const settings = makeSettings();
+    const fetchImpl = vi.fn().mockResolvedValue({ ok: false, status: 503 });
+    const result = await refreshVisionCache({
+      settings: settings as never,
+      endpoint: 'http://localhost:11434',
+      fetchImpl: fetchImpl as never,
+    });
+    expect(result).toEqual({ ok: false, reason: 'tags http 503' });
+  });
+
+  it('returns ok:false with unreachable reason on fetch throw', async () => {
+    const settings = makeSettings();
+    const fetchImpl = vi.fn().mockRejectedValue(new Error('ECONNREFUSED'));
+    const result = await refreshVisionCache({
+      settings: settings as never,
+      endpoint: 'http://localhost:11434',
+      fetchImpl: fetchImpl as never,
+    });
+    expect(result.ok).toBe(false);
+    if (!result.ok) expect(result.reason).toMatch(/unreachable/);
+  });
+
+  it('filters vision-capable models, persists cache, returns ok:true + models', async () => {
+    const settings = makeSettings();
+    const fixedNow = new Date('2026-05-09T00:00:00.000Z');
+    const responseBody = {
+      models: [
+        { name: 'llava:13b', details: { family: 'llava' } },
+        { name: 'gemma2:9b', details: { family: 'gemma' } },
+        { name: 'qwen2-vl:7b' },
+      ],
+    };
+    const fetchImpl = vi.fn().mockResolvedValue({
+      ok: true,
+      json: () => Promise.resolve(responseBody),
+    });
+    const result = await refreshVisionCache({
+      settings: settings as never,
+      endpoint: 'http://localhost:11434',
+      fetchImpl: fetchImpl as never,
+      now: () => fixedNow,
+    });
+    expect(result).toEqual({ ok: true, models: ['llava:13b', 'qwen2-vl:7b'] });
+    expect(settings.setVisionCapableCache).toHaveBeenCalledWith(
+      ['llava:13b', 'qwen2-vl:7b'],
+      fixedNow
+    );
+  });
+});
--- a/tests/unit/VisionSection.test.tsx
+++ b/tests/unit/VisionSection.test.tsx
@@ -0,0 +1,75 @@
+// @vitest-environment jsdom
+import { describe, it, expect, vi, beforeEach } from 'vitest';
+import '@testing-library/jest-dom/vitest';
+import { render, screen, fireEvent, cleanup, waitFor } from '@testing-library/react';
+import React from 'react';
+
+const { mockGet, mockSet, mockRefresh } = vi.hoisted(() => ({
+  mockGet: vi.fn(),
+  mockSet: vi.fn(),
+  mockRefresh: vi.fn()
+}));
+
+vi.mock('../../src/renderer/inbox/api.js', () => ({
+  inboxApi: {
+    getVisionModels: mockGet,
+    setVisionModel: mockSet,
+    refreshVisionCache: mockRefresh
+  }
+}));
+
+import { VisionSection } from '../../src/renderer/inbox/components/settings/VisionSection';
+
+describe('VisionSection', () => {
+  beforeEach(() => {
+    vi.clearAllMocks();
+    cleanup();
+    mockGet.mockResolvedValue({
+      models: ['gemma3:12b-vision', 'llava:13b'],
+      at: '2026-05-10T05:00:00Z',
+      selected: 'gemma3:12b-vision'
+    });
+    mockSet.mockResolvedValue({ ok: true });
+    mockRefresh.mockResolvedValue({ ok: true, models: ['gemma3:12b-vision', 'llava:13b'] });
+  });
+
+  it('on open: loads cache + shows dropdown options + selected model as default', async () => {
+    render(<VisionSection />);
+    await waitFor(() => {
+      expect(screen.getByLabelText('이미지 분석 모델')).toHaveValue('gemma3:12b-vision');
+    });
+    expect(screen.getByText('gemma3:12b-vision')).toBeInTheDocument();
+    expect(screen.getByText('llava:13b')).toBeInTheDocument();
+    expect(screen.getByText(/마지막 감지/)).toBeInTheDocument();
+  });
+
+  it('dropdown change → calls setVisionModel', async () => {
+    render(<VisionSection />);
+    await waitFor(() => screen.getByLabelText('이미지 분석 모델'));
+    fireEvent.change(screen.getByLabelText('이미지 분석 모델'), { target: { value: 'llava:13b' } });
+    await waitFor(() => {
+      expect(mockSet).toHaveBeenCalledWith('llava:13b');
+    });
+  });
+
+  it('selecting the disabled option → setVisionModel(null)', async () => {
+    render(<VisionSection />);
+    await waitFor(() => screen.getByLabelText('이미지 분석 모델'));
+    fireEvent.change(screen.getByLabelText('이미지 분석 모델'), { target: { value: '' } });
+    await waitFor(() => {
+      expect(mockSet).toHaveBeenCalledWith(null);
+    });
+  });
+
+  it('clicking "다시 감지" (re-detect) → calls refreshVisionCache + shows result', async () => {
+    render(<VisionSection />);
+    await waitFor(() => screen.getByRole('button', { name: /다시 감지/ }));
+    fireEvent.click(screen.getByRole('button', { name: /다시 감지/ }));
+    await waitFor(() => {
+      expect(mockRefresh).toHaveBeenCalled();
+    });
+    await waitFor(() => {
+      expect(screen.getByText(/감지 완료/)).toBeInTheDocument();
+    });
+  });
+});
--- a/tests/unit/vision-ipc.test.ts
+++ b/tests/unit/vision-ipc.test.ts
@@ -0,0 +1,125 @@
+import { describe, it, expect, beforeEach, vi } from 'vitest';
+
+vi.mock('electron', () => ({ default: { ipcMain: { handle: vi.fn() }, dialog: {}, shell: {} } }));
+vi.mock('../../src/main/services/VisionDetect.js', () => ({
+  refreshVisionCache: vi.fn(async () => ({ ok: true as const, models: ['gemma3:12b-vision'] }))
+}));
+vi.mock('../../src/main/services/GitClient.js');
+
+import electron from 'electron';
+import { refreshVisionCache } from '../../src/main/services/VisionDetect.js';
+import { registerSettingsApi } from '../../src/main/ipc/settingsApi.js';
+import type { SettingsIpcDeps } from '../../src/main/ipc/settingsApi.js';
+
+function getHandler(channel: string): (...args: unknown[]) => unknown {
+  const handle = (electron.ipcMain as unknown as { handle: ReturnType<typeof vi.fn> }).handle;
+  const call = handle.mock.calls.find((c) => c[0] === channel);
+  if (!call) throw new Error(`channel ${channel} not registered`);
+  return call[1] as (...args: unknown[]) => unknown;
+}
+
+function makeDeps() {
+  const settings = {
+    getVisionModel: vi.fn(async () => 'gemma3:12b-vision'),
+    setVisionModel: vi.fn(async () => {}),
+    getVisionCapableCache: vi.fn(async () => ({
+      models: ['gemma3:12b-vision', 'llava:13b'],
+      at: '2026-05-10T05:00:00Z'
+    })),
+    setVisionCapableCache: vi.fn(async () => {}),
+    // existing methods used by other handlers
+    getAll: vi.fn(async () => ({
+      ollama: { endpoint: 'http://localhost:11434', model: 'gemma2:2b' }
+    })),
+    setAiEnabled: vi.fn(async () => {}),
+    setOnboardingCompleted: vi.fn(async () => {}),
+    isAiEnabled: vi.fn(async () => true),
+    getSyncRepoUrl: vi.fn(async () => null),
+    setSyncRepoUrl: vi.fn(async () => {}),
+    isAutoSyncEnabled: vi.fn(async () => false),
+    getSyncIntervalMin: vi.fn(async () => 30),
+    setSyncIntervalMin: vi.fn(async () => {}),
+    setAutoSyncEnabled: vi.fn(async () => {})
+  };
+
+  const syncSvc = {
+    getSyncDir: vi.fn(() => '/tmp/sync'),
+    listConflicts: vi.fn(() => []),
+    resolveConflict: vi.fn(async () => ({ ok: true as const })),
+    getLastStatus: vi.fn(() => ({ lastAt: null as string | null, lastResult: null as { ok: boolean } | null }))
+  };
+
+  const deps: Partial<SettingsIpcDeps> = {
+    backup: { runDaily: vi.fn(async () => ({ snapshotted: false })) } as never,
+    exportSvc: {} as never,
+    importSvc: {} as never,
+    syncSvc: syncSvc as never,
+    telemetry: { exportTo: vi.fn(async () => ({ eventCount: 0 })) } as never,
+    settings: settings as never,
+    getInboxWindow: () => null
+  };
+
+  return { settings, syncSvc, deps };
+}
+
+describe('vision IPC channels', () => {
+  beforeEach(() => {
+    (electron.ipcMain as unknown as { handle: ReturnType<typeof vi.fn> }).handle.mockClear();
+    vi.clearAllMocks();
+  });
+
+  it('3 vision channels registered', () => {
+    const { deps } = makeDeps();
+    registerSettingsApi(deps as SettingsIpcDeps);
+
+    const handle = (electron.ipcMain as unknown as { handle: ReturnType<typeof vi.fn> }).handle;
+    const channels = handle.mock.calls.map((c) => c[0]);
+    expect(channels).toContain('settings:get-vision-models');
+    expect(channels).toContain('settings:set-vision-model');
+    expect(channels).toContain('settings:refresh-vision-cache');
+  });
+
+  it('settings:get-vision-models returns { models, at, selected } from settings', async () => {
+    const { deps, settings } = makeDeps();
+    registerSettingsApi(deps as SettingsIpcDeps);
+    const h = getHandler('settings:get-vision-models');
+    const r = await h({});
+    expect(settings.getVisionCapableCache).toHaveBeenCalled();
+    expect(settings.getVisionModel).toHaveBeenCalled();
+    expect(r).toEqual({
+      models: ['gemma3:12b-vision', 'llava:13b'],
+      at: '2026-05-10T05:00:00Z',
+      selected: 'gemma3:12b-vision'
+    });
+  });
+
+  it('settings:set-vision-model calls settings.setVisionModel(value) + returns { ok: true }', async () => {
+    const { deps, settings } = makeDeps();
+    registerSettingsApi(deps as SettingsIpcDeps);
+    const h = getHandler('settings:set-vision-model');
+    const r = await h({}, 'llava:13b');
+    expect(settings.setVisionModel).toHaveBeenCalledWith('llava:13b');
+    expect(r).toEqual({ ok: true });
+  });
+
+  it('settings:refresh-vision-cache calls refreshVisionCache and returns result', async () => {
+    const { deps } = makeDeps();
+    registerSettingsApi(deps as SettingsIpcDeps);
+    const h = getHandler('settings:refresh-vision-cache');
+    const r = await h({});
+    expect(refreshVisionCache).toHaveBeenCalledWith({
+      settings: deps.settings,
+      endpoint: 'http://localhost:11434'
+    });
+    expect(r).toEqual({ ok: true, models: ['gemma3:12b-vision'] });
+  });
+
+  it('settings:set-vision-model with null clears the value', async () => {
+    const { deps, settings } = makeDeps();
+    registerSettingsApi(deps as SettingsIpcDeps);
+    const h = getHandler('settings:set-vision-model');
+    const r = await h({}, null);
+    expect(settings.setVisionModel).toHaveBeenCalledWith(null);
+    expect(r).toEqual({ ok: true });
+  });
+});
--- a/tests/unit/visionPrompt.test.ts
+++ b/tests/unit/visionPrompt.test.ts
@@ -0,0 +1,23 @@
+import { describe, it, expect } from 'vitest';
+import { buildVisionPrompt } from '@main/ai/visionPrompt.js';
+
+describe('buildVisionPrompt', () => {
+  it('includes text, todayKst, dueCandidates, and vocab slice when present', () => {
+    const result = buildVisionPrompt(
+      '회의 메모',
+      '2026-05-09',
+      ['2026-05-10', '2026-05-15'],
+      ['work', 'meeting', 'project', 'todo']
+    );
+    expect(result).toContain('회의 메모');
+    expect(result).toContain('2026-05-09');
+    expect(result).toContain('2026-05-10, 2026-05-15');
+    expect(result).toContain('work, meeting, project, todo');
+  });
+
+  it('uses (이미지만 있음) placeholder when text is empty', () => {
+    const result = buildVisionPrompt('', '2026-05-09', [], []);
+    expect(result).toContain('(이미지만 있음)');
+    expect(result).not.toContain('\n\n\n'); // no double-blank from empty text
+  });
+});