Merge pull request 'feat(fb-36): search filter args (--media / --ingested-after / --doc-id + 4 existing)' (#127) from feat/fb-36-search-filters into main

Reviewed-on: #127
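
For readers skimming the diff, here is a minimal sketch of how the filter flags are described as combining: values inside `--tag` or `--media` are alternatives (OR), separate flags must all hold (AND), `md` is normalized to `markdown`, and an unknown media value matches nothing. It is an in-memory illustration only. `Doc`, `Filters`, `normalize_media`, and `matches` are hypothetical names rather than kebab's API, and the store-side query that does the real filtering is not part of the files shown here.

```rust
// Hypothetical in-memory model of the filter semantics described in the README
// row and exercised by the CLI tests. None of these names are kebab's real API.

#[derive(Default)]
struct Filters {
    tags_any: Vec<String>,  // --tag (repeatable): OR within the list
    media: Vec<String>,     // --media a,b: OR within the list
    doc_id: Option<String>, // --doc-id: exact match
}

struct Doc {
    id: String,
    tags: Vec<String>,
    media_kind: String, // e.g. "markdown", "pdf", "image"
}

/// Lowercase the value and map the `md` alias to `markdown`.
fn normalize_media(s: &str) -> String {
    match s.to_ascii_lowercase().as_str() {
        "md" => "markdown".to_string(),
        other => other.to_string(),
    }
}

/// An empty list or `None` means "no constraint". Constraints that are set
/// must all hold (AND); values inside one list are alternatives (OR).
fn matches(doc: &Doc, f: &Filters) -> bool {
    let tag_ok = f.tags_any.is_empty() || f.tags_any.iter().any(|t| doc.tags.contains(t));
    let media_ok =
        f.media.is_empty() || f.media.iter().any(|m| normalize_media(m) == doc.media_kind);
    let id_ok = f.doc_id.as_ref().map_or(true, |id| *id == doc.id);
    tag_ok && media_ok && id_ok
}
```

Under this sketch, a filter built from `--tag rust --tag async --media md` keeps a markdown doc tagged `rust` and drops an untagged one. A filter with `--media docx` keeps nothing, and no error is raised.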
2026-05-10 02:02:24 +00:00
parent a7115be699 84287d0ef6
commit a72c6f307c
19 changed files with 2946 additions and 11 deletions
--- a/README.md
+++ b/README.md
@@ -71,7 +71,7 @@ kebab doctor
 |------|------|
 | `kebab init` | Create the data directory + config.toml under the XDG paths |
 | `kebab ingest [<path>]` | Index Markdown / images / PDF (idempotent). On a TTY, a progress bar on stderr; on non-TTY (CI / pipe), one line at a time on stderr; with `--json`, `ingest_progress.v1` lines are streamed to stdout, followed by a final `ingest_report.v1`. One Ctrl-C finishes the current asset and then aborts (partial commits are preserved, re-runs are idempotent); a second Ctrl-C is a hard exit. If a Markdown title is missing from the frontmatter, it is filled in automatically in the order first H1 → H2 → first 80 characters of the first paragraph → filename (parser_version `md-frontmatter-v2`); docs that are already indexed also pick up the new title on the next ingest. **Incremental** (p9-fb-23): from the second ingest onward, parse/chunk/embed/vector upsert is skipped automatically for unchanged docs (blake3 and parser/chunker/embedder versions all identical). The final summary shows an `N unchanged` count. `--force-reingest` ignores the skip and forces reprocessing. **Supported formats** (the extractor is chosen automatically and cannot be set in config): Markdown (`.md`), images (`.png` / `.jpg` / `.jpeg`, OCR + caption), PDF (`.pdf`). Other extensions are skipped automatically: the reason is recorded in `IngestItem.warnings` (e.g. `"unsupported media type: .docx"`), the counts are classified in `IngestReport.skipped_by_extension`, and the CLI / TUI summary shows the breakdown. |
-| `kebab search --mode {lexical,vector,hybrid} "<query>" [--no-cache] [--max-tokens N] [--snippet-chars N] [--cursor <opaque>]` | Search. hybrid uses RRF fusion and includes citations. Repeating the same query (normalized with NFKC + trim + lowercase) within one process hits the in-process LRU cache (capacity = `[search] cache_capacity`, default 256). `--no-cache` forces a bypass, for debugging. An ingest commit bumps `kv['corpus_revision']`, which makes every entry stale automatically. **`--max-tokens` / `--snippet-chars` / `--cursor` (p9-fb-34)**: agent budget controls. `--json` output is the `search_response.v1` wrapper (`{hits, next_cursor, truncated}`), which is not compatible with the pre-fb-34 bare array. mismatched cursor → `error.v1.code = stale_cursor` |
+| `kebab search --mode {lexical,vector,hybrid} "<query>" [--no-cache] [--max-tokens N] [--snippet-chars N] [--cursor <opaque>] [--tag T] [--lang L] [--path-glob G] [--trust-min LEVEL] [--media TYPE] [--ingested-after RFC3339] [--doc-id ID]` | Search. hybrid uses RRF fusion and includes citations. Repeating the same query (normalized with NFKC + trim + lowercase) within one process hits the in-process LRU cache (capacity = `[search] cache_capacity`, default 256). `--no-cache` forces a bypass, for debugging. An ingest commit bumps `kv['corpus_revision']`, which makes every entry stale automatically. **`--max-tokens` / `--snippet-chars` / `--cursor` (p9-fb-34)**: agent budget controls. `--json` output is the `search_response.v1` wrapper (`{hits, next_cursor, truncated}`), which is not compatible with the pre-fb-34 bare array. mismatched cursor → `error.v1.code = stale_cursor`. **filter flags (p9-fb-36):** `--tag` is a repeatable flag (`--tag rust --tag async`) with OR matching, `--media` takes multiple `,`-separated values with OR matching, and the flags are AND-combined with one another. `--trust-min` is one of `primary\|secondary\|generated` (includes that level and above). `--ingested-after` is RFC3339 UTC; a parse failure yields `error.v1.code = config_invalid` (exit 2). `--media md` is normalized as an alias of `markdown`. An unknown `--media` value always yields empty hits (not an error). |
 | `kebab list docs` | List indexed documents |
 | `kebab inspect doc <id>` / `kebab inspect chunk <id>` | View the raw record |
 | `kebab fetch chunk <id> [--context N]` / `kebab fetch doc <id> [--max-tokens N]` / `kebab fetch span <doc_id> <ls> <le> [--max-tokens N]` | (p9-fb-35) verbatim text fetch from indexed corpus. wire = `fetch_result.v1` (kind discriminator). chunk: target + ±N ordinal-context chunks. doc: full normalized markdown. span: 1-based line range (PDF/audio rejected as `error.v1.code = span_not_supported`). chars/4 budget on doc/span. |
--- a/crates/kebab-cli/src/main.rs
+++ b/crates/kebab-cli/src/main.rs
@@ -131,6 +131,38 @@ enum Cmd {
        /// `corpus_revision` returns `error.v1.code = stale_cursor`.
        #[arg(long)]
        cursor: Option<String>,
+
+        /// p9-fb-36: filter by `metadata.tags`. Repeatable; OR-within (any tag).
+        #[arg(long)]
+        tag: Vec<String>,
+
+        /// p9-fb-36: filter by `documents.lang` (ISO code).
+        #[arg(long)]
+        lang: Option<String>,
+
+        /// p9-fb-36: filter by `documents.workspace_path` glob.
+        #[arg(long)]
+        path_glob: Option<String>,
+
+        /// p9-fb-36: filter by minimum `documents.trust_level`.
+        #[arg(long, value_enum)]
+        trust_min: Option<TrustLevelFlag>,
+
+        /// p9-fb-36: filter by `assets.media_type` kind. Comma-separated.
+        /// Aliases: `md` → `markdown`. Other accepted: `markdown`, `pdf`,
+        /// `image`, `audio`, `other`. Unknown values match nothing.
+        #[arg(long, value_delimiter = ',')]
+        media: Vec<String>,
+
+        /// p9-fb-36: filter to docs whose `updated_at` is >= this RFC3339
+        /// timestamp (UTC). Invalid format → exit 2 with error.v1
+        /// code = config_invalid.
+        #[arg(long)]
+        ingested_after: Option<String>,
+
+        /// p9-fb-36: filter to a single doc by id.
+        #[arg(long)]
+        doc_id: Option<String>,
    },

    /// Retrieval-augmented question answering.
@@ -351,6 +383,25 @@ impl From<ModeFlag> for kebab_core::SearchMode {
    }
 }

+/// p9-fb-36: clap value enum for `--trust-min`. Maps to
+/// `kebab_core::TrustLevel` via `From`.
+#[derive(clap::ValueEnum, Clone, Debug)]
+enum TrustLevelFlag {
+    Primary,
+    Secondary,
+    Generated,
+}
+
+impl From<TrustLevelFlag> for kebab_core::TrustLevel {
+    fn from(f: TrustLevelFlag) -> Self {
+        match f {
+            TrustLevelFlag::Primary => kebab_core::TrustLevel::Primary,
+            TrustLevelFlag::Secondary => kebab_core::TrustLevel::Secondary,
+            TrustLevelFlag::Generated => kebab_core::TrustLevel::Generated,
+        }
+    }
+}
+
 /// Parse boolean env var accepting "1", "true", "yes", "on" (case-insensitive)
 /// as truthy; "0", "false", "no", "off" as falsy. Used for `KEBAB_READONLY`.
 fn parse_bool_env(s: &str) -> Result<bool, String> {
@@ -611,13 +662,71 @@ fn run(cli: &Cli) -> anyhow::Result<()> {
            max_tokens,
            snippet_chars,
            cursor,
+            tag,
+            lang,
+            path_glob,
+            trust_min,
+            media,
+            ingested_after,
+            doc_id,
        } => {
            let cfg = kebab_config::Config::load(cli.config.as_deref())?;
+
+            // p9-fb-36: normalize --media aliases (md → markdown).
+            fn normalize_media_alias(s: &str) -> String {
+                match s.to_ascii_lowercase().as_str() {
+                    "md" => "markdown".to_string(),
+                    other => other.to_string(),
+                }
+            }
+            let media_norm: Vec<String> =
+                media.iter().map(|s| normalize_media_alias(s)).collect();
+
+            // p9-fb-36: parse --ingested-after as RFC3339; structured error on failure.
+            let ingested_after_parsed: Option<time::OffsetDateTime> =
+                match ingested_after.as_deref() {
+                    Some(s) => {
+                        match time::OffsetDateTime::parse(
+                            s,
+                            &time::format_description::well_known::Rfc3339,
+                        ) {
+                            Ok(ts) => Some(ts),
+                            Err(e) => {
+                                return Err(anyhow::Error::new(
+                                    kebab_app::StructuredError(kebab_app::ErrorV1 {
+                                        schema_version: kebab_app::ERROR_V1_ID.to_string(),
+                                        code: "config_invalid".to_string(),
+                                        message: format!(
+                                            "--ingested-after: invalid RFC3339 timestamp '{s}': {e}"
+                                        ),
+                                        details: serde_json::Value::Null,
+                                        hint: Some(
+                                            "expected format like 2026-04-01T00:00:00Z".to_string(),
+                                        ),
+                                    }),
+                                ));
+                            }
+                        }
+                    }
+                    None => None,
+                };
+
+            // p9-fb-36: build SearchFilters from the 7 new flags.
+            let filters = kebab_core::SearchFilters {
+                tags_any: tag.clone(),
+                lang: lang.as_ref().map(|s| kebab_core::Lang(s.clone())),
+                path_glob: path_glob.clone(),
+                trust_min: trust_min.clone().map(Into::into),
+                media: media_norm,
+                ingested_after: ingested_after_parsed,
+                doc_id: doc_id.as_ref().map(|s| kebab_core::DocumentId(s.clone())),
+            };
+
            let q = kebab_core::SearchQuery {
                text: query.clone(),
                mode: (*mode).into(),
                k: *k,
-                filters: kebab_core::SearchFilters::default(),
+                filters,
            };
            let opts = kebab_core::SearchOpts {
                max_tokens: *max_tokens,
--- a/crates/kebab-cli/tests/wire_search_filters.rs
+++ b/crates/kebab-cli/tests/wire_search_filters.rs
@@ -0,0 +1,306 @@
+//! p9-fb-36: CLI integration tests for search filter flags.
+//!
+//! Lexical-only — no fastembed / no Ollama. Each test builds its own
+//! TempDir KB via `common::write_config` + `common::ingest` and drives
+//! `kebab search` through `common::run_search_with_args` or direct
+//! `Command` invocations. Verifies:
+//!
+//! - `--doc-id <id>` restricts all returned hits to the target document.
+//! - `--ingested-after <bad>` exits non-zero and emits `error.v1` on
+//!   stderr with `code = "config_invalid"`.
+//! - `--media md` (alias) normalises to `markdown` and matches `.md` docs.
+//! - `--tag <tag>` (repeatable, OR-within) filters by frontmatter tags.
+
+mod common;
+
+use serde_json::Value;
+use std::fs;
+use std::process::Command;
+
+// ---------------------------------------------------------------------------
+// Test 1: --doc-id restricts hits to a single document
+// ---------------------------------------------------------------------------
+
+#[test]
+fn search_with_doc_id_filter_returns_only_target_doc() {
+    let dir = tempfile::tempdir().unwrap();
+    let (cfg, workspace, _data) = common::write_config(dir.path(), 30);
+
+    // Two docs that both contain the search term.
+    fs::write(workspace.join("a.md"), "# Alpha\n\nrust ownership rules\n").unwrap();
+    fs::write(workspace.join("b.md"), "# Beta\n\nrust borrow checker\n").unwrap();
+    common::ingest(&cfg, &workspace);
+
+    // First, search without a doc-id filter to find what doc_ids exist.
+    let (stdout, _) = common::run_search_with_args(
+        &cfg,
+        &["--json", "--mode", "lexical", "rust"],
+    );
+    let resp: Value = serde_json::from_str(stdout.trim())
+        .unwrap_or_else(|e| panic!("not JSON: {stdout:?}: {e}"));
+    let hits = resp["hits"].as_array().expect("hits array");
+    assert!(
+        hits.len() >= 2,
+        "expected ≥2 hits from two docs before filter: {resp}"
+    );
+
+    // Grab one doc_id from the results.
+    let target_doc_id = hits[0]["doc_id"]
+        .as_str()
+        .expect("doc_id string")
+        .to_string();
+
+    // Re-search with --doc-id set to the first hit's doc_id.
+    let (stdout2, _) = common::run_search_with_args(
+        &cfg,
+        &[
+            "--json",
+            "--mode",
+            "lexical",
+            "--doc-id",
+            &target_doc_id,
+            "rust",
+        ],
+    );
+    let resp2: Value = serde_json::from_str(stdout2.trim())
+        .unwrap_or_else(|e| panic!("not JSON after filter: {stdout2:?}: {e}"));
+    let filtered_hits = resp2["hits"].as_array().expect("hits array (filtered)");
+
+    assert!(
+        !filtered_hits.is_empty(),
+        "expected at least one hit for the target doc"
+    );
+    for hit in filtered_hits {
+        let got = hit["doc_id"].as_str().expect("doc_id string in hit");
+        assert_eq!(
+            got, target_doc_id,
+            "--doc-id filter must restrict all hits to target doc, got {got}"
+        );
+    }
+}
+
+// ---------------------------------------------------------------------------
+// Test 2: --ingested-after with bad RFC3339 → exit non-zero + error.v1
+// ---------------------------------------------------------------------------
+
+#[test]
+fn search_with_invalid_ingested_after_emits_config_invalid() {
+    let dir = tempfile::tempdir().unwrap();
+    let (cfg, workspace, _data) = common::write_config(dir.path(), 30);
+    fs::write(workspace.join("a.md"), "# T\n\nrust stuff\n").unwrap();
+    common::ingest(&cfg, &workspace);
+
+    let bin = env!("CARGO_BIN_EXE_kebab");
+    let out = Command::new(bin)
+        .args([
+            "--config",
+            cfg.to_str().unwrap(),
+            "--json",
+            "search",
+            "--mode",
+            "lexical",
+            "--ingested-after",
+            "not-a-date",
+            "rust",
+        ])
+        .output()
+        .expect("kebab search --ingested-after bad");
+
+    assert!(
+        !out.status.success(),
+        "expected non-zero exit for invalid --ingested-after, got: status={} stderr={}",
+        out.status,
+        String::from_utf8_lossy(&out.stderr)
+    );
+
+    let stderr = String::from_utf8_lossy(&out.stderr);
+    // Find the error.v1 ndjson line on stderr (one JSON event per line).
+    let err_line = stderr
+        .lines()
+        .find(|l| {
+            serde_json::from_str::<Value>(l)
+                .ok()
+                .and_then(|v| {
+                    v.get("schema_version")
+                        .and_then(|s| s.as_str())
+                        .map(String::from)
+                })
+                .as_deref()
+                == Some("error.v1")
+        })
+        .unwrap_or_else(|| panic!("no error.v1 line on stderr: {stderr:?}"));
+
+    let v: Value = serde_json::from_str(err_line).expect("error.v1 json");
+    assert_eq!(
+        v["code"], "config_invalid",
+        "code must be config_invalid for bad RFC3339: {err_line}"
+    );
+}
+
+// ---------------------------------------------------------------------------
+// Test 3: --media md (alias) normalises to markdown and matches .md docs
+// ---------------------------------------------------------------------------
+
+#[test]
+fn search_with_media_filter_md_alias_normalizes_to_markdown() {
+    let dir = tempfile::tempdir().unwrap();
+    let (cfg, workspace, _data) = common::write_config(dir.path(), 30);
+
+    // Only a markdown file — the `md` alias should match it.
+    fs::write(workspace.join("notes.md"), "# Notes\n\nrust async programming\n").unwrap();
+    common::ingest(&cfg, &workspace);
+
+    let (stdout, _) = common::run_search_with_args(
+        &cfg,
+        &["--json", "--mode", "lexical", "--media", "md", "rust"],
+    );
+    let resp: Value = serde_json::from_str(stdout.trim())
+        .unwrap_or_else(|e| panic!("not JSON: {stdout:?}: {e}"));
+    let hits = resp["hits"].as_array().expect("hits array");
+
+    assert!(
+        !hits.is_empty(),
+        "--media md must match the markdown doc; got 0 hits: {resp}"
+    );
+}
+
+// ---------------------------------------------------------------------------
+// Test 4: --tag (repeatable, OR-within) filters by frontmatter tags
+// ---------------------------------------------------------------------------
+
+#[test]
+fn search_with_tag_filter_matches_frontmatter_tags() {
+    let dir = tempfile::tempdir().unwrap();
+    let (cfg, workspace, _data) = common::write_config(dir.path(), 30);
+
+    // Doc with `rust` tag.
+    fs::write(
+        workspace.join("rust_doc.md"),
+        "---\ntags: [rust, systems]\n---\n# Rust\n\nrust ownership\n",
+    )
+    .unwrap();
+    // Doc without the tag (but same keyword in body so it appears in
+    // unfiltered results — the tag filter must exclude it).
+    fs::write(
+        workspace.join("other_doc.md"),
+        "# Other\n\nrust programming\n",
+    )
+    .unwrap();
+    common::ingest(&cfg, &workspace);
+
+    // Without filter — both docs must produce hits.
+    let (unfiltered, _) = common::run_search_with_args(
+        &cfg,
+        &["--json", "--mode", "lexical", "rust"],
+    );
+    let uresp: Value = serde_json::from_str(unfiltered.trim())
+        .unwrap_or_else(|e| panic!("not JSON (unfiltered): {unfiltered:?}: {e}"));
+    let uhits = uresp["hits"].as_array().expect("unfiltered hits array");
+    assert!(
+        uhits.len() >= 2,
+        "expected ≥2 hits before tag filter: {uresp}"
+    );
+
+    // With --tag rust — only the tagged doc's hits should appear.
+    let (filtered, _) = common::run_search_with_args(
+        &cfg,
+        &["--json", "--mode", "lexical", "--tag", "rust", "rust"],
+    );
+    let fresp: Value = serde_json::from_str(filtered.trim())
+        .unwrap_or_else(|e| panic!("not JSON (tag-filtered): {filtered:?}: {e}"));
+    let fhits = fresp["hits"].as_array().expect("filtered hits array");
+
+    assert!(
+        !fhits.is_empty(),
+        "--tag rust must match the tagged doc; got 0 hits: {fresp}"
+    );
+
+    // Every returned hit must come from rust_doc.md (the tagged file).
+    for hit in fhits {
+        let path = hit["doc_path"].as_str().unwrap_or("");
+        assert!(
+            path.ends_with("rust_doc.md"),
+            "--tag rust must only return hits from the tagged doc, got path={path}"
+        );
+    }
+}
+
+// ---------------------------------------------------------------------------
+// Test 5: --tag is repeatable (OR-within); two --tag values form an IN-list
+// ---------------------------------------------------------------------------
+
+#[test]
+fn search_with_two_tag_filters_returns_or_within_tags() {
+    // Three docs, two with different tag sets and one untagged:
+    //   a.md → tags: [rust]
+    //   b.md → tags: [async]
+    //   c.md → no tags (but same keyword in body)
+    // Search with --tag rust --tag async (OR within --tag).
+    // Expect a.md and b.md, not c.md.
+    let dir = tempfile::tempdir().unwrap();
+    let (cfg, workspace, _data) = common::write_config(dir.path(), 30);
+
+    fs::write(
+        workspace.join("a.md"),
+        "---\ntags: [rust]\n---\n# A\n\nrust systems programming\n",
+    )
+    .unwrap();
+    fs::write(
+        workspace.join("b.md"),
+        "---\ntags: [async]\n---\n# B\n\nrust async programming\n",
+    )
+    .unwrap();
+    fs::write(workspace.join("c.md"), "# C\n\nrust programming\n").unwrap();
+    common::ingest(&cfg, &workspace);
+
+    // Without filter: all three docs produce hits.
+    let (unfiltered, _) = common::run_search_with_args(
+        &cfg,
+        &["--json", "--mode", "lexical", "rust"],
+    );
+    let uresp: Value = serde_json::from_str(unfiltered.trim())
+        .unwrap_or_else(|e| panic!("not JSON (unfiltered): {unfiltered:?}: {e}"));
+    let uhits = uresp["hits"].as_array().expect("unfiltered hits array");
+    assert!(
+        uhits.len() >= 3,
+        "expected ≥3 hits before tag filter: {uresp}"
+    );
+
+    // With --tag rust --tag async: only a.md and b.md should appear.
+    let (filtered, _) = common::run_search_with_args(
+        &cfg,
+        &[
+            "--json", "--mode", "lexical",
+            "--tag", "rust",
+            "--tag", "async",
+            "rust",
+        ],
+    );
+    let fresp: Value = serde_json::from_str(filtered.trim())
+        .unwrap_or_else(|e| panic!("not JSON (two-tag-filtered): {filtered:?}: {e}"));
+    let fhits = fresp["hits"].as_array().expect("filtered hits array");
+
+    assert!(
+        !fhits.is_empty(),
+        "--tag rust --tag async must return hits from tagged docs; got 0: {fresp}"
+    );
+
+    // c.md must not appear — it has no tags.
+    for hit in fhits {
+        let path = hit["doc_path"].as_str().unwrap_or("");
+        assert!(
+            path.ends_with("a.md") || path.ends_with("b.md"),
+            "--tag rust --tag async must only return a.md or b.md, got path={path}"
+        );
+    }
+
+    // Both a.md and b.md must appear (OR, not AND).
+    let paths: Vec<&str> = fhits
+        .iter()
+        .filter_map(|h| h["doc_path"].as_str())
+        .collect();
+    let has_a = paths.iter().any(|p| p.ends_with("a.md"));
+    let has_b = paths.iter().any(|p| p.ends_with("b.md"));
+    assert!(has_a, "--tag rust must include a.md (rust-tagged): paths={paths:?}");
+    assert!(has_b, "--tag async must include b.md (async-tagged): paths={paths:?}");
+}
--- a/crates/kebab-core/src/search.rs
+++ b/crates/kebab-core/src/search.rs
@@ -26,12 +26,30 @@ pub struct SearchQuery {
    pub filters: SearchFilters,
 }

+/// p9-fb-36: canonical kind labels for `SearchFilters.media`. Mirrors
+/// `MediaType` variant tags; CLI / MCP normalize aliases (`md` → `markdown`)
+/// before populating this Vec.
+pub const MEDIA_KINDS: &[&str] = &["markdown", "pdf", "image", "audio", "other"];
+
 #[derive(Clone, Debug, Default, PartialEq, Serialize, Deserialize)]
 pub struct SearchFilters {
    pub tags_any: Vec<String>,
    pub lang: Option<Lang>,
    pub path_glob: Option<String>,
    pub trust_min: Option<TrustLevel>,
+    /// p9-fb-36: media_type filter — IN-list of `MediaType.kind`
+    /// strings (`"markdown"`, `"pdf"`, `"image"`, `"audio"`, `"other"`).
+    /// Empty Vec = no filter. Match is on the variant tag only;
+    /// e.g. `["image"]` matches `Image(Png)` and `Image(Jpeg)`.
+    #[serde(default)]
+    pub media: Vec<String>,
+    /// p9-fb-36: hits whose source doc's `documents.updated_at` is at
+    /// or after this timestamp. None = no filter. RFC3339 / UTC.
+    #[serde(default, with = "time::serde::rfc3339::option")]
+    pub ingested_after: Option<OffsetDateTime>,
+    /// p9-fb-36: restrict hits to a single document. None = no filter.
+    #[serde(default)]
+    pub doc_id: Option<DocumentId>,
 }

 #[derive(Clone, Debug, PartialEq, Serialize, Deserialize)]
@@ -155,4 +173,24 @@ mod tests {
        assert!(opts.snippet_chars.is_none());
        assert!(opts.cursor.is_none());
    }
+
+    #[test]
+    fn search_filters_default_includes_new_fb36_fields() {
+        let f = SearchFilters::default();
+        assert!(f.media.is_empty(), "media default empty");
+        assert!(f.ingested_after.is_none(), "ingested_after default None");
+        assert!(f.doc_id.is_none(), "doc_id default None");
+        assert!(f.tags_any.is_empty());
+        assert!(f.lang.is_none());
+        assert!(f.path_glob.is_none());
+        assert!(f.trust_min.is_none());
+    }
+
+    #[test]
+    fn search_filters_serialize_with_serde_default_compat() {
+        let old: SearchFilters = serde_json::from_str(r#"{"tags_any":[],"lang":null,"path_glob":null,"trust_min":null}"#).unwrap();
+        assert!(old.media.is_empty());
+        assert!(old.ingested_after.is_none());
+        assert!(old.doc_id.is_none());
+    }
 }
--- a/crates/kebab-mcp/Cargo.toml
+++ b/crates/kebab-mcp/Cargo.toml
@@ -19,6 +19,8 @@ tracing     = { workspace = true }
 # /dependencies endpoint — rmcp declares optional schemars = "^1.0").
 schemars    = "1"

+time         = { workspace = true }
+
 kebab-app    = { path = "../kebab-app" }
 kebab-config = { path = "../kebab-config" }
 kebab-core   = { path = "../kebab-core" }
--- a/crates/kebab-mcp/src/tools/search.rs
+++ b/crates/kebab-mcp/src/tools/search.rs
@@ -1,5 +1,7 @@
 //! `search` tool — wraps `kebab_app::search_with_opts_with_config`.
-//! Input: { query, mode?, k?, max_tokens?, snippet_chars?, cursor? }.
+//! Input: { query, mode?, k?, max_tokens?, snippet_chars?, cursor?,
+//!          tags?, lang?, path_glob?, trust_min?, media?,
+//!          ingested_after?, doc_id? }.
 //! Output: search_response.v1 envelope (hits + next_cursor + truncated).
 //!
 //! First tool with a non-empty `inputSchema`: `SearchInput` derives
@@ -10,6 +12,8 @@ use rmcp::model::CallToolResult;
 use schemars::JsonSchema;
 use serde::{Deserialize, Serialize};

+use kebab_app::ERROR_V1_ID;
+
 use crate::error::{to_tool_error, to_tool_success};
 use crate::state::KebabAppState;

@@ -27,6 +31,22 @@ pub struct SearchInput {
    pub snippet_chars: Option<usize>,
    /// p9-fb-34: opaque cursor from a previous response.
    pub cursor: Option<String>,
+    /// p9-fb-36: filter by `metadata.tags` (OR-within).
+    pub tags: Option<Vec<String>>,
+    /// p9-fb-36: filter by `documents.lang` (ISO code).
+    pub lang: Option<String>,
+    /// p9-fb-36: filter by `documents.workspace_path` glob.
+    pub path_glob: Option<String>,
+    /// p9-fb-36: filter by minimum `documents.trust_level`.
+    /// Accepts: `"primary"`, `"secondary"`, `"generated"`.
+    pub trust_min: Option<String>,
+    /// p9-fb-36: filter by `assets.media_type` kind. IN-list. Accepts:
+    /// `"markdown"`, `"pdf"`, `"image"`, `"audio"`, `"other"`. Aliases: `md` → `markdown`.
+    pub media: Option<Vec<String>>,
+    /// p9-fb-36: RFC3339 UTC timestamp. Invalid format → invalid_input.
+    pub ingested_after: Option<String>,
+    /// p9-fb-36: filter to a single doc.
+    pub doc_id: Option<String>,
 }

 pub fn handle(state: &KebabAppState, input: SearchInput) -> CallToolResult {
@@ -37,11 +57,62 @@ pub fn handle(state: &KebabAppState, input: SearchInput) -> CallToolResult {
        "vector" => kebab_core::SearchMode::Vector,
        _ => kebab_core::SearchMode::Hybrid,
    };
+
+    // p9-fb-36: parse filter inputs, returning invalid_input on bad values.
+    let trust_min = match input.trust_min.as_deref() {
+        Some(s) => match s.to_ascii_lowercase().as_str() {
+            "primary" => Some(kebab_core::TrustLevel::Primary),
+            "secondary" => Some(kebab_core::TrustLevel::Secondary),
+            "generated" => Some(kebab_core::TrustLevel::Generated),
+            other => {
+                return invalid_input(&format!(
+                    "trust_min: unknown level '{other}'; expected primary|secondary|generated"
+                ));
+            }
+        },
+        None => None,
+    };
+
+    let ingested_after = match input.ingested_after.as_deref() {
+        Some(s) => {
+            match time::OffsetDateTime::parse(
+                s,
+                &time::format_description::well_known::Rfc3339,
+            ) {
+                Ok(ts) => Some(ts),
+                Err(e) => {
+                    return invalid_input(&format!(
+                        "ingested_after: invalid RFC3339 '{s}': {e}"
+                    ));
+                }
+            }
+        }
+        None => None,
+    };
+
+    let media: Vec<String> = input
+        .media
+        .clone()
+        .unwrap_or_default()
+        .iter()
+        .map(|s| normalize_media_alias(s))
+        .collect();
+
+    let filters = kebab_core::SearchFilters {
+        tags_any: input.tags.clone().unwrap_or_default(),
+        lang: input.lang.clone().map(kebab_core::Lang),
+        path_glob: input.path_glob.clone(),
+        trust_min,
+        media,
+        ingested_after,
+        doc_id: input.doc_id.clone().map(kebab_core::DocumentId),
+    };
+
    let query = kebab_core::SearchQuery {
        text: input.query,
        mode,
        k,
-        filters: kebab_core::SearchFilters::default(),
+        filters,
    };
    let opts = kebab_core::SearchOpts {
        max_tokens: input.max_tokens,
@@ -81,3 +152,22 @@ pub fn handle(state: &KebabAppState, input: SearchInput) -> CallToolResult {
        Err(e) => to_tool_error(&e),
    }
 }
+
+fn normalize_media_alias(s: &str) -> String {
+    match s.to_ascii_lowercase().as_str() {
+        "md" => "markdown".to_string(),
+        other => other.to_string(),
+    }
+}
+
+fn invalid_input(msg: &str) -> CallToolResult {
+    use kebab_app::{ErrorV1, StructuredError};
+    let err = anyhow::Error::new(StructuredError(ErrorV1 {
+        schema_version: ERROR_V1_ID.to_string(),
+        code: "invalid_input".to_string(),
+        message: msg.to_string(),
+        details: serde_json::Value::Null,
+        hint: None,
+    }));
+    to_tool_error(&err)
+}
--- a/crates/kebab-mcp/tests/tools_call_fetch.rs
+++ b/crates/kebab-mcp/tests/tools_call_fetch.rs
@@ -62,6 +62,13 @@ async fn fetch_tool_chunk_returns_fetch_result_v1() {
            max_tokens: None,
            snippet_chars: None,
            cursor: None,
+            tags: None,
+            lang: None,
+            path_glob: None,
+            trust_min: None,
+            media: None,
+            ingested_after: None,
+            doc_id: None,
        },
    );
    let search_text = match &search_result.content.first().unwrap().raw {
--- a/crates/kebab-mcp/tests/tools_call_search.rs
+++ b/crates/kebab-mcp/tests/tools_call_search.rs
@@ -58,6 +58,13 @@ async fn search_tool_returns_search_response_v1() {
            max_tokens: None,
            snippet_chars: None,
            cursor: None,
+            tags: None,
+            lang: None,
+            path_glob: None,
+            trust_min: None,
+            media: None,
+            ingested_after: None,
+            doc_id: None,
        },
    );

@@ -108,3 +115,175 @@ async fn search_tool_returns_search_response_v1() {
        "envelope should carry next_cursor (possibly null)"
    );
 }
+
+/// p9-fb-36: search with doc_id filter — only hits from the target doc.
+#[tokio::test]
+async fn search_with_doc_id_filter_returns_only_target() {
+    let dir = tempfile::tempdir().unwrap();
+    let data_dir = dir.path().join("data");
+    let workspace_root = dir.path().join("notes");
+    fs::create_dir_all(&data_dir).unwrap();
+    fs::create_dir_all(&workspace_root).unwrap();
+
+    let config = minimal_config(&data_dir, &workspace_root);
+
+    // Write two markdown documents, both containing the query term.
+    fs::write(
+        workspace_root.join("a.md"),
+        "# Alpha\n\nThis document mentions kebab and flatbread.",
+    )
+    .unwrap();
+    fs::write(
+        workspace_root.join("b.md"),
+        "# Beta\n\nAnother document about kebab wraps and fillings.",
+    )
+    .unwrap();
+
+    let scope = SourceScope {
+        root: workspace_root.clone(),
+        include: vec![],
+        exclude: vec![],
+    };
+    let _ = kebab_app::ingest_with_config(config.clone(), scope, false).unwrap();
+
+    let state = KebabAppState::new(config, None);
+    let handler = KebabHandler::new(state);
+
+    // First: unfiltered search to discover a doc_id from one of the docs.
+    let unfiltered = kebab_mcp::tools::search::handle(
+        handler.state(),
+        kebab_mcp::tools::search::SearchInput {
+            query: "kebab".to_string(),
+            mode: Some("lexical".to_string()),
+            k: Some(10),
+            max_tokens: None,
+            snippet_chars: None,
+            cursor: None,
+            tags: None,
+            lang: None,
+            path_glob: None,
+            trust_min: None,
+            media: None,
+            ingested_after: None,
+            doc_id: None,
+        },
+    );
+    assert!(
+        !unfiltered.is_error.unwrap_or(false),
+        "unfiltered search failed: {:?}",
+        unfiltered
+    );
+    let unfiltered_text = match &unfiltered.content.first().unwrap().raw {
+        RawContent::Text(t) => t.text.clone(),
+        other => panic!("expected text content, got {other:?}"),
+    };
+    let unfiltered_v: serde_json::Value = serde_json::from_str(&unfiltered_text).unwrap();
+    let hits = unfiltered_v["hits"].as_array().expect("hits must be array");
+    assert!(hits.len() >= 2, "expected hits from both docs");
+
+    // Pick the doc_id of the first hit.
+    let target_doc_id = hits[0]["doc_id"]
+        .as_str()
+        .expect("doc_id on first hit")
+        .to_string();
+
+    // Now search with doc_id filter — all results must belong to that doc.
+    let filtered = kebab_mcp::tools::search::handle(
+        handler.state(),
+        kebab_mcp::tools::search::SearchInput {
+            query: "kebab".to_string(),
+            mode: Some("lexical".to_string()),
+            k: Some(10),
+            max_tokens: None,
+            snippet_chars: None,
+            cursor: None,
+            tags: None,
+            lang: None,
+            path_glob: None,
+            trust_min: None,
+            media: None,
+            ingested_after: None,
+            doc_id: Some(target_doc_id.clone()),
+        },
+    );
+    assert!(
+        !filtered.is_error.unwrap_or(false),
+        "filtered search failed: {:?}",
+        filtered
+    );
+    let filtered_text = match &filtered.content.first().unwrap().raw {
+        RawContent::Text(t) => t.text.clone(),
+        other => panic!("expected text content, got {other:?}"),
+    };
+    let filtered_v: serde_json::Value = serde_json::from_str(&filtered_text).unwrap();
+    let filtered_hits = filtered_v["hits"].as_array().expect("hits must be array");
+
+    assert!(
+        !filtered_hits.is_empty(),
+        "expected at least one hit for target doc"
+    );
+    for hit in filtered_hits {
+        assert_eq!(
+            hit["doc_id"].as_str(),
+            Some(target_doc_id.as_str()),
+            "all filtered hits must belong to the target doc"
+        );
+    }
+}
+
+/// p9-fb-36: invalid RFC3339 for ingested_after → invalid_input error.v1.
+#[tokio::test]
+async fn search_with_invalid_ingested_after_returns_invalid_input() {
+    let dir = tempfile::tempdir().unwrap();
+    let data_dir = dir.path().join("data");
+    let workspace_root = dir.path().join("notes");
+    fs::create_dir_all(&data_dir).unwrap();
+    fs::create_dir_all(&workspace_root).unwrap();
+
+    let config = minimal_config(&data_dir, &workspace_root);
+    let state = KebabAppState::new(config, None);
+    let handler = KebabHandler::new(state);
+
+    let result = kebab_mcp::tools::search::handle(
+        handler.state(),
+        kebab_mcp::tools::search::SearchInput {
+            query: "kebab".to_string(),
+            mode: None,
+            k: None,
+            max_tokens: None,
+            snippet_chars: None,
+            cursor: None,
+            tags: None,
+            lang: None,
+            path_glob: None,
+            trust_min: None,
+            media: None,
+            ingested_after: Some("garbage".to_string()),
+            doc_id: None,
+        },
+    );
+
+    assert!(
+        result.is_error.unwrap_or(false),
+        "expected isError=true for invalid ingested_after"
+    );
+    let content = result
+        .content
+        .first()
+        .expect("expected at least one content item");
+    let text = match &content.raw {
+        RawContent::Text(t) => &t.text,
+        other => panic!("expected text content, got {other:?}"),
+    };
+    let v: serde_json::Value = serde_json::from_str(text).unwrap();
+    assert_eq!(
+        v.get("schema_version").and_then(|s| s.as_str()),
+        Some("error.v1"),
+        "must carry error.v1 envelope"
+    );
+    assert_eq!(
+        v.get("code").and_then(|s| s.as_str()),
+        Some("invalid_input"),
+        "code must be invalid_input for bad RFC3339"
+    );
+}
--- a/crates/kebab-search/src/lexical.rs
+++ b/crates/kebab-search/src/lexical.rs
@@ -319,6 +319,54 @@ fn run_query(
        };
        params.push(Box::new(rank));
    }
+    // p9-fb-36: media_type filter (IN-list).
+    // `assets.media_type` JSON has two shapes:
+    //   - unit variant (Markdown / Pdf): JSON text, e.g. `"markdown"`
+    //   - tuple variant (Image(Png) / Audio(Mp3) / Other(s)): JSON object,
+    //     e.g. `{"image": "png"}`
+    // Extract a unified "kind" string for both shapes via:
+    //   CASE WHEN json_type = 'text' THEN json_extract($)
+    //        ELSE (first object key)
+    //   END IN (?, ...)
+    if !filters.media.is_empty() {
+        let placeholders: Vec<&str> =
+            std::iter::repeat_n("?", filters.media.len()).collect();
+        let placeholders = placeholders.join(",");
+        sql.push_str(&format!(
+            " AND f.doc_id IN (\
+               SELECT d2.doc_id FROM documents d2 \
+               JOIN assets a ON a.asset_id = d2.asset_id \
+               WHERE CASE \
+                 WHEN json_type(a.media_type) = 'text' THEN json_extract(a.media_type, '$') \
+                 ELSE (SELECT key FROM json_each(a.media_type) LIMIT 1) \
+               END IN ({placeholders}))"
+        ));
+        for kind in &filters.media {
+            params.push(Box::new(kind.clone()));
+        }
+    }
+
+    // p9-fb-36: ingested_after filter.
+    // `documents.updated_at` is RFC3339 stored as TEXT (always UTC `Z` per
+    // fb-32 ingest path), so lexicographic >= compare is correct, but only
+    // when the filter instant is also formatted as UTC `Z`. A non-UTC offset
+    // (e.g. `+09:00`) keeps local clock digits and `+` sorts before `Z`
+    // (0x2B < 0x5A), producing wrong results. Convert to UTC before formatting.
+    if let Some(after) = &filters.ingested_after {
+        let formatted = after
+            .to_offset(time::UtcOffset::UTC)
+            .format(&time::format_description::well_known::Rfc3339)
+            .expect("OffsetDateTime (UTC) formats to RFC3339");
+        sql.push_str(" AND d.updated_at >= ?");
+        params.push(Box::new(formatted));
+    }
+
+    // p9-fb-36: doc_id filter — single-doc scoping.
+    if let Some(id) = &filters.doc_id {
+        sql.push_str(" AND d.doc_id = ?");
+        params.push(Box::new(id.0.clone()));
+    }
+
    // path_glob is intentionally NOT applied here — see module comment
    // on PATH_GLOB_OVERFETCH and the post-filter in `LexicalRetriever::search`.

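The media filter above reduces both JSON shapes of `assets.media_type` to one "kind" string before the `IN (...)` comparison. A minimal std-only Rust sketch of the same reduction, for illustration only: `media_kind` and `in_list_placeholders` are hypothetical helper names that do not exist in the kebab crates, the real filter does this work in SQL, and the hand-rolled scan assumes the kind contains no escaped quotes.

```rust
/// Hypothetical helper (not part of the kebab crates): reduce a serialized
/// media type to its "kind" string, mirroring the SQL CASE expression.
///   `"markdown"`       -> Some("markdown")  (JSON text)
///   `{"image": "png"}` -> Some("image")     (first object key)
/// Assumes no escaped quotes inside the kind; this is not a JSON parser.
fn media_kind(json: &str) -> Option<&str> {
    let s = json.trim();
    if !(s.starts_with('"') || s.starts_with('{')) {
        return None;
    }
    // In both shapes the kind is the first double-quoted string.
    let start = s.find('"')? + 1;
    let len = s[start..].find('"')?;
    Some(&s[start..start + len])
}

/// Hypothetical helper: build the `?,?,...` placeholder list for an
/// `IN (...)` clause with `n` bound parameters.
fn in_list_placeholders(n: usize) -> String {
    vec!["?"; n].join(",")
}

/// Hypothetical helper: true when the row's kind is one of the requested
/// kinds. An unknown kind matches nothing, so the result set is simply empty.
fn keeps_row(media_type_json: &str, wanted: &[&str]) -> bool {
    media_kind(media_type_json).map_or(false, |k| wanted.contains(&k))
}
```

With `wanted = ["pdf", "image"]`, `keeps_row` accepts `"pdf"` and `{"image": "png"}` and rejects `"markdown"`, which is the behavior `lexical_filter_by_media` and `lexical_filter_unknown_media_returns_empty` exercise against the SQL version.
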
--- a/crates/kebab-search/tests/common/mod.rs
+++ b/crates/kebab-search/tests/common/mod.rs
@@ -19,7 +19,9 @@ use std::sync::Arc;
 use kebab_config::Config;
 use kebab_core::{
    ChunkId, DocumentId, EmbeddingId, EmbeddingInput, EmbeddingKind,
-    EmbeddingModelId, EmbeddingVersion, IndexVersion, VectorRecord, VectorStore,
+    EmbeddingModelId, EmbeddingVersion, IndexVersion, MediaType,
+    Retriever, SearchFilters, SearchHit, SearchMode, SearchQuery,
+    VectorRecord, VectorStore,
 };
 use kebab_embed::{Embedder, MockEmbedder};
 use kebab_search::{LexicalRetriever, VectorRetriever};
@@ -173,6 +175,93 @@ impl HybridEnv {
        .unwrap();
    }

+    /// High-level helper: seed a doc with the default media type
+    /// (Markdown) and embed its text. Returns the `DocumentId` so
+    /// callers can use it in `doc_id` filter tests.
+    pub fn insert_doc(&self, path: &str, text: &str) -> DocumentId {
+        self.insert_doc_with_media(path, text, MediaType::Markdown)
+    }
+
+    /// High-level helper: seed a doc with an explicit `MediaType`.
+    /// The `media_type` is serialized to JSON (mirrors how
+    /// `DocumentStore::put_document` writes it) and stored in `assets`.
+    pub fn insert_doc_with_media(
+        &self,
+        path: &str,
+        text: &str,
+        media: MediaType,
+    ) -> DocumentId {
+        // Derive deterministic IDs from the path so repeated calls with
+        // the same path are idempotent (INSERT OR IGNORE).
+        let path_hash: String = {
+            use std::collections::hash_map::DefaultHasher;
+            use std::hash::{Hash, Hasher};
+            let mut h = DefaultHasher::new();
+            path.hash(&mut h);
+            format!("{:032x}", h.finish())
+        };
+        let doc_id = format!("d{}", &path_hash[..31]);
+        let chunk_id = format!("c{}", &path_hash[..31]);
+        let asset_id = format!("a{}", &path_hash[..31]);
+
+        let media_json = serde_json::to_string(&media).expect("serialize MediaType");
+        let conn = self.sqlite.read_conn();
+        conn.execute(
+            "INSERT OR IGNORE INTO assets (
+                asset_id, source_uri, workspace_path, media_type, byte_len,
+                checksum, storage_kind, storage_path, discovered_at
+             ) VALUES (?, ?, ?, ?, 0,
+                       'deadbeefdeadbeefdeadbeefdeadbeef',
+                       'reference', ?, '1970-01-01T00:00:00Z')",
+            params![
+                asset_id,
+                format!("file:///{path}"),
+                path,
+                media_json,
+                path,
+            ],
+        )
+        .unwrap();
+        conn.execute(
+            "INSERT OR IGNORE INTO documents (
+                doc_id, asset_id, workspace_path, title, lang, source_type,
+                trust_level, parser_version, doc_version, schema_version,
+                metadata_json, provenance_json, created_at, updated_at
+             ) VALUES (?, ?, ?, NULL, 'en', 'markdown', 'primary', 'v1', 1, 1,
+                       '{}', '{}', '1970-01-01T00:00:00Z', '1970-01-01T00:00:00Z')",
+            params![doc_id, asset_id, path],
+        )
+        .unwrap();
+        let heading_json = "[]";
+        conn.execute(
+            "INSERT OR IGNORE INTO chunks (
+                chunk_id, doc_id, text, heading_path_json, section_label,
+                source_spans_json, token_estimate, chunker_version,
+                policy_hash, block_ids_json, created_at
+             ) VALUES (?, ?, ?, ?, NULL,
+                       '[{\"kind\":\"line\",\"start\":1,\"end\":1}]',
+                       1, 'v1', 'h', '[]', '1970-01-01T00:00:00Z')",
+            params![chunk_id, doc_id, text, heading_json],
+        )
+        .unwrap();
+        drop(conn);
+        self.embed_and_upsert(&chunk_id, &doc_id, text, &[]);
+        DocumentId(doc_id)
+    }
+
+    /// Run a `SearchMode::Vector` query against the seeded corpus and
+    /// return the resulting `Vec<SearchHit>`.
+    pub fn run_vector_search(&self, query: &str, filters: &SearchFilters) -> Vec<SearchHit> {
+        let r = self.vector_retriever();
+        let q = SearchQuery {
+            text: query.to_string(),
+            mode: SearchMode::Vector,
+            k: 10,
+            filters: filters.clone(),
+        };
+        r.search(&q).expect("vector search")
+    }
+
    /// Embed `text` as a Document and upsert it as the embedding for
    /// `chunk_id`. Drives the same code path production uses:
    /// MockEmbedder → VectorRecord → LanceVectorStore::upsert →
--- a/crates/kebab-search/tests/hybrid.rs
+++ b/crates/kebab-search/tests/hybrid.rs
@@ -15,7 +15,7 @@ use common::{
    HybridEnv, id32, require_avx_or_panic, TEST_LEX_INDEX_VERSION, TEST_VEC_INDEX_VERSION,
 };
 use kebab_core::{
-    Retriever, SearchFilters, SearchHit, SearchMode, SearchQuery,
+    MediaType, Retriever, SearchFilters, SearchHit, SearchMode, SearchQuery,
 };
 use kebab_search::{FusionPolicy, HybridRetriever};
 use rusqlite::params;
@@ -213,6 +213,57 @@ fn hybrid_snapshot_run_1() {
    }
 }

+/// p9-fb-36: vector post-filter must pass `media` through `filter_chunks`.
+/// Seeding two docs (markdown + pdf) and filtering for pdf-only must
+/// return only the pdf chunk, proving `LanceVectorStore::search` →
+/// `SqliteStore::filter_chunks` correctly applies the media arm.
+#[test]
+#[ignore = "requires AVX-capable hardware (LanceDB)"]
+fn vector_filter_by_media() {
+    require_avx_or_panic();
+    let env = HybridEnv::new();
+    env.insert_doc_with_media("md1.md", "rust ownership", MediaType::Markdown);
+    env.insert_doc_with_media("doc.pdf", "rust pdf body", MediaType::Pdf);
+
+    let filters = SearchFilters {
+        media: vec!["pdf".to_string()],
+        ..Default::default()
+    };
+    let hits = env.run_vector_search("rust", &filters);
+    assert_eq!(hits.len(), 1, "media filter must keep only pdf chunk");
+    assert!(
+        hits[0].doc_path.0.ends_with(".pdf"),
+        "expected .pdf path, got: {}",
+        hits[0].doc_path.0
+    );
+}
+
+/// p9-fb-36: vector post-filter must pass `doc_id` through `filter_chunks`.
+/// Seeding two docs with shared text, filtering by one doc_id must return
+/// only chunks from that doc.
+#[test]
+#[ignore = "requires AVX-capable hardware (LanceDB)"]
+fn vector_filter_by_doc_id() {
+    require_avx_or_panic();
+    let env = HybridEnv::new();
+    let target = env.insert_doc("a.md", "shared knowledge");
+    env.insert_doc("b.md", "shared knowledge");
+
+    let filters = SearchFilters {
+        doc_id: Some(target.clone()),
+        ..Default::default()
+    };
+    let hits = env.run_vector_search("shared", &filters);
+    assert!(
+        !hits.is_empty(),
+        "doc_id filter must return hits for the target doc"
+    );
+    assert!(
+        hits.iter().all(|h| h.doc_id == target),
+        "all hits must belong to the target doc_id"
+    );
+}
+
 #[test]
 #[ignore = "requires AVX-capable hardware (LanceDB)"]
 fn vector_hit_carries_indexed_at() {
--- a/crates/kebab-search/tests/lexical.rs
+++ b/crates/kebab-search/tests/lexical.rs
@@ -8,11 +8,15 @@
 use std::sync::Arc;

 use kebab_config::Config;
-use kebab_core::{IndexVersion, Lang, Retriever, SearchFilters, SearchMode, SearchQuery, TrustLevel};
+use kebab_core::{
+    DocumentId, IndexVersion, Lang, MediaType, Retriever, SearchFilters, SearchHit, SearchMode,
+    SearchQuery, TrustLevel,
+};
 use kebab_search::LexicalRetriever;
 use kebab_store_sqlite::SqliteStore;
 use rusqlite::Connection;
 use tempfile::TempDir;
+use time::OffsetDateTime;

 // ── Test scaffolding ─────────────────────────────────────────────────────

@@ -679,6 +683,210 @@ fn search_hit_carries_indexed_at_from_documents_updated_at() {
    assert!(!hit.stale, "lexical retriever must default stale=false");
 }

+// ── TestEnv helper for fb-36 filter tests ───────────────────────────────
+
+/// Convenience wrapper over `Env` that exposes higher-level fixture helpers
+/// for the fb-36 filter tests.  Intentionally kept separate from `Env` so
+/// the original tests are untouched.
+struct TestEnv {
+    inner: Env,
+    counter: std::cell::Cell<u32>,
+}
+
+impl TestEnv {
+    fn new() -> Self {
+        Self {
+            inner: Env::new(),
+            counter: std::cell::Cell::new(0),
+        }
+    }
+
+    /// Allocate a fresh monotone counter suffix so every inserted doc / chunk
+    /// gets a unique 32-hex ID without the caller worrying about collisions.
+    fn next_id(&self, prefix: &str) -> String {
+        let n = self.counter.get();
+        self.counter.set(n + 1);
+        let suffix = format!("{prefix}{n:04}");
+        id32(&suffix)
+    }
+
+    /// Insert a markdown doc with the given `body` and return its `DocumentId`.
+    fn insert_doc(&self, path: &str, body: &str) -> DocumentId {
+        self.insert_doc_with_media(path, body, MediaType::Markdown)
+    }
+
+    /// Insert a doc whose `assets.media_type` JSON is set to the serialized
+    /// form of `media`.  The `documents.updated_at` defaults to now.
+    fn insert_doc_with_media(&self, path: &str, body: &str, media: MediaType) -> DocumentId {
+        self.insert_doc_full(path, body, media, OffsetDateTime::now_utc())
+    }
+
+    /// Insert a doc with an explicit `updated_at` timestamp (for
+    /// `ingested_after` filter tests).
+    fn insert_doc_with_updated_at(
+        &self,
+        path: &str,
+        body: &str,
+        updated_at: OffsetDateTime,
+    ) -> DocumentId {
+        self.insert_doc_full(path, body, MediaType::Markdown, updated_at)
+    }
+
+    fn insert_doc_full(
+        &self,
+        path: &str,
+        body: &str,
+        media: MediaType,
+        updated_at: OffsetDateTime,
+    ) -> DocumentId {
+        use time::format_description::well_known::Rfc3339;
+        let doc_id = self.next_id("doc");
+        let chunk_id = self.next_id("chk");
+        let asset_id = self.next_id("ast");
+        let media_json = serde_json::to_string(&media).expect("serialize MediaType");
+        let updated_at_str = updated_at.format(&Rfc3339).expect("format updated_at");
+
+        let conn = self.inner.raw_conn();
+        conn.execute(
+            "INSERT OR IGNORE INTO assets (
+                asset_id, source_uri, workspace_path, media_type, byte_len,
+                checksum, storage_kind, storage_path, discovered_at
+            ) VALUES (?, ?, ?, ?, 0,
+                      'd0', 'reference', ?, '2024-01-01T00:00:00Z')",
+            rusqlite::params![asset_id, format!("file:///{path}"), path, media_json, path],
+        )
+        .expect("insert asset");
+
+        conn.execute(
+            "INSERT INTO documents (
+                doc_id, asset_id, workspace_path, title, lang,
+                source_type, trust_level, parser_version,
+                doc_version, schema_version, metadata_json,
+                provenance_json, created_at, updated_at
+            ) VALUES (?, ?, ?, NULL, 'en', 'markdown', 'primary', 'pv1', 1, 1,
+                      '{}', '{\"events\":[]}',
+                      '2024-01-01T00:00:00Z', ?)",
+            rusqlite::params![doc_id, asset_id, path, updated_at_str],
+        )
+        .expect("insert document");
+
+        let empty_headings: Vec<&str> = vec![];
+        let heading_json = serde_json::to_string(&empty_headings).unwrap();
+        conn.execute(
+            "INSERT INTO chunks (
+                chunk_id, doc_id, text, heading_path_json, section_label,
+                source_spans_json, token_estimate, chunker_version,
+                policy_hash, block_ids_json, created_at
+            ) VALUES (?, ?, ?, ?, NULL,
+                      '[{\"kind\":\"line\",\"start\":1,\"end\":1}]',
+                      1, 'v1', 'h', '[]', '2024-01-01T00:00:00Z')",
+            rusqlite::params![chunk_id, doc_id, body, heading_json],
+        )
+        .expect("insert chunk");
+
+        DocumentId(doc_id)
+    }
+
+    fn run_search(&self, query: &str, filters: &SearchFilters) -> Vec<SearchHit> {
+        let r = self.inner.retriever();
+        let q = SearchQuery {
+            text: query.to_string(),
+            mode: SearchMode::Lexical,
+            k: 10,
+            filters: filters.clone(),
+        };
+        r.search(&q).expect("search")
+    }
+}
+
+// ── fb-36 filter tests ───────────────────────────────────────────────────
+
+#[test]
+fn lexical_filter_by_media() {
+    let env = TestEnv::new();
+    env.insert_doc_with_media("md1.md", "rust ownership", MediaType::Markdown);
+    env.insert_doc_with_media("doc.pdf", "rust pdf body", MediaType::Pdf);
+    let filters = SearchFilters {
+        media: vec!["pdf".to_string()],
+        ..Default::default()
+    };
+    let hits = env.run_search("rust", &filters);
+    assert_eq!(hits.len(), 1, "only pdf doc should match");
+    assert!(hits[0].doc_path.0.ends_with(".pdf"), "got: {}", hits[0].doc_path.0);
+}
+
+#[test]
+fn lexical_filter_by_ingested_after() {
+    let env = TestEnv::new();
+    env.insert_doc_with_updated_at(
+        "old.md",
+        "ingest test",
+        time::macros::datetime!(2020-01-01 00:00:00 UTC),
+    );
+    env.insert_doc_with_updated_at(
+        "new.md",
+        "ingest test",
+        time::macros::datetime!(2026-01-01 00:00:00 UTC),
+    );
+    let filters = SearchFilters {
+        ingested_after: Some(time::macros::datetime!(2025-01-01 00:00:00 UTC)),
+        ..Default::default()
+    };
+    let hits = env.run_search("ingest", &filters);
+    assert_eq!(hits.len(), 1, "only post-2025 doc matches");
+}
+
+#[test]
+fn lexical_filter_by_doc_id() {
+    let env = TestEnv::new();
+    let target = env.insert_doc("a.md", "shared term");
+    env.insert_doc("b.md", "shared term");
+    let filters = SearchFilters {
+        doc_id: Some(target.clone()),
+        ..Default::default()
+    };
+    let hits = env.run_search("shared", &filters);
+    assert!(!hits.is_empty(), "should get at least one hit for target doc");
+    for h in &hits {
+        assert_eq!(h.doc_id, target, "all hits must be from target doc");
+    }
+}
+
+#[test]
+fn lexical_filter_combinator_is_and() {
+    let env = TestEnv::new();
+    let target = env.insert_doc_with_media("a.md", "rust", MediaType::Markdown);
+    env.insert_doc_with_media("b.pdf", "rust", MediaType::Pdf);
+    let filters = SearchFilters {
+        media: vec!["markdown".to_string()],
+        doc_id: Some(target.clone()),
+        ..Default::default()
+    };
+    let hits = env.run_search("rust", &filters);
+    assert!(!hits.is_empty(), "target doc should match combined filter");
+    assert!(hits.iter().all(|h| h.doc_id == target));
+}
+
+#[test]
+fn lexical_filter_unknown_media_returns_empty() {
+    let env = TestEnv::new();
+    env.insert_doc("a.md", "rust");
+    let filters = SearchFilters {
+        media: vec!["nonexistent_kind".to_string()],
+        ..Default::default()
+    };
+    let hits = env.run_search("rust", &filters);
+    assert!(hits.is_empty(), "unknown media → no hits, no error");
+}
+
+#[test]
+fn lexical_empty_filters_match_default_behavior() {
+    let env = TestEnv::new();
+    env.insert_doc("a.md", "rust");
+    let with_default = env.run_search("rust", &SearchFilters::default());
+    assert!(!with_default.is_empty());
+}
+
 #[test]
 fn lexical_snapshot_run_1() {
    // Pinned snapshot. A small, deterministic corpus; the JSON shape of
--- a/crates/kebab-store-sqlite/src/filters.rs
+++ b/crates/kebab-store-sqlite/src/filters.rs
@@ -129,6 +129,51 @@ impl SqliteStore {
            }
        }

+        // p9-fb-36: media_type filter (IN-list).
+        // `assets.media_type` JSON has two shapes:
+        //   - unit variant (Markdown / Pdf / …): JSON text, e.g. `"markdown"`
+        //   - tuple variant (Image(Png) / Audio(Mp3) / Other(s)): JSON object,
+        //     e.g. `{"image": "png"}`
+        // Extract a unified "kind" string for both shapes; mirrors lexical.
+        if !filters.media.is_empty() {
+            let media_ph = std::iter::repeat_n("?", filters.media.len())
+                .collect::<Vec<_>>()
+                .join(",");
+            sql.push_str(&format!(
+                " AND d.doc_id IN (\
+                   SELECT d2.doc_id FROM documents d2 \
+                   JOIN assets a ON a.asset_id = d2.asset_id \
+                   WHERE CASE \
+                     WHEN json_type(a.media_type) = 'text' THEN json_extract(a.media_type, '$') \
+                     ELSE (SELECT key FROM json_each(a.media_type) LIMIT 1) \
+                   END IN ({media_ph}))"
+            ));
+            for kind in &filters.media {
+                bind.push(Box::new(kind.clone()));
+            }
+        }
+
+        // p9-fb-36: ingested_after filter.
+        // `documents.updated_at` is RFC3339 TEXT (UTC `Z` per fb-32);
+        // lexicographic >= compare is correct, but only when the filter
+        // instant is also formatted as UTC `Z`. A non-UTC offset (e.g.
+        // `+09:00`) keeps local clock digits and `+` sorts before `Z`
+        // (0x2B < 0x5A), producing wrong results. Convert to UTC first.
+        if let Some(after) = &filters.ingested_after {
+            let formatted = after
+                .to_offset(time::UtcOffset::UTC)
+                .format(&time::format_description::well_known::Rfc3339)
+                .expect("OffsetDateTime (UTC) formats to RFC3339");
+            sql.push_str(" AND d.updated_at >= ?");
+            bind.push(Box::new(formatted));
+        }
+
+        // p9-fb-36: doc_id filter — single-doc scoping.
+        if let Some(id) = &filters.doc_id {
+            sql.push_str(" AND d.doc_id = ?");
+            bind.push(Box::new(id.0.clone()));
+        }
+
        // Optional path_glob: applied in Rust on the rows we get back,
        // not in SQL — matching `kb-search::lexical`'s post-filter so
        // the glob semantics are byte-identical between retrievers.
@@ -280,6 +325,89 @@ mod tests {
            .unwrap();
    }

+    /// Variant of `seed_committed` that accepts an explicit `media_type`
+    /// JSON string (e.g. `r#""markdown""#` or `r#""pdf""#`) and an
+    /// explicit `updated_at` RFC3339 string so the fb-36 filter tests can
+    /// exercise `media` and `ingested_after` without going through the full
+    /// ingest pipeline.
+    #[allow(clippy::too_many_arguments)]
+    fn seed_committed_full(
+        store: &SqliteStore,
+        chunk_id: &str,
+        doc_id: &str,
+        workspace_path: &str,
+        lang: &str,
+        tags: &[&str],
+        trust: &str,
+        media_type_json: &str,
+        updated_at: &str,
+    ) {
+        let asset_id = format!("a{}", &doc_id[..31]);
+        {
+            let conn = store.lock_conn();
+            conn.execute(
+                "INSERT INTO assets (
+                    asset_id, source_uri, workspace_path, media_type, byte_len,
+                    checksum, storage_kind, storage_path, discovered_at
+                 ) VALUES (?, ?, ?, ?, 0, 'deadbeefdeadbeefdeadbeefdeadbeef',
+                           'reference', ?, '1970-01-01T00:00:00Z')",
+                params![
+                    asset_id,
+                    format!("file://{workspace_path}"),
+                    workspace_path,
+                    media_type_json,
+                    workspace_path,
+                ],
+            )
+            .unwrap();
+            conn.execute(
+                "INSERT INTO documents (
+                    doc_id, asset_id, workspace_path, title, lang, source_type,
+                    trust_level, parser_version, doc_version, schema_version,
+                    metadata_json, provenance_json, created_at, updated_at
+                 ) VALUES (?, ?, ?, NULL, ?, 'markdown', ?, 'v1', 1, 1,
+                           '{}', '{}', '1970-01-01T00:00:00Z', ?)",
+                params![doc_id, asset_id, workspace_path, lang, trust, updated_at],
+            )
+            .unwrap();
+            for t in tags {
+                conn.execute(
+                    "INSERT INTO document_tags (doc_id, tag) VALUES (?, ?)",
+                    params![doc_id, t],
+                )
+                .unwrap();
+            }
+            conn.execute(
+                "INSERT INTO chunks (
+                    chunk_id, doc_id, text, heading_path_json, section_label,
+                    source_spans_json, token_estimate, chunker_version,
+                    policy_hash, block_ids_json, created_at
+                 ) VALUES (?, ?, 'hi', '[]', NULL, '[]', 1, 'v1', 'h', '[]',
+                           '1970-01-01T00:00:00Z')",
+                params![chunk_id, doc_id],
+            )
+            .unwrap();
+        }
+
+        let embed_row = EmbeddingRecordRow {
+            embedding_id: format!("e{}", &chunk_id[..31]),
+            chunk_id: chunk_id.to_string(),
+            model_id: "m".to_string(),
+            model_version: "v1".to_string(),
+            dimensions: 4,
+            lance_table: "t".to_string(),
+            created_at: OffsetDateTime::UNIX_EPOCH,
+        };
+        store
+            .put_embedding_records_pending(std::slice::from_ref(&embed_row))
+            .unwrap();
+        store
+            .mark_embedding_records_committed(std::slice::from_ref(
+                &embed_row.embedding_id,
+            ))
+            .unwrap();
+    }
+
    fn cid(s: &str) -> ChunkId {
        ChunkId(s.to_string())
    }
@@ -449,4 +577,147 @@ mod tests {
        let out = store.filter_chunks(&[], &SearchFilters::default()).unwrap();
        assert!(out.is_empty());
    }
+
+    // ── p9-fb-36 new filter arms ─────────────────────────────────────────
+
+    #[test]
+    fn filter_chunks_media_type_keeps_matching_kind() {
+        // c1 = markdown, c2 = pdf. Filter for pdf → only c2 survives.
+        let tmp = TempDir::new().unwrap();
+        let store = open_store(&tmp);
+        let c1 = "11111111111111111111111111111111";
+        let c2 = "22222222222222222222222222222222";
+        seed_committed_full(
+            &store, c1, "d1d1d1d1d1d1d1d1d1d1d1d1d1d1d1d1",
+            "notes/a.md", "en", &[], "primary",
+            r#""markdown""#,
+            "1970-01-01T00:00:00Z",
+        );
+        seed_committed_full(
+            &store, c2, "d2d2d2d2d2d2d2d2d2d2d2d2d2d2d2d2",
+            "notes/b.pdf", "en", &[], "primary",
+            r#""pdf""#,
+            "1970-01-01T00:00:00Z",
+        );
+
+        let f = SearchFilters {
+            media: vec!["pdf".to_string()],
+            ..Default::default()
+        };
+        let out = store
+            .filter_chunks(&[cid(c1), cid(c2)], &f)
+            .unwrap();
+        assert_eq!(out, vec![cid(c2)], "only pdf chunk should survive media filter");
+    }
+
+    #[test]
+    fn filter_chunks_ingested_after_excludes_old_docs() {
+        // c1 ingested 2020, c2 ingested 2026.  filter ingested_after=2025 → only c2.
+        let tmp = TempDir::new().unwrap();
+        let store = open_store(&tmp);
+        let c1 = "11111111111111111111111111111111";
+        let c2 = "22222222222222222222222222222222";
+        seed_committed_full(
+            &store, c1, "d1d1d1d1d1d1d1d1d1d1d1d1d1d1d1d1",
+            "old.md", "en", &[], "primary",
+            r#""markdown""#,
+            "2020-01-01T00:00:00Z",
+        );
+        seed_committed_full(
+            &store, c2, "d2d2d2d2d2d2d2d2d2d2d2d2d2d2d2d2",
+            "new.md", "en", &[], "primary",
+            r#""markdown""#,
+            "2026-01-01T00:00:00Z",
+        );
+
+        let f = SearchFilters {
+            ingested_after: Some(time::macros::datetime!(2025-01-01 00:00:00 UTC)),
+            ..Default::default()
+        };
+        let out = store
+            .filter_chunks(&[cid(c1), cid(c2)], &f)
+            .unwrap();
+        assert_eq!(out, vec![cid(c2)], "only post-2025 chunk should survive ingested_after filter");
+    }
+
+    #[test]
+    fn filter_chunks_doc_id_scopes_to_single_doc() {
+        // c1 belongs to d1, c2 belongs to d2. filter doc_id=d1 → only c1.
+        let tmp = TempDir::new().unwrap();
+        let store = open_store(&tmp);
+        let c1 = "11111111111111111111111111111111";
+        let c2 = "22222222222222222222222222222222";
+        let d1 = "d1d1d1d1d1d1d1d1d1d1d1d1d1d1d1d1";
+        seed_committed_full(
+            &store, c1, d1,
+            "a.md", "en", &[], "primary",
+            r#""markdown""#,
+            "1970-01-01T00:00:00Z",
+        );
+        seed_committed_full(
+            &store, c2, "d2d2d2d2d2d2d2d2d2d2d2d2d2d2d2d2",
+            "b.md", "en", &[], "primary",
+            r#""markdown""#,
+            "1970-01-01T00:00:00Z",
+        );
+
+        let f = SearchFilters {
+            doc_id: Some(kebab_core::DocumentId(d1.to_string())),
+            ..Default::default()
+        };
+        let out = store
+            .filter_chunks(&[cid(c1), cid(c2)], &f)
+            .unwrap();
+        assert_eq!(out, vec![cid(c1)], "doc_id filter must scope to the target doc only");
+    }
+
+    #[test]
+    fn filter_chunks_ingested_after_non_utc_offset_compares_as_instant() {
+        // Regression test for the non-UTC offset lex-compare bug.
+        //
+        // Scenario (from PR #127 review):
+        //   - doc stored at `2026-04-01T01:00:00Z`
+        //   - filter: `2026-04-01T05:00:00+09:00` == `2026-03-31T20:00:00Z` instant
+        //
+        // The doc instant (01:00 UTC on Apr 1) is AFTER the filter instant
+        // (20:00 UTC on Mar 31), so the doc SHOULD match.
+        //
+        // Buggy code: formats `+09:00` as-is → lex compare
+        //   `2026-04-01T01:00:00Z` vs `2026-04-01T05:00:00+09:00`
+        //   `01` < `05` → doc dropped incorrectly.
+        //
+        // Fixed code: converts to UTC first → compares
+        //   `2026-04-01T01:00:00Z` vs `2026-03-31T20:00:00Z`
+        //   Apr 1 > Mar 31 → doc correctly included.
+        let tmp = TempDir::new().unwrap();
+        let store = open_store(&tmp);
+        let c1 = "11111111111111111111111111111111";
+        seed_committed_full(
+            &store, c1, "d1d1d1d1d1d1d1d1d1d1d1d1d1d1d1d1",
+            "doc.md", "en", &[], "primary",
+            r#""markdown""#,
+            "2026-04-01T01:00:00Z",
+        );
+
+        // Filter instant: 2026-04-01T05:00:00+09:00 == 2026-03-31T20:00:00 UTC.
+        // Doc (2026-04-01T01:00:00Z) is after the filter instant → should match.
+        let filter_instant = time::OffsetDateTime::parse(
+            "2026-04-01T05:00:00+09:00",
+            &time::format_description::well_known::Rfc3339,
+        )
+        .expect("valid RFC3339 with +09:00 offset");
+
+        let f = SearchFilters {
+            ingested_after: Some(filter_instant),
+            ..Default::default()
+        };
+        let out = store
+            .filter_chunks(&[cid(c1)], &f)
+            .unwrap();
+        assert_eq!(
+            out,
+            vec![cid(c1)],
+            "doc ingested at 01:00Z should match filter 05:00+09:00 (== 20:00Z previous day)"
+        );
+    }
 }
--- a/docs/SMOKE.md
+++ b/docs/SMOKE.md
@@ -190,6 +190,22 @@ kebab fetch span "$DOC_ID" 1 5 --json | jq '{line_start, line_end, effective_end

 PDF / audio docs reject `fetch span` with `error.v1.code = span_not_supported` — use `fetch chunk` (PDF chunks are page-aligned) or `fetch doc` instead.

+### Filter args (fb-36)
+
+````bash
+# Filter by media kind (md alias normalizes to markdown).
+kebab search "rust" --media md --json | jq '.hits | length'
+
+# Filter by ingest timestamp (RFC3339).
+kebab search "rust" --ingested-after 2026-04-01T00:00:00Z --json
+
+# Combine: doc-id scope + tag (AND across flags).
+kebab search "rust" --doc-id "<doc-id>" --tag rust --json
+````
+
+Bad `--ingested-after` → `error.v1.code = config_invalid`, exit 2.
+Unknown `--media` value → silently empty (no error).
+
 ## P6-4 image ingestion options

 Adding the following section to `config.toml` makes `kebab ingest` also index image assets such as `**/*.png` / `**/*.jpg` (omit it to index text only):
--- a/docs/superpowers/plans/2026-05-10-p9-fb-36-search-filters.md
+++ b/docs/superpowers/plans/2026-05-10-p9-fb-36-search-filters.md
--- a/docs/superpowers/specs/2026-05-10-p9-fb-36-search-filters-design.md
+++ b/docs/superpowers/specs/2026-05-10-p9-fb-36-search-filters-design.md
@@ -0,0 +1,213 @@
+---
+title: "p9-fb-36 — Search filter args design"
+phase: P9
+component: kebab-core + kebab-search + kebab-cli + kebab-mcp
+task_id: p9-fb-36
+status: design
+target_version: 0.5.0
+contract_source: ../../docs/superpowers/specs/2026-04-27-kebab-final-form-design.md
+contract_sections: [§4 search]
+date: 2026-05-10
+---
+
+# p9-fb-36 — Search filter args
+
+## Goal
+
+Add filter flags to the CLI and MCP so that agents and users can narrow the search scope. Expose the 4 existing fields of the `SearchFilters` domain type (tags_any / lang / path_glob / trust_min) on the CLI surface and add 3 new fields (media / ingested_after / doc_id). No wire schema change (input-only). Filters are applied as a SQLite WHERE clause (lexical) and as over-fetch + post-filter (vector). Filters combine with fixed AND semantics.
+
+## Behavior contract
+
+### CLI flags on `kebab search`
+
+Adds 7 flags, all optional. An unset flag applies no filter (existing behavior is preserved):
+
+| flag | meaning | repeat? |
+|------|------|---------|
+| `--tag <name>` | matches within the doc's `metadata.tags` (OR within the flag) | yes (`--tag rust --tag async` = `tag IN (rust,async)`) |
+| `--lang <iso>` | exact match on `documents.lang` | no |
+| `--path-glob <pattern>` | glob match on `documents.workspace_path` | no |
+| `--trust-min <level>` | `documents.trust_level >= level` (enum order) | no |
+| `--media <csv>` | `assets.media_type.kind` IN list (e.g. `--media md,pdf`) | csv |
+| `--ingested-after <RFC3339>` | `documents.updated_at >= timestamp` | no |
+| `--doc-id <id>` | `documents.doc_id = id` | no |
+
+Multiple flags combine with AND. Multiple values within one flag (`--tag`, `--media`) combine with OR.
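The combination rule above can be illustrated with a small in-memory predicate. This is a minimal sketch, not the code from this PR: `Filters`, `Doc`, and `doc_matches` are hypothetical names, and only three of the seven filters are modeled.

```rust
/// Hypothetical, trimmed-down stand-in for `SearchFilters` (illustration only).
#[derive(Default)]
struct Filters {
    tags_any: Vec<String>,
    media: Vec<String>,
    lang: Option<String>,
}

/// Hypothetical document row with just the columns this sketch needs.
struct Doc {
    tags: Vec<String>,
    media: String,
    lang: String,
}

/// An empty or unset filter is skipped, values inside one filter are ORed,
/// and the per-filter results are ANDed.
fn doc_matches(doc: &Doc, f: &Filters) -> bool {
    let tag_ok = f.tags_any.is_empty() || f.tags_any.iter().any(|t| doc.tags.contains(t));
    let media_ok = f.media.is_empty() || f.media.iter().any(|m| *m == doc.media);
    let lang_ok = f.lang.as_ref().map_or(true, |l| *l == doc.lang);
    tag_ok && media_ok && lang_ok
}

fn main() {
    let doc = Doc {
        tags: vec!["rust".to_string()],
        media: "markdown".to_string(),
        lang: "en".to_string(),
    };
    // `--tag rust --tag async --media pdf`: the tag filter matches (OR within),
    // the media filter does not, so the AND across filters rejects the doc.
    let f = Filters {
        tags_any: vec!["rust".to_string(), "async".to_string()],
        media: vec!["pdf".to_string()],
        ..Default::default()
    };
    println!("matches: {}", doc_matches(&doc, &f));
}
```

The default `Filters` value matches every document, which mirrors the "unset flag applies no filter" rule.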
+
+### Filter validation
+
+- `--ingested-after` fails RFC3339 parsing → `error.v1.code = config_invalid` at CLI entry, exit 2.
+- Unknown `--media` value (e.g. `--media foo`) → 0 matches (the filter matches nothing). Not rejected explicitly (lenient).
+- `--trust-min` is validated by clap `value_enum` (values outside the enum are rejected).
+- `--doc-id` format is not validated (`DocumentId` is a plain string wrapper). A nonexistent id yields 0 matches.
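The lenient `--media` handling can be sketched as follows. This assumes the `md` → `markdown` alias mentioned in the SMOKE and SKILL docs of this PR; `normalize_media` and `media_matches` are hypothetical helper names, not functions from the codebase.

```rust
/// Map CLI aliases to canonical kind names. Only `md` -> `markdown` is taken
/// from the docs in this PR; unknown values pass through unchanged (lenient).
fn normalize_media(raw: &[&str]) -> Vec<String> {
    raw.iter()
        .map(|v| match *v {
            "md" => "markdown".to_string(),
            other => other.to_string(),
        })
        .collect()
}

/// Empty filter = no filter; otherwise the doc's kind must be in the list.
fn media_matches(kind: &str, filter: &[String]) -> bool {
    filter.is_empty() || filter.iter().any(|m| m.as_str() == kind)
}

fn main() {
    // `--media md,pdf` matches markdown docs through the alias.
    let filter = normalize_media(&["md", "pdf"]);
    println!("markdown vs {:?}: {}", filter, media_matches("markdown", &filter));

    // `--media foo` is not rejected; it simply matches nothing.
    let unknown = normalize_media(&["foo"]);
    println!("markdown vs {:?}: {}", unknown, media_matches("markdown", &unknown));
}
```

Passing unknown values through instead of rejecting them keeps the CLI and MCP paths free of a hard-coded kind list, at the cost of silent empty results on typos.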
+
+### Filter layer
+
+**Lexical (lexical.rs)**:
+- Extend the WHERE clause of the existing SQL builder. `media` / `ingested_after` / `doc_id` can all be expressed in SQL.
+- `media`: `JOIN assets a ON a.asset_id = d.asset_id` + `json_extract(a.media_type, '$.kind') IN (?, ?)` (multiple values).
+- `ingested_after`: `d.updated_at >= ?` (RFC3339 lexicographic compare; assumes UTC `Z`).
+- `doc_id`: `d.doc_id = ?`.
+- path_glob stays a post-filter, as before.
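The WHERE-clause extension described in the list above might look roughly like this. It is a sketch under assumptions: `Filters` and `build_where` are hypothetical names, the timestamp is assumed to be already normalized to a UTC `Z` string, and the `json_extract` path is copied from the bullet as-is (the Risks section notes it may need adjusting to the stored JSON shape).

```rust
/// Hypothetical subset of the filter fields that map directly to SQL.
#[derive(Default)]
struct Filters {
    media: Vec<String>,
    ingested_after_utc: Option<String>, // assumed already formatted as UTC `Z`
    doc_id: Option<String>,
}

/// Build a WHERE fragment plus its positional parameters.
/// Values are never spliced into the SQL text; only `?` placeholders are.
fn build_where(f: &Filters) -> (String, Vec<String>) {
    let mut clauses: Vec<String> = Vec::new();
    let mut params: Vec<String> = Vec::new();

    if !f.media.is_empty() {
        let placeholders = vec!["?"; f.media.len()].join(", ");
        clauses.push(format!(
            "json_extract(a.media_type, '$.kind') IN ({})",
            placeholders
        ));
        params.extend(f.media.iter().cloned());
    }
    if let Some(ts) = &f.ingested_after_utc {
        clauses.push("d.updated_at >= ?".to_string());
        params.push(ts.clone());
    }
    if let Some(id) = &f.doc_id {
        clauses.push("d.doc_id = ?".to_string());
        params.push(id.clone());
    }

    let sql = if clauses.is_empty() {
        String::new()
    } else {
        format!("WHERE {}", clauses.join(" AND "))
    };
    (sql, params)
}

fn main() {
    let f = Filters {
        media: vec!["markdown".to_string(), "pdf".to_string()],
        doc_id: Some("d1".to_string()),
        ..Default::default()
    };
    let (sql, params) = build_where(&f);
    println!("{sql}\n{params:?}");
}
```

An empty `Filters` produces an empty fragment, so the existing query is left untouched when no flag is given.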
+
+**Vector (vector.rs)**:
+- Reuses the existing over-fetch (k * 2) + `filter_chunks` helper, which joins chunks, documents, and assets in SQLite.
+- Applies the same WHERE conditions. If fewer than k hits remain, the result is truncated.
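The over-fetch + post-filter flow can be sketched without any database. In this illustration `search_with_post_filter` is a hypothetical function, `ranked` stands in for vector-search results in score order, and `keep` stands in for the SQLite-backed `filter_chunks` check.

```rust
/// Take the top `k * 2` candidates, drop those rejected by `keep`,
/// and return at most `k` survivors in their original rank order.
fn search_with_post_filter<F>(ranked: &[&str], k: usize, keep: F) -> Vec<String>
where
    F: Fn(&str) -> bool,
{
    ranked
        .iter()
        .take(k * 2) // over-fetch: inspect twice as many candidates as requested
        .filter(|c| keep(**c))
        .take(k)
        .map(|c| c.to_string())
        .collect()
}

fn main() {
    let ranked = ["c1", "c2", "c3", "c4", "c5", "c6"];
    let keep = |c: &str| c == "c3" || c == "c5" || c == "c6";

    // k = 2 inspects c1..c4; only c3 survives, so the result is shorter than k.
    println!("{:?}", search_with_post_filter(&ranked, 2, keep));
    // k = 3 inspects c1..c6; c3, c5, c6 survive.
    println!("{:?}", search_with_post_filter(&ranked, 3, keep));
}
```

The first call shows the trade-off of a fixed over-fetch factor: a selective filter can leave fewer than k hits even though matching chunks exist further down the ranking.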
+
+### Wire shape
+
+No change to the existing wire schema.
+
+- `search_response.v1` (output): unchanged.
+- `search_hit.v1` (individual hit): unchanged.
+- Only the input side (CLI args / MCP `SearchInput`) is extended.
+
+The MCP `SearchInput` schema is regenerated automatically by the `schemars` derive. There is no hand-written schema file.
+
+### MCP `SearchInput` extension
+
+```rust
+pub struct SearchInput {
+    pub query: String,
+    pub mode: Option<String>,
+    pub k: Option<usize>,
+    pub max_tokens: Option<usize>,    // fb-34
+    pub snippet_chars: Option<usize>, // fb-34
+    pub cursor: Option<String>,       // fb-34
+    // new in p9-fb-36 (all optional)
+    pub tags: Option<Vec<String>>,
+    pub lang: Option<String>,
+    pub path_glob: Option<String>,
+    pub trust_min: Option<String>,    // "primary" | "secondary" | "generated"
+    pub media: Option<Vec<String>>,
+    pub ingested_after: Option<String>,  // RFC3339
+    pub doc_id: Option<String>,
+}
+```
+
+The input → `SearchFilters` conversion applies the same validation as above (RFC3339 parsing, trust_level enum). On failure it returns an `invalid_input` ErrorV1.
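A minimal sketch of the `trust_min` part of that conversion follows. The value names come from the SKILL.md text in this PR; the enum ordering (`Generated` lowest, `Primary` highest), the error string, and the names `parse_trust_min` / `passes_trust` are assumptions made for illustration, and the local `TrustLevel` is a stand-in for the kebab-core type.

```rust
/// Stand-in for kebab-core's `TrustLevel`. The variant order below is an
/// assumption: derived `Ord` makes `Generated < Secondary < Primary`.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
enum TrustLevel {
    Generated,
    Secondary,
    Primary,
}

/// Convert the MCP string field into the enum, or report `invalid_input`.
fn parse_trust_min(raw: &str) -> Result<TrustLevel, String> {
    match raw {
        "generated" => Ok(TrustLevel::Generated),
        "secondary" => Ok(TrustLevel::Secondary),
        "primary" => Ok(TrustLevel::Primary),
        other => Err(format!("invalid_input: unknown trust_min {:?}", other)),
    }
}

/// `trust_level >= level` in enum order; `None` means no filter.
fn passes_trust(doc: TrustLevel, min: Option<TrustLevel>) -> bool {
    min.map_or(true, |m| doc >= m)
}

fn main() {
    let min = parse_trust_min("secondary").ok();
    println!("primary passes:   {}", passes_trust(TrustLevel::Primary, min));
    println!("generated passes: {}", passes_trust(TrustLevel::Generated, min));
    println!("{:?}", parse_trust_min("very-high"));
}
```

On the CLI side the same rejection is handled by clap's `value_enum` before dispatch, so a hand-written parser like this would only be needed for the MCP string input.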
+
+## Allowed / forbidden dependencies
+
+- `kebab-core`: no new deps. Only extends existing types.
+- `kebab-search`: no changes (only adds WHERE conditions inside the SQL builder).
+- `kebab-cli`: adds clap flags and the dispatch conversion.
+- `kebab-mcp`: extends `SearchInput`.
+- `kebab-tui`: no changes.
+
+The rule that `kebab-core` must not depend on other `kebab-*` crates still holds.
+
+## Public surface delta
+
+### kebab-core
+
+```rust
+#[derive(Clone, Debug, Default, PartialEq, Serialize, Deserialize)]
+pub struct SearchFilters {
+    pub tags_any: Vec<String>,
+    pub lang: Option<Lang>,
+    pub path_glob: Option<String>,
+    pub trust_min: Option<TrustLevel>,
+    /// p9-fb-36: media_type filter — IN-list of `MediaType.kind` strings
+    /// (e.g. `["markdown", "pdf"]`). Empty Vec = no filter.
+    #[serde(default)]
+    pub media: Vec<String>,
+    /// p9-fb-36: hits whose source doc's `documents.updated_at` is at
+    /// or after this timestamp. None = no filter. RFC3339 / UTC.
+    #[serde(default, with = "time::serde::rfc3339::option")]
+    pub ingested_after: Option<OffsetDateTime>,
+    /// p9-fb-36: restrict hits to a single document. None = no filter.
+    #[serde(default)]
+    pub doc_id: Option<DocumentId>,
+}
+```
+
+`#[serde(default)]` on each new field = backwards-compat (older JSON without these keys deserializes as defaults).
+
+### kebab-search (lexical + vector)
+
+Internal SQL builder extension only. No public API change.
+
+### kebab-cli (`Cmd::Search`)
+
+```rust
+Cmd::Search {
+    // existing
+    query, k, mode, explain, no_cache,
+    max_tokens, snippet_chars, cursor,   // fb-34
+    // new in p9-fb-36
+    #[arg(long)] tag: Vec<String>,
+    #[arg(long)] lang: Option<String>,
+    #[arg(long)] path_glob: Option<String>,
+    #[arg(long, value_enum)] trust_min: Option<TrustLevelFlag>,
+    #[arg(long, value_delimiter = ',')] media: Vec<String>,
+    #[arg(long)] ingested_after: Option<String>,
+    #[arg(long)] doc_id: Option<String>,
+}
+```
+
+`TrustLevelFlag` is a new clap value_enum (CLI-internal; converted to kebab-core's `TrustLevel`).
+
+### kebab-mcp::tools::search
+
+Adds 7 optional fields to `SearchInput` (see §MCP `SearchInput` extension above). Dispatch builds and validates `SearchFilters`.
+
+## Test plan
+
+| kind | description |
+|------|-------------|
+| unit (kebab-core) | `SearchFilters::default()`: all 7 fields empty |
+| unit (kebab-search/lexical) | `media: ["pdf"]`: markdown docs are not returned |
+| unit (kebab-search/lexical) | `media: ["markdown", "pdf"]`: IN-list behavior |
+| unit (kebab-search/lexical) | `ingested_after: <yesterday>`: docs older than yesterday are not returned |
+| unit (kebab-search/lexical) | `doc_id: <X>`: chunks from other docs are not returned |
+| unit (kebab-search/lexical) | multiple filters ANDed: only hits satisfying all of them |
+| unit (kebab-search/lexical) | empty filter (default): identical to existing behavior |
+| unit (kebab-search/vector) | same patterns via the `filter_chunks` post-filter |
+| unit (kebab-search) | unknown media value (`["foo"]`): empty result, no error |
+| integration (kebab-cli) | `kebab search Q --media md --json` wire shape (`search_response.v1` unchanged) |
+| integration (kebab-cli) | `kebab search Q --ingested-after 2020-01-01T00:00:00Z --json`: every hit passes |
+| integration (kebab-cli) | `kebab search Q --ingested-after garbage --json` → `error.v1.code = config_invalid`, exit 2 |
+| integration (kebab-cli) | `kebab search Q --doc-id <id> --json`: single doc only |
+| integration (kebab-cli) | `kebab search Q --tag rust --tag async --json`: IN-list behavior |
+| integration (kebab-mcp) | `mcp__kebab__search`: all 7 optional fields return a normal response |
+| integration (kebab-mcp) | `mcp__kebab__search` invalid `ingested_after` → invalid_input |
+
+## Implementation steps (high-level)
+
+1. Add 3 fields to `kebab-core::SearchFilters` + unit tests.
+2. Extend the SQL builder in `kebab-search/lexical.rs` + unit tests.
+3. Extend the `filter_chunks` helper in `kebab-search/vector.rs` the same way + unit tests.
+4. Add 7 flags to `kebab-cli::Cmd::Search` + dispatch + RFC3339 parsing.
+5. `kebab-cli` integration tests (lexical-only, no Ollama).
+6. Add 7 fields to `kebab-mcp::tools::search::SearchInput` + dispatch + invalid_input validation.
+7. `kebab-mcp` integration tests.
+8. README + SMOKE: filter examples.
+9. Flip status in tasks/INDEX.md and the spec.
+10. SKILL.md: update the `mcp__kebab__search` input shape.
+
+## Risks / notes
+
+- **`assets.media_type` JSON shape**: confirm how SQLite stores the serde serialization of the `MediaType` enum, i.e. whether it is `{"kind": "markdown"}` or some other shape. A unit variant such as `Markdown` may serialize as the string `"markdown"`, while tuple variants such as `Image(...)` / `Audio(...)` may serialize as `{"image": {...}}`. Adjust the `json_extract` path accordingly (e.g. `case when typeof(...) = 'text' then ... else json_extract($.kind) end`).
+- **RFC3339 lexicographic compare**: ingest always stores timestamps as UTC `Z` (confirmed for the fb-32 ingest path). If an external tool forces an update with a different offset, the comparison becomes inaccurate. The spec states the "UTC `Z` assumption" explicitly.
+- **Ordering of path_glob vs. the other filters**: path_glob is a post-filter (lexical) while the 3 new filters run in SQL, so path_glob can cut further after fetch_limit is reached and the final hit count may shrink. Same as existing behavior (the path_glob pattern is kept).
+- **Default for clap `Vec<String>`**: in clap 4, an unspecified flag yields `Vec::new()` automatically.
+- **trust_min enum mapping**: safe via clap value_enum. Add a `TrustLevelFlag` → `TrustLevel` conversion helper.
+- **SearchFilters serde backwards-compat**: `#[serde(default)]` leaves old JSON unaffected. `SearchFilters` is never serialized into SQLite (request-time only).
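The lexicographic-compare risk is easiest to see with the timestamps from this PR's regression test (`filter_chunks_ingested_after_non_utc_offset_compares_as_instant`). The sketch below uses only the standard library: `days_from_civil` and `rfc3339_to_unix` are hypothetical helpers written for this illustration, with a deliberately minimal parser (no fractional seconds, no range checks). The actual code uses the `time` crate instead.

```rust
/// Days since 1970-01-01 for a proleptic Gregorian date
/// (the well-known "days from civil" algorithm).
fn days_from_civil(y: i64, m: i64, d: i64) -> i64 {
    let y = if m <= 2 { y - 1 } else { y };
    let era = y.div_euclid(400);
    let yoe = y.rem_euclid(400);
    let mp = (m + 9) % 12; // March = 0, ..., February = 11
    let doy = (153 * mp + 2) / 5 + d - 1;
    let doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;
    era * 146_097 + doe - 719_468
}

/// Parse "YYYY-MM-DDTHH:MM:SS" followed by "Z" or "+HH:MM" / "-HH:MM"
/// into Unix seconds (UTC). Illustration only, not a full RFC3339 parser.
fn rfc3339_to_unix(s: &str) -> Option<i64> {
    let b = s.as_bytes();
    if !s.is_ascii() || b.len() < 20 {
        return None;
    }
    if b[4] != b'-' || b[7] != b'-' || b[10] != b'T' || b[13] != b':' || b[16] != b':' {
        return None;
    }
    let num = |from: usize, to: usize| s[from..to].parse::<i64>().ok();
    let local = days_from_civil(num(0, 4)?, num(5, 7)?, num(8, 10)?) * 86_400
        + num(11, 13)? * 3_600
        + num(14, 16)? * 60
        + num(17, 19)?;
    let offset = match &s[19..] {
        "Z" => 0,
        tz if tz.len() == 6 && tz.as_bytes()[3] == b':' => {
            let sign = match tz.as_bytes()[0] {
                b'+' => 1,
                b'-' => -1,
                _ => return None,
            };
            sign * (tz[1..3].parse::<i64>().ok()? * 3_600 + tz[4..6].parse::<i64>().ok()? * 60)
        }
        _ => return None,
    };
    // Local wall-clock time minus the offset gives the UTC instant.
    Some(local - offset)
}

fn main() {
    let doc = "2026-04-01T01:00:00Z";
    let filter = "2026-04-01T05:00:00+09:00"; // same instant as 2026-03-31T20:00:00Z

    // String comparison sees "01" < "05" at the hour position and drops the doc.
    println!("lexicographic: {}", doc >= filter);
    // Instant comparison: 01:00Z on Apr 1 is after 20:00Z on Mar 31.
    println!("as instants:   {}", rfc3339_to_unix(doc) >= rfc3339_to_unix(filter));
}
```

Normalizing the filter value to UTC before formatting it for the SQL comparison keeps the cheap string compare valid, as long as the stored column really is always UTC `Z`.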
+
+## Out of scope
+
+- `--exclude-doc-id` / `--exclude-tag` (exclusion filters).
+- Multiple doc_ids (`--doc-id a --doc-id b`): single only.
+- Filter UI in the TUI Search panel.
+- Lance metadata pre-filter.
+- Introducing a new tag system (one already exists).
+- `--search.default-filter` config (default values): agents specify filters on every call.
+
+## Documentation updates (same PR as the implementation)
+
+- `README.md`: add the 7 flags to the flag list in the `kebab search` row.
+- `docs/SMOKE.md`: filter walkthrough (e.g. `--media md --ingested-after 2026-04-01T00:00:00Z`).
+- `tasks/p9/p9-fb-36-search-filters.md`: `status: open → completed`, links to design/plan.
+- `tasks/INDEX.md`: mark the fb-36 row ✅.
+- `integrations/claude-code/kebab/SKILL.md`: update the `mcp__kebab__search` input shape (list the 7 fields + AND semantics + lenient unknown media).
--- a/integrations/claude-code/kebab/SKILL.md
+++ b/integrations/claude-code/kebab/SKILL.md
@@ -48,11 +48,12 @@ Use when the user wants to **find** a doc, or when you (the model) need raw chun

 Input:
 ```json
-{ "query": "<query>", "mode": "hybrid", "k": 10, "max_tokens": null, "snippet_chars": null, "cursor": null }
+{ "query": "<query>", "mode": "hybrid", "k": 10, "max_tokens": null, "snippet_chars": null, "cursor": null, "tags": null, "lang": null, "path_glob": null, "trust_min": null, "media": null, "ingested_after": null, "doc_id": null }
 ```

 - `mode = "hybrid"` is the default-correct choice. Use `"vector"` for semantic-only ("docs about X concept"), `"lexical"` for exact strings ("the literal flag `--foo-bar`").
 - **`max_tokens` / `snippet_chars` / `cursor` (p9-fb-34)** — agent budget controls. Set `max_tokens` to cap result wire size (chars/4 estimate); set `cursor` to the previous response's `next_cursor` to fetch the next page.
+- **p9-fb-36 filter inputs:** `tags` (string array — OR-within, AND across keys), `lang` (BCP-47 language code), `path_glob` (glob pattern matched against doc path), `trust_min` (`"primary"` | `"secondary"` | `"generated"` — includes that level and above), `media` (string array — IN-list of `"markdown"` | `"pdf"` | `"image"` | `"audio"` | `"other"`; alias `"md"` → `"markdown"`), `ingested_after` (RFC3339 UTC string), `doc_id` (exact doc UUID). AND combinator across keys. Invalid `ingested_after` or unknown `trust_min` → `error.v1.code = invalid_input`. Unknown `media` value → empty hits, no error.
 - Output is `search_response.v1`: `{ hits: search_hit.v1[], next_cursor: string|null, truncated: bool }`. Iterate `response.hits[]` for individual hits. Key hit fields: `rank`, `score`, `doc_path`, `heading_path[]`, `section_label`, `snippet`, `citation` (line range / page), `chunk_id`.
 - Cite back to the user as `doc_path § heading_path[-1]` so they can open the source.
 - When `truncated: true`, the budget loop modified the page (snippet shortening or k reduction). `next_cursor` is **independent** — non-null whenever more hits may be reachable. Caller may widen `max_tokens` (re-issue same query for fuller snippets / more hits per page) or follow `next_cursor` (advance through more hits) or both. Mismatched cursor (corpus_revision changed) returns `error.v1.code = stale_cursor` — re-issue the search to obtain a fresh one.
--- a/tasks/INDEX.md
+++ b/tasks/INDEX.md
@@ -124,7 +124,7 @@ P0~P5 are sequential. P6~P9 can run in parallel after P5.
    - [p9-fb-33 streaming ask (ndjson delta)](p9/p9-fb-33-streaming-ask.md): ✅ merged + v0.5.0 cut candidate (2026-05-09)
    - [p9-fb-34 output budget controls](p9/p9-fb-34-output-budget-controls.md): ✅ merged + v0.5.0 cut candidate (2026-05-09)
    - [p9-fb-35 verbatim fetch](p9/p9-fb-35-verbatim-fetch.md): ✅ merged + v0.5.0 cut candidate (2026-05-09)
-    - [p9-fb-36 search filter args](p9/p9-fb-36-search-filters.md): ⏳ not implemented, brainstorm needed
+    - [p9-fb-36 search filter args](p9/p9-fb-36-search-filters.md): ✅ merged (2026-05-10)
    - [p9-fb-37 trace + stats](p9/p9-fb-37-trace-and-stats.md): ⏳ not implemented, brainstorm needed (depends_on 27)

    ### 🎯 0.5.0 - RAG quality (with cascade: V00X + reindex)
--- a/tasks/p9/p9-fb-36-search-filters.md
+++ b/tasks/p9/p9-fb-36-search-filters.md
@@ -3,7 +3,7 @@ phase: P9
 component: kebab-cli + kebab-search + wire-schema
 task_id: p9-fb-36
 title: "Search filter args (--media / --ingested-after / --doc-id / --tag)"
-status: open
+status: completed
 target_version: 0.4.0
 depends_on: []
 unblocks: []
@@ -14,7 +14,10 @@ source_feedback: user dogfooding 2026-05-06 - agent, search scope

 # p9-fb-36 — Search filter args

-> ⏳ **Backlog only, not implemented.** This spec is a skeleton from dogfooding feedback. Before implementation starts, a design phase via [superpowers:brainstorming](../../docs/superpowers/) is required. The filter types, SQLite query integration, and the layer where the Lance vector filter applies are to be settled after brainstorming.
+> ✅ **Implemented.** This spec is frozen as of implementation time. See [HOTFIXES.md](../HOTFIXES.md) for post-merge deviations.
+
+Detailed design: `docs/superpowers/specs/2026-05-10-p9-fb-36-search-filters-design.md`.
+Implementation plan: `docs/superpowers/plans/2026-05-10-p9-fb-36-search-filters.md`.

 ## Symptoms / motivation