Compare commits

3 Commits

Author SHA1 Message Date
c17d6e67a8 Merge pull request 'feat(embed): candle Metal (Apple Silicon GPU) opt-in build feature' (#200) from feat/embed-candle-metal into main
Reviewed-on: #200
2026-06-02 11:40:52 +00:00
af8fd34716 docs(embed): add cargo install --features embed_metal instructions to README
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-02 11:38:28 +00:00
369aeb3d24 feat(embed): candle Metal (Apple Silicon GPU) opt-in build feature + v0.23.0
- kebab-embed-candle: `metal` feature → candle metal backend; select_device()
  picks Device::new_metal(0) (CPU fallback) under the feature, else Device::Cpu.
  .contiguous() before to_vec2 (Metal rejects strided views; CPU tolerates).
- feature passthrough: kebab-app/embed_metal → kebab-cli/embed_metal.
  Build on macOS: cargo build --release --features embed_metal.
- default (non-metal) path unchanged: clippy 0, candle units + thread_cap + parity pass.
- README + HOTFIXES: Mac-GPU-ingest → copy sqlite+lancedb → server CPU-query workflow.
- version 0.22.0 → 0.23.0 (opt-in build surface).

macOS-only compile; Metal execution/speed/parity validated by user on M4 Pro
(not buildable on the Linux CI/dev machine).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-02 11:37:08 +00:00
8 changed files with 495 additions and 50 deletions

452
Cargo.lock generated
@@ -712,6 +712,12 @@ dependencies = [
"cpufeatures 0.3.0",
]
[[package]]
name = "block"
version = "0.1.6"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "0d8c1fef690941d3e7788d328517591fecc684c084084702d6ff1641e993699a"
[[package]]
name = "block-buffer"
version = "0.10.4"
@@ -895,23 +901,42 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "6bd9895436c1ba5dc1037a19935d084b838db066ff4e15ef7dded020b7c12a4a"
dependencies = [
"byteorder",
"candle-metal-kernels",
"candle-ug",
"float8",
"gemm",
"gemm 0.19.0",
"half",
"libm",
"memmap2",
"num-traits",
"num_cpus",
"objc2-foundation",
"objc2-metal",
"rand 0.9.4",
"rand_distr 0.5.1",
"rayon",
"safetensors",
"safetensors 0.7.0",
"thiserror 2.0.18",
"tokenizers 0.22.2",
"yoke",
"yoke 0.8.2",
"zip",
]
[[package]]
name = "candle-metal-kernels"
version = "0.10.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "4b6b5a4cae6b4e1ab0efcee4dc05272d11b374a3d1ba121b3a961e36be54ab60"
dependencies = [
"half",
"objc2",
"objc2-foundation",
"objc2-metal",
"once_cell",
"thiserror 2.0.18",
"tracing",
]
[[package]]
name = "candle-nn"
version = "0.10.2"
@@ -919,11 +944,13 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "a9317a09d6530b758990ed7f625ac69ff43653bc9ee28b0464644ad1169ada87"
dependencies = [
"candle-core",
"candle-metal-kernels",
"half",
"libc",
"num-traits",
"objc2-metal",
"rayon",
"safetensors",
"safetensors 0.7.0",
"serde",
"thiserror 2.0.18",
]
@@ -947,6 +974,16 @@ dependencies = [
"tracing",
]
[[package]]
name = "candle-ug"
version = "0.10.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "ca0fc3167cbc99c8ec1be618cb620aa21dca95038f118c3579a79370e3dc5f77"
dependencies = [
"ug",
"ug-metal",
]
[[package]]
name = "cassowary"
version = "0.3.0"
@@ -1210,6 +1247,17 @@ version = "0.8.7"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "773648b94d0e5d620f64f280777445740e61fe701025087ec8b57f45c791888b"
[[package]]
name = "core-graphics-types"
version = "0.1.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "45390e6114f68f718cc7a830514a96f903cccd70d02a8f6d9f643ac4ba45afaf"
dependencies = [
"bitflags 1.3.2",
"core-foundation 0.9.4",
"libc",
]
[[package]]
name = "counter"
version = "0.7.1"
@@ -2637,7 +2685,28 @@ version = "0.3.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "f6f339eb8adc052cd2ca78910fda869aefa38d22d5cb648e6485e4d3fc06f3b1"
dependencies = [
"foreign-types-shared",
"foreign-types-shared 0.1.1",
]
[[package]]
name = "foreign-types"
version = "0.5.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d737d9aa519fb7b749cbc3b962edcf310a8dd1f4b67c91c4f83975dbdd17d965"
dependencies = [
"foreign-types-macros",
"foreign-types-shared 0.3.1",
]
[[package]]
name = "foreign-types-macros"
version = "0.2.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "1a5c6c585bc94aaf2c7b51dd4c2ba22680844aba4c687be581871a6f518c5742"
dependencies = [
"proc-macro2",
"quote",
"syn 2.0.117",
]
[[package]]
@@ -2646,6 +2715,12 @@ version = "0.1.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "00b0228411908ca8685dba7fc2cdd70ec9990a6e753e89b6ac91a84c40fbaf4b"
[[package]]
name = "foreign-types-shared"
version = "0.3.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "aa9a19cbb55df58761df49b23516a86d432839add4af60fc256da840f66ed35b"
[[package]]
name = "form_urlencoded"
version = "1.2.2"
@@ -2784,6 +2859,26 @@ dependencies = [
"slab",
]
[[package]]
name = "gemm"
version = "0.18.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "ab96b703d31950f1aeddded248bc95543c9efc7ac9c4a21fda8703a83ee35451"
dependencies = [
"dyn-stack",
"gemm-c32 0.18.2",
"gemm-c64 0.18.2",
"gemm-common 0.18.2",
"gemm-f16 0.18.2",
"gemm-f32 0.18.2",
"gemm-f64 0.18.2",
"num-complex",
"num-traits",
"paste",
"raw-cpuid",
"seq-macro",
]
[[package]]
name = "gemm"
version = "0.19.0"
@@ -2791,12 +2886,27 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "aa0673db364b12263d103b68337a68fbecc541d6f6b61ba72fe438654709eacb"
dependencies = [
"dyn-stack",
"gemm-c32",
"gemm-c64",
"gemm-common",
"gemm-f16",
"gemm-f32",
"gemm-f64",
"gemm-c32 0.19.0",
"gemm-c64 0.19.0",
"gemm-common 0.19.0",
"gemm-f16 0.19.0",
"gemm-f32 0.19.0",
"gemm-f64 0.19.0",
"num-complex",
"num-traits",
"paste",
"raw-cpuid",
"seq-macro",
]
[[package]]
name = "gemm-c32"
version = "0.18.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "f6db9fd9f40421d00eea9dd0770045a5603b8d684654816637732463f4073847"
dependencies = [
"dyn-stack",
"gemm-common 0.18.2",
"num-complex",
"num-traits",
"paste",
@@ -2811,7 +2921,22 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "086936dbdcb99e37aad81d320f98f670e53c1e55a98bee70573e83f95beb128c"
dependencies = [
"dyn-stack",
"gemm-common",
"gemm-common 0.19.0",
"num-complex",
"num-traits",
"paste",
"raw-cpuid",
"seq-macro",
]
[[package]]
name = "gemm-c64"
version = "0.18.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "dfcad8a3d35a43758330b635d02edad980c1e143dc2f21e6fd25f9e4eada8edf"
dependencies = [
"dyn-stack",
"gemm-common 0.18.2",
"num-complex",
"num-traits",
"paste",
@@ -2826,7 +2951,7 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "20c8aeeeec425959bda4d9827664029ba1501a90a0d1e6228e48bef741db3a3f"
dependencies = [
"dyn-stack",
"gemm-common",
"gemm-common 0.19.0",
"num-complex",
"num-traits",
"paste",
@@ -2834,6 +2959,27 @@ dependencies = [
"seq-macro",
]
[[package]]
name = "gemm-common"
version = "0.18.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "a352d4a69cbe938b9e2a9cb7a3a63b7e72f9349174a2752a558a8a563510d0f3"
dependencies = [
"bytemuck",
"dyn-stack",
"half",
"libm",
"num-complex",
"num-traits",
"once_cell",
"paste",
"pulp 0.21.5",
"raw-cpuid",
"rayon",
"seq-macro",
"sysctl",
]
[[package]]
name = "gemm-common"
version = "0.19.0"
@@ -2848,13 +2994,31 @@ dependencies = [
"num-traits",
"once_cell",
"paste",
"pulp",
"pulp 0.22.2",
"raw-cpuid",
"rayon",
"seq-macro",
"sysctl",
]
[[package]]
name = "gemm-f16"
version = "0.18.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "cff95ae3259432f3c3410eaa919033cd03791d81cebd18018393dc147952e109"
dependencies = [
"dyn-stack",
"gemm-common 0.18.2",
"gemm-f32 0.18.2",
"half",
"num-complex",
"num-traits",
"paste",
"raw-cpuid",
"rayon",
"seq-macro",
]
[[package]]
name = "gemm-f16"
version = "0.19.0"
@@ -2862,8 +3026,8 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "e3df7a55202e6cd6739d82ae3399c8e0c7e1402859b30e4cb780e61525d9486e"
dependencies = [
"dyn-stack",
"gemm-common",
"gemm-f32",
"gemm-common 0.19.0",
"gemm-f32 0.19.0",
"half",
"num-complex",
"num-traits",
@@ -2873,6 +3037,21 @@ dependencies = [
"seq-macro",
]
[[package]]
name = "gemm-f32"
version = "0.18.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "bc8d3d4385393304f407392f754cd2dc4b315d05063f62cf09f47b58de276864"
dependencies = [
"dyn-stack",
"gemm-common 0.18.2",
"num-complex",
"num-traits",
"paste",
"raw-cpuid",
"seq-macro",
]
[[package]]
name = "gemm-f32"
version = "0.19.0"
@@ -2880,7 +3059,22 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "02e0b8c9da1fbec6e3e3ab2ce6bc259ef18eb5f6f0d3e4edf54b75f9fd41a81c"
dependencies = [
"dyn-stack",
"gemm-common",
"gemm-common 0.19.0",
"num-complex",
"num-traits",
"paste",
"raw-cpuid",
"seq-macro",
]
[[package]]
name = "gemm-f64"
version = "0.18.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "35b2a4f76ce4b8b16eadc11ccf2e083252d8237c1b589558a49b0183545015bd"
dependencies = [
"dyn-stack",
"gemm-common 0.18.2",
"num-complex",
"num-traits",
"paste",
@@ -2895,7 +3089,7 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "056131e8f2a521bfab322f804ccd652520c79700d81209e9d9275bbdecaadc6a"
dependencies = [
"dyn-stack",
"gemm-common",
"gemm-common 0.19.0",
"num-complex",
"num-traits",
"paste",
@@ -4067,7 +4261,7 @@ checksum = "4c6b649701667bbe825c3b7e6388cb521c23d88644678e83c0c4d0a621a34b43"
dependencies = [
"displaydoc",
"potential_utf",
"yoke",
"yoke 0.8.2",
"zerofrom",
"zerovec",
]
@@ -4134,7 +4328,7 @@ dependencies = [
"displaydoc",
"icu_locale_core",
"writeable",
"yoke",
"yoke 0.8.2",
"zerofrom",
"zerotrie",
"zerovec",
@@ -4530,7 +4724,7 @@ dependencies = [
[[package]]
name = "kebab-app"
version = "0.22.0"
version = "0.23.0"
dependencies = [
"anyhow",
"base64 0.22.1",
@@ -4577,7 +4771,7 @@ dependencies = [
[[package]]
name = "kebab-chunk"
version = "0.22.0"
version = "0.23.0"
dependencies = [
"anyhow",
"blake3",
@@ -4595,7 +4789,7 @@ dependencies = [
[[package]]
name = "kebab-cli"
version = "0.22.0"
version = "0.23.0"
dependencies = [
"anyhow",
"clap",
@@ -4616,7 +4810,7 @@ dependencies = [
[[package]]
name = "kebab-config"
version = "0.22.0"
version = "0.23.0"
dependencies = [
"anyhow",
"dirs 5.0.1",
@@ -4632,7 +4826,7 @@ dependencies = [
[[package]]
name = "kebab-core"
version = "0.22.0"
version = "0.23.0"
dependencies = [
"anyhow",
"blake3",
@@ -4646,7 +4840,7 @@ dependencies = [
[[package]]
name = "kebab-embed"
version = "0.22.0"
version = "0.23.0"
dependencies = [
"anyhow",
"blake3",
@@ -4660,7 +4854,7 @@ dependencies = [
[[package]]
name = "kebab-embed-candle"
version = "0.22.0"
version = "0.23.0"
dependencies = [
"anyhow",
"candle-core",
@@ -4679,7 +4873,7 @@ dependencies = [
[[package]]
name = "kebab-embed-local"
version = "0.22.0"
version = "0.23.0"
dependencies = [
"anyhow",
"fastembed",
@@ -4692,7 +4886,7 @@ dependencies = [
[[package]]
name = "kebab-eval"
version = "0.22.0"
version = "0.23.0"
dependencies = [
"anyhow",
"kebab-app",
@@ -4711,7 +4905,7 @@ dependencies = [
[[package]]
name = "kebab-llm"
version = "0.22.0"
version = "0.23.0"
dependencies = [
"anyhow",
"kebab-core",
@@ -4720,7 +4914,7 @@ dependencies = [
[[package]]
name = "kebab-llm-local"
version = "0.22.0"
version = "0.23.0"
dependencies = [
"anyhow",
"kebab-config",
@@ -4737,7 +4931,7 @@ dependencies = [
[[package]]
name = "kebab-mcp"
version = "0.22.0"
version = "0.23.0"
dependencies = [
"anyhow",
"kebab-app",
@@ -4755,7 +4949,7 @@ dependencies = [
[[package]]
name = "kebab-nli"
version = "0.22.0"
version = "0.23.0"
dependencies = [
"anyhow",
"hf-hub",
@@ -4770,7 +4964,7 @@ dependencies = [
[[package]]
name = "kebab-parse-code"
version = "0.22.0"
version = "0.23.0"
dependencies = [
"anyhow",
"gix",
@@ -4793,7 +4987,7 @@ dependencies = [
[[package]]
name = "kebab-parse-image"
version = "0.22.0"
version = "0.23.0"
dependencies = [
"ab_glyph",
"anyhow",
@@ -4817,7 +5011,7 @@ dependencies = [
[[package]]
name = "kebab-parse-md"
version = "0.22.0"
version = "0.23.0"
dependencies = [
"anyhow",
"kebab-core",
@@ -4834,7 +5028,7 @@ dependencies = [
[[package]]
name = "kebab-parse-pdf"
version = "0.22.0"
version = "0.23.0"
dependencies = [
"anyhow",
"blake3",
@@ -4849,7 +5043,7 @@ dependencies = [
[[package]]
name = "kebab-rag"
version = "0.22.0"
version = "0.23.0"
dependencies = [
"anyhow",
"blake3",
@@ -4871,7 +5065,7 @@ dependencies = [
[[package]]
name = "kebab-search"
version = "0.22.0"
version = "0.23.0"
dependencies = [
"anyhow",
"globset",
@@ -4890,7 +5084,7 @@ dependencies = [
[[package]]
name = "kebab-source-fs"
version = "0.22.0"
version = "0.23.0"
dependencies = [
"anyhow",
"blake3",
@@ -4908,7 +5102,7 @@ dependencies = [
[[package]]
name = "kebab-store-sqlite"
version = "0.22.0"
version = "0.23.0"
dependencies = [
"anyhow",
"blake3",
@@ -4928,7 +5122,7 @@ dependencies = [
[[package]]
name = "kebab-store-vector"
version = "0.22.0"
version = "0.23.0"
dependencies = [
"anyhow",
"arrow",
@@ -4952,7 +5146,7 @@ dependencies = [
[[package]]
name = "kebab-tui"
version = "0.22.0"
version = "0.23.0"
dependencies = [
"anyhow",
"crossterm",
@@ -5626,6 +5820,16 @@ dependencies = [
"cc",
]
[[package]]
name = "libloading"
version = "0.8.9"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d7c4b02199fee7c5d21a5ae7d8cfa79a6ef5bb2fc834d6e9058e89c825efdc55"
dependencies = [
"cfg-if",
"windows-link",
]
[[package]]
name = "libm"
version = "0.2.16"
@@ -5942,6 +6146,15 @@ version = "0.2.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "670fdfda89751bc4a84ac13eaa63e205cf0fd22b4c9a5fbfa085b63c1f1d3a30"
[[package]]
name = "malloc_buf"
version = "0.0.6"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "62bb907fe88d54d8d9ce32a3cceab4218ed2f6b7d35617cafe9adf84e43919cb"
dependencies = [
"libc",
]
[[package]]
name = "maplit"
version = "1.0.2"
@@ -6038,6 +6251,21 @@ dependencies = [
"stable_deref_trait",
]
[[package]]
name = "metal"
version = "0.29.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "7ecfd3296f8c56b7c1f6fbac3c71cefa9d78ce009850c45000015f206dc7fa21"
dependencies = [
"bitflags 2.11.1",
"block",
"core-graphics-types",
"foreign-types 0.5.0",
"log",
"objc",
"paste",
]
[[package]]
name = "mime"
version = "0.3.17"
@@ -6401,6 +6629,15 @@ version = "0.4.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "830b246a0e5f20af87141b25c173cd1b609bd7779a4617d6ec582abaf90870f3"
[[package]]
name = "objc"
version = "0.2.7"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "915b1b472bc21c53464d6c8461c9d3af805ba1ef837e1cac254428f4a77177b1"
dependencies = [
"malloc_buf",
]
[[package]]
name = "objc2"
version = "0.6.4"
@@ -6410,12 +6647,50 @@ dependencies = [
"objc2-encode",
]
[[package]]
name = "objc2-core-foundation"
version = "0.3.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "2a180dd8642fa45cdb7dd721cd4c11b1cadd4929ce112ebd8b9f5803cc79d536"
dependencies = [
"bitflags 2.11.1",
"dispatch2",
"objc2",
]
[[package]]
name = "objc2-encode"
version = "4.1.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "ef25abbcd74fb2609453eb695bd2f860d389e457f67dc17cafc8b8cbc89d0c33"
[[package]]
name = "objc2-foundation"
version = "0.3.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "e3e0adef53c21f888deb4fa59fc59f7eb17404926ee8a6f59f5df0fd7f9f3272"
dependencies = [
"bitflags 2.11.1",
"block2",
"libc",
"objc2",
"objc2-core-foundation",
]
[[package]]
name = "objc2-metal"
version = "0.3.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "a0125f776a10d00af4152d74616409f0d4a2053a6f57fa5b7d6aa2854ac04794"
dependencies = [
"bitflags 2.11.1",
"block2",
"dispatch2",
"objc2",
"objc2-core-foundation",
"objc2-foundation",
]
[[package]]
name = "object"
version = "0.37.3"
@@ -6497,7 +6772,7 @@ checksum = "f38c4372413cdaaf3cc79dd92d29d7d9f5ab09b51b10dded508fb90bb70b9222"
dependencies = [
"bitflags 2.11.1",
"cfg-if",
"foreign-types",
"foreign-types 0.3.2",
"libc",
"once_cell",
"openssl-macros",
@@ -7006,6 +7281,20 @@ dependencies = [
"unicase",
]
[[package]]
name = "pulp"
version = "0.21.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "96b86df24f0a7ddd5e4b95c94fc9ed8a98f1ca94d3b01bdce2824097e7835907"
dependencies = [
"bytemuck",
"cfg-if",
"libm",
"num-complex",
"reborrow",
"version_check",
]
[[package]]
name = "pulp"
version = "0.22.2"
@@ -7932,6 +8221,16 @@ version = "1.0.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "dd29631678d6fb0903b69223673e122c32e9ae559d0960a38d574695ebc0ea15"
[[package]]
name = "safetensors"
version = "0.4.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "44560c11236a6130a46ce36c836a62936dc81ebf8c36a37947423571be0e55b6"
dependencies = [
"serde",
"serde_json",
]
[[package]]
name = "safetensors"
version = "0.7.0"
@@ -9470,6 +9769,41 @@ version = "1.20.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "40ce102ab67701b8526c123c1bab5cbe42d7040ccfd0f64af1a385808d2f43de"
[[package]]
name = "ug"
version = "0.5.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "76b761acf8af3494640d826a8609e2265e19778fb43306c7f15379c78c9b05b0"
dependencies = [
"gemm 0.18.2",
"half",
"libloading",
"memmap2",
"num",
"num-traits",
"num_cpus",
"rayon",
"safetensors 0.4.5",
"serde",
"thiserror 1.0.69",
"tracing",
"yoke 0.7.5",
]
[[package]]
name = "ug-metal"
version = "0.5.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "9f7adf545a99a086d362efc739e7cf4317c18cbeda22706000fd434d70ea3d95"
dependencies = [
"half",
"metal",
"objc",
"serde",
"thiserror 1.0.69",
"ug",
]
[[package]]
name = "unarray"
version = "0.1.4"
@@ -10416,6 +10750,18 @@ version = "0.8.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "7a5a4b21e1a62b67a2970e6831bc091d7b87e119e7f9791aef9702e3bef04448"
[[package]]
name = "yoke"
version = "0.7.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "120e6aef9aa629e3d4f52dc8cc43a015c7724194c97dfaf45180d2daf2b77f40"
dependencies = [
"serde",
"stable_deref_trait",
"yoke-derive 0.7.5",
"zerofrom",
]
[[package]]
name = "yoke"
version = "0.8.2"
@@ -10423,10 +10769,22 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "abe8c5fda708d9ca3df187cae8bfb9ceda00dd96231bed36e445a1a48e66f9ca"
dependencies = [
"stable_deref_trait",
"yoke-derive",
"yoke-derive 0.8.2",
"zerofrom",
]
[[package]]
name = "yoke-derive"
version = "0.7.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "2380878cad4ac9aac1e2435f3eb4020e8374b5f13c296cb75b4620ff8e229154"
dependencies = [
"proc-macro2",
"quote",
"syn 2.0.117",
"synstructure",
]
[[package]]
name = "yoke-derive"
version = "0.8.2"
@@ -10493,7 +10851,7 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "0f9152d31db0792fa83f70fb2f83148effb5c1f5b8c7686c3459e361d9bc20bf"
dependencies = [
"displaydoc",
"yoke",
"yoke 0.8.2",
"zerofrom",
]
@@ -10503,7 +10861,7 @@ version = "0.11.6"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "90f911cbc359ab6af17377d242225f4d75119aec87ea711a880987b18cd7b239"
dependencies = [
"yoke",
"yoke 0.8.2",
"zerofrom",
"zerovec-derive",
]

@@ -31,7 +31,7 @@ edition = "2024"
rust-version = "1.85"
license = "MIT OR Apache-2.0"
repository = "https://github.com/altair823/kebab"
version = "0.22.0" # v0.22.0: candle embedding provider (NUMA-safe, opt-in `provider=candle` + `num_threads`/KEBAB_EMBED_THREADS). fastembed default unchanged, embedding_version kept (zero re-indexing). See CLAUDE.md §Release dogfooding trigger
version = "0.23.0" # v0.23.0: candle Metal (Apple Silicon GPU) opt-in build feature (`--features embed_metal`): runs e5-large embedding on the GPU of M-series Macs to speed up large ingests. macOS-only; vectors are compatible with CPU candle. Default (non-metal) behavior unchanged. See CLAUDE.md §Release dogfooding trigger
# pre-v0.18 workspace-wide cleanup: enable clippy::pedantic group with
# intentional allow-list. The allowed lints are either cosmetic (doc style),

@@ -108,6 +108,25 @@ dimensions = 1024 # on a mismatch between config and the LanceDB stored dim
num_threads = 0 # candle-only CPU thread cap (0=auto=#cores).
# env KEBAB_EMBED_THREADS takes precedence. For NUMA node binding,
# combine with numactl. Ignored by the fastembed provider.
```
**Apple Silicon GPU acceleration (candle / macOS)**: on an M-series Mac, running candle
embedding on the GPU (Metal) makes large ingests much faster than on CPU. Enable the
`embed_metal` feature when building or installing:
```bash
# build only:
cargo build --release --features embed_metal
# global install (~/.cargo/bin/kebab):
cargo install --path crates/kebab-cli --features embed_metal --locked
```
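The vectors come from the same model as CPU candle and are therefore compatible, so the
`kebab.sqlite` + `lancedb/` indexed on the Mac GPU can be copied as-is to a Linux server
(CPU candle) and queried there. If the indexing log shows `candle device = Metal (GPU)`,
the GPU is in use. The metal feature is macOS-only (Linux/servers use the default CPU build).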
```toml
[models.llm]
endpoint = "http://localhost:11434" # Ollama host:port

@@ -100,6 +100,8 @@ reqwest = { version = "0.12", default-features = false, features = ["blocki
# no disable path; this feature exists only to honor what spec §6.3 states.
default = ["fts_korean_morphological"]
fts_korean_morphological = []
# opt-in (macOS): candle embedder runs on the Apple Silicon GPU. See kebab-embed-candle.
embed_metal = ["kebab-embed-candle/metal"]
[lints]
workspace = true

@@ -51,5 +51,10 @@ tempfile = { workspace = true }
rusqlite = { workspace = true }
time = { workspace = true }
[features]
# opt-in (macOS): build the `kebab` binary with candle on the Apple Silicon GPU.
# cargo build --release --features embed_metal
embed_metal = ["kebab-app/embed_metal"]
[lints]
workspace = true

@@ -25,6 +25,14 @@ rayon = "1"
anyhow = { workspace = true }
tracing = { workspace = true }
[features]
# opt-in: run candle on the Apple Silicon GPU (Metal). macOS-only — the build
# enables candle's metal backend and `select_device()` picks Metal (CPU fallback
# on failure). Lets an M-series Mac ingest e5-large on GPU (10×+ vs CPU); the
# resulting vectors are cross-compatible with the CPU path (same model), so the
# Linux server can serve queries on CPU candle.
metal = ["candle-core/metal", "candle-nn/metal", "candle-transformers/metal"]
[dev-dependencies]
# Integration-test binaries can only see the library's public API + these,
# not the library's own (non-dev) dependencies — so rayon/kebab-config/kebab-core

@@ -128,7 +128,7 @@ impl CandleEmbedder {
std::fs::create_dir_all(&cache_dir)
.with_context(|| format!("create candle cache dir {}", cache_dir.display()))?;
let device = Device::Cpu;
let device = select_device();
// 3. Fetch model files via hf-hub into the candle cache.
tracing::info!(
@@ -250,7 +250,9 @@ impl CandleEmbedder {
let norm = mean.sqr()?.sum_keepdim(1)?.sqrt()?;
let normalized = mean.broadcast_div(&norm)?;
Ok(normalized.to_vec2::<f32>()?)
// `.contiguous()` before host copy: broadcast ops can leave a strided
// view, which `to_vec2` rejects on the Metal backend (CPU tolerates it).
Ok(normalized.contiguous()?.to_vec2::<f32>()?)
}
}
@@ -307,6 +309,32 @@ fn prefix_input(input: &EmbeddingInput<'_>) -> String {
}
}
/// Select the compute device. Built with the `metal` feature (Apple Silicon
/// GPU), try Metal and fall back to CPU on failure; otherwise CPU. Metal only
/// compiles/runs on macOS — the Linux server builds the CPU path. e5-large
/// vectors are model-defined, so Metal-produced and CPU-produced embeddings are
/// cross-compatible (a Mac can ingest on GPU, the server query on CPU).
fn select_device() -> Device {
#[cfg(feature = "metal")]
{
match Device::new_metal(0) {
Ok(d) => {
tracing::info!(target: "kebab-embed-candle", "candle device = Metal (GPU)");
return d;
}
Err(e) => {
tracing::warn!(
target: "kebab-embed-candle",
error = %e,
"Metal device unavailable; falling back to CPU"
);
}
}
}
tracing::info!(target: "kebab-embed-candle", "candle device = CPU");
Device::Cpu
}
/// Apply a one-shot global rayon thread cap (the NUMA-safety lever). Returns
/// `true` if this call set the pool, `false` if it was already initialized
/// (cap not applied) or `n_threads == 0`. `#[doc(hidden)] pub` so the

@@ -14,6 +14,31 @@ historical contract that was implemented; this file accumulates the
deltas so phase 5+ readers can find the live behavior without diffing
git history.
## 2026-06-02: candle Metal (Apple Silicon GPU) opt-in build feature
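**Motivation.** candle CPU embedding is slow, at ~1.5 to 1.9 s/chunk for e5-large at 512 tokens,
and giving it more cores (rayon/MKL) does not speed it up (the bottleneck is kernel efficiency).
A large corpus (tens of thousands of chunks) takes hours on CPU. User workflow: **index quickly
on the GPU of an M4 Pro Mac → copy only sqlite + lancedb to the Linux NUMA server → the server
queries with CPU candle** (the vectors come from the same model, so they are compatible; KB
portability is confirmed by the 06-01 entry + relative workspace_path paths + stored chunks.text).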
**What.** Added a `metal` feature to `kebab-embed-candle`, which enables the metal backend of
`candle-core/-nn/-transformers`. In a metal build, `select_device()` picks
`Device::new_metal(0)` (falling back to CPU on failure); a non-metal build keeps the existing
`Device::Cpu`. Added `.contiguous()` before the host copy (on Metal, `to_vec2` rejects
strided views; CPU tolerates them). Feature passthrough: `kebab-app/embed_metal` →
`kebab-cli/embed_metal`. Build: `cargo build --release --features embed_metal` (macOS).
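To illustrate why the `.contiguous()` call matters, here is a minimal sketch of what a strided view is and what materializing it involves. `StridedView`, `is_contiguous`, and `to_contiguous` are hypothetical names invented for this example; this is not candle's `Tensor` implementation, only a plain-Rust model of the idea.

```rust
/// Hypothetical 2-D view over a flat buffer (illustration only, not candle's Tensor).
struct StridedView<'a> {
    data: &'a [f32],
    /// (rows, cols)
    shape: (usize, usize),
    /// Element steps taken when moving one row / one column.
    strides: (usize, usize),
}

impl<'a> StridedView<'a> {
    /// A view is row-major contiguous when its strides match the dense layout.
    fn is_contiguous(&self) -> bool {
        self.strides == (self.shape.1, 1)
    }

    /// Copy the logical elements into a dense row-major buffer.
    fn to_contiguous(&self) -> Vec<f32> {
        let (rows, cols) = self.shape;
        let mut out = Vec::with_capacity(rows * cols);
        for r in 0..rows {
            for c in 0..cols {
                out.push(self.data[r * self.strides.0 + c * self.strides.1]);
            }
        }
        out
    }
}
```

For example, a single stored row `[1.0, 2.0, 3.0]` broadcast to shape `(2, 3)` has strides `(0, 1)`: it is not contiguous, and copying it out yields six dense elements. A backend that copies raw device memory to the host needs the dense form first, which is the role `.contiguous()` plays in the change described above.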
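**Constraints / division of verification.** metal **compiles on macOS only**; Linux CPU
machines (dev/server) build only the non-metal path (verified: clippy 0 + 6 candle unit tests
+ thread_cap + parity, exit 0). **Metal execution, speed, and vector parity (GPU vs CPU) were
verified by the user on an M4 Pro** (not possible in Claude's Linux environment). The log line
`candle device = Metal (GPU)` confirms GPU use.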
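One way such a GPU-vs-CPU parity check could be expressed is a per-vector cosine-similarity comparison. This is only a sketch under assumptions: the note does not say how parity was actually measured, and `cosine`, `vectors_match`, and the `min_cos` threshold are hypothetical names chosen for illustration.

```rust
/// Cosine similarity between two embedding vectors of equal length.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (na * nb)
}

/// Hypothetical parity predicate: both sets must have the same number of
/// vectors, and every CPU/GPU pair must have equal length and reach `min_cos`.
fn vectors_match(cpu: &[Vec<f32>], gpu: &[Vec<f32>], min_cos: f32) -> bool {
    cpu.len() == gpu.len()
        && cpu
            .iter()
            .zip(gpu)
            .all(|(c, g)| c.len() == g.len() && cosine(c, g) >= min_cos)
}
```

The threshold is a judgment call: floating-point kernels on different backends rarely agree bit for bit, so a tolerance close to 1.0 is more realistic than exact equality.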
**Compatibility.** Default (non-metal) behavior and vectors are unchanged. No wire/schema
changes. Version 0.22.0 → **0.23.0** (new opt-in build feature surface).
amends: `docs/superpowers/specs/2026-06-01-embed-candle-track-spec.md` (§10 follow-up: GPU acceleration).
## 2026-06-01: candle embedding provider (NUMA double-free avoidance, opt-in)
**What the problem was.** On a dual-socket NUMA server, using `provider=fastembed` (onnxruntime)