Walk config.workspace.root, apply gitignore-style filters
(config.workspace.exclude ∪ .kbignore ∪ baked-in defaults for
.DS_Store / ._*), stream BLAKE3 over each file, and emit a
deterministic Vec<RawAsset> sorted by workspace_path.
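
A minimal sketch of the last two steps of the pipeline described above (dropping the baked-in defaults, then sorting), using only std. The RawAsset struct here is a hypothetical stand-in that models only the sort key, and finalize / is_default_excluded are illustrative names, not the crate's API. The real filter union goes through the ignore crate rather than a hand-written check.

```rust
/// Hypothetical stand-in for the crate's RawAsset; only the sort key is modeled.
#[derive(Debug, Clone, PartialEq, Eq)]
struct RawAsset {
    workspace_path: String,
}

/// Baked-in defaults named in the commit message: `.DS_Store` and `._*`.
fn is_default_excluded(file_name: &str) -> bool {
    file_name == ".DS_Store" || file_name.starts_with("._")
}

/// Drop default-excluded entries, then sort by workspace_path so the
/// result does not depend on directory iteration order.
fn finalize(mut assets: Vec<RawAsset>) -> Vec<RawAsset> {
    assets.retain(|a| {
        // Assumes POSIX-style paths, so the file name is the last '/' segment.
        let name = a.workspace_path.rsplit('/').next().unwrap_or("");
        !is_default_excluded(name)
    });
    assets.sort_by(|a, b| a.workspace_path.cmp(&b.workspace_path));
    assets
}
```

Sorting after filtering keeps the output stable across platforms whose readdir order differs, which is what makes repeated scans comparable.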
Modules:
- hash: streaming blake3::Hasher + 64 KiB read buffer (no whole-file
loads); pinned digests for empty input and "hello world".
- media: extension → MediaType (markdown/pdf/image/audio/other).
- walker: ignore::OverrideBuilder for filter union; walkdir with
manual visited-set cycle protection on top of follow_links.
- connector: public FsSourceConnector::new(&Config) +
SourceConnector::scan(&SourceScope) impl. Uses
kb_core::to_posix for WorkspacePath construction (carries the
P0-1 '#' rejection through unchanged) and kb_core::id_for_asset
for AssetId derivation. Storage variant signals intent only;
actual byte copy is P1-6's responsibility.
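
The streaming read loop in the hash module can be sketched roughly as follows. This is an illustration under assumptions, not the crate's code: stream_chunks is a hypothetical helper, and Fnv1a is a std-only stand-in used so the example runs without dependencies. It is not BLAKE3. In the real module the sink would presumably be blake3::Hasher::update.

```rust
use std::io::{self, Read};

/// Read buffer size taken from the commit message: 64 KiB.
const BUF_SIZE: usize = 64 * 1024;

/// Feed `reader` to `sink` one buffer at a time; return the total byte count.
/// The whole input is never held in memory at once.
fn stream_chunks<R: Read>(mut reader: R, mut sink: impl FnMut(&[u8])) -> io::Result<u64> {
    let mut buf = vec![0u8; BUF_SIZE];
    let mut total = 0u64;
    loop {
        let n = match reader.read(&mut buf) {
            Ok(0) => break, // end of input
            Ok(n) => n,
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        };
        sink(&buf[..n]);
        total += n as u64;
    }
    Ok(total)
}

/// FNV-1a 64-bit: a stand-in incremental hasher for this sketch (NOT BLAKE3).
struct Fnv1a(u64);

impl Fnv1a {
    fn new() -> Self {
        Fnv1a(0xcbf29ce484222325)
    }

    fn update(&mut self, bytes: &[u8]) {
        for &b in bytes {
            self.0 ^= b as u64;
            self.0 = self.0.wrapping_mul(0x100000001b3);
        }
    }
}
```

Because the hasher is incremental, feeding it chunk by chunk gives the same digest as hashing the whole input in one call, which is the property the pinned digests for empty input and "hello world" check for BLAKE3.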
Per design §3.3, §6.2, §6.6, §7.1, §7.2, §8.
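
For the media module's extension mapping listed above, a sketch might look like this. The five variants come from the commit message; the concrete extension lists, the case-insensitive match, and the name media_type_for are assumptions made for illustration.

```rust
use std::path::Path;

/// Variants as listed in the commit message.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum MediaType {
    Markdown,
    Pdf,
    Image,
    Audio,
    Other,
}

/// Map a path's extension to a MediaType.
/// The extension lists below are illustrative, not the crate's actual table.
fn media_type_for(path: &Path) -> MediaType {
    let ext = path
        .extension()
        .and_then(|e| e.to_str())
        .map(|e| e.to_ascii_lowercase()); // assumption: matching ignores case
    match ext.as_deref() {
        Some("md") | Some("markdown") => MediaType::Markdown,
        Some("pdf") => MediaType::Pdf,
        Some("png") | Some("jpg") | Some("jpeg") | Some("gif") => MediaType::Image,
        Some("mp3") | Some("wav") | Some("flac") => MediaType::Audio,
        // No extension, non-UTF-8 extension, or anything unlisted.
        _ => MediaType::Other,
    }
}
```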
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
[package]
name = "kb-source-fs"
version = { workspace = true }
edition = { workspace = true }
rust-version = { workspace = true }
license = { workspace = true }
repository = { workspace = true }
description = "Local filesystem SourceConnector — walks workspace.root + applies gitignore filters"

[dependencies]
kb-core = { path = "../kb-core" }
kb-config = { path = "../kb-config" }
anyhow = { workspace = true }
serde = { workspace = true }
time = { workspace = true }
blake3 = { workspace = true }
tracing = { workspace = true }
walkdir = "2"
ignore = "0.4"

[dev-dependencies]
serde_json = { workspace = true }
tempfile = "3"