config.workspace.include was completely ignored by the walker. The
connector.rs log_scope_include_warning message literally said "handled by
extractor router", but no extractor router exists. Dogfooding (PR #142 1B +
multi-root corpus kebab-docs + httpx + zod + lodash) showed that a user-set
include covering code + md still ingested 84 .png + 8 .pdf files.
Fix: the walker now treats scope.include as an allow-list. An empty Vec
preserves backward compatibility (all files pass); a non-empty Vec requires
the file path to match at least one pattern (ANDed with the existing
exclude rules). Removed the misleading debug log.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
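The allow-list rule described in the include fix above can be sketched as follows. This is a minimal illustration, not the walker's actual code: `matches` is a hypothetical stand-in that only understands `*.ext` suffix patterns and exact paths, and `is_accepted` is an assumed name.

```rust
/// Hypothetical stand-in for the walker's real glob matcher: supports only
/// `*suffix` patterns (e.g. `*.rs`) and exact path matches.
fn matches(pattern: &str, path: &str) -> bool {
    match pattern.strip_prefix('*') {
        Some(suffix) => path.ends_with(suffix),
        None => pattern == path,
    }
}

/// Allow-list semantics: an empty `include` admits every path, a non-empty
/// one requires at least one match. Exclude rules are ANDed on top.
fn is_accepted(path: &str, include: &[String], exclude: &[String]) -> bool {
    let included = include.is_empty() || include.iter().any(|p| matches(p, path));
    let excluded = exclude.iter().any(|p| matches(p, path));
    included && !excluded
}
```

With `include = ["*.rs", "*.md"]`, a `.png` file is rejected even though no exclude rule names it, which is the behavior the dogfooding run was missing.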
Wire kebab_parse_code::is_generated_file and is_oversized into
FsSourceConnector::scan_with_skips. Files that pass gitignore/builtin/
kebabignore matching are now checked for generated-file markers
(config-gated via ingest.code.skip_generated_header) and byte/line caps
(ingest.code.max_file_bytes / max_file_lines). FsScanSkips gains
skipped_generated + skipped_size_exceeded counters; kebab-app threads
them into IngestReport. Also fixes a pre-existing clippy::derivable_impls
warning in IngestCfg. Three new connector tests cover all three paths.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
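A rough sketch of the post-ignore checks described in the commit above. The struct fields mirror the `ingest.code.*` keys named there, but the marker strings, the check order, and every function name here are assumptions for illustration, not the real `kebab_parse_code` API.

```rust
/// Why a file that passed the ignore matchers was still skipped.
#[derive(Debug, PartialEq)]
enum SkipReason {
    Generated,
    SizeExceeded,
}

/// Hypothetical view of the `ingest.code.*` settings.
struct CodeCaps {
    skip_generated_header: bool,
    max_file_bytes: usize,
    max_file_lines: usize,
}

/// Assumed marker check: looks for common generated-file markers in the
/// first five lines. The real marker list may differ.
fn looks_generated(text: &str) -> bool {
    text.lines()
        .take(5)
        .any(|l| l.contains("@generated") || l.contains("DO NOT EDIT"))
}

/// Returns the skip reason, or None when the file should be ingested.
/// The generated check only runs when the config gate is on.
fn check_file(text: &str, caps: &CodeCaps) -> Option<SkipReason> {
    if caps.skip_generated_header && looks_generated(text) {
        return Some(SkipReason::Generated);
    }
    if text.len() > caps.max_file_bytes || text.lines().count() > caps.max_file_lines {
        return Some(SkipReason::SizeExceeded);
    }
    None
}
```

A caller would bump one counter per returned reason, which is roughly how the two new skip counters could be fed.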
Refactor the walker to expose WalkOverrides (combined + per-source matchers)
and add walk_files_with_skips, which returns accepted files alongside skip
attribution. Wire FsSourceConnector::scan_with_skips into kebab-app so that
IngestReport.skipped_gitignore, skipped_kebabignore, skipped_builtin_blacklist,
and skip_examples are populated instead of left at zero. The priority order
from spec §5.2 (builtin > gitignore > kebabignore) is enforced in
classify_skip, with a directory-aware builtin matcher so that pruned directory
entries are attributed to builtin rather than to a coincident gitignore entry.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
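The attribution priority from the commit above (builtin > gitignore > kebabignore) amounts to a first-match-wins check. The sketch below assumes the three per-source matcher results are already available as booleans; `SkipSource` and `classify_skip_sketch` are illustrative names, not the actual `classify_skip` signature.

```rust
/// Which filter source a skipped path is attributed to.
#[derive(Debug, PartialEq)]
enum SkipSource {
    Builtin,
    Gitignore,
    Kebabignore,
}

/// First match wins, in priority order builtin > gitignore > kebabignore.
/// Returns None when no source matched, i.e. the path is accepted.
fn classify_skip_sketch(builtin: bool, gitignore: bool, kebabignore: bool) -> Option<SkipSource> {
    if builtin {
        Some(SkipSource::Builtin)
    } else if gitignore {
        Some(SkipSource::Gitignore)
    } else if kebabignore {
        Some(SkipSource::Kebabignore)
    } else {
        None
    }
}
```

A path such as `node_modules/` that is listed in both the builtin blacklist and `.gitignore` is therefore counted once, under builtin.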
Prevent double-`!` corruption when a `.gitignore` negation pattern
(e.g. `!keep/`) hits the trailing-slash normalizer in `read_gitignore`.
Also updates module-level and `build_overrides` doc to list all five
filter sources in application order, and adds a regression test.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds read_gitignore() (pub(crate), root-only, nested cascade deferred)
and merges its patterns as a 5th group in build_overrides(). Trailing-
slash patterns (dist/) are normalized to also emit a stem/** glob so
files inside the directory are matched when is_dir=false. Two new tests
cover both the happy path and the missing-file no-op.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
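The trailing-slash normalization and the negation fix from the two commits above can be pictured together. This is a sketch under assumptions: `normalize_pattern` is a hypothetical helper, and the exact strings the real `read_gitignore` emits may differ. The point is that the leading `!` is split off once and re-applied once, so it cannot be doubled.

```rust
/// Hypothetical normalizer: a trailing-slash pattern yields the directory
/// pattern plus a `stem/**` glob so files inside it match when is_dir=false.
/// A leading `!` is peeled off first and re-attached exactly once.
fn normalize_pattern(line: &str) -> Vec<String> {
    let (neg, body) = match line.strip_prefix('!') {
        Some(rest) => ("!", rest),
        None => ("", line),
    };
    match body.strip_suffix('/') {
        Some(stem) if !stem.is_empty() => vec![
            format!("{neg}{stem}/"),
            format!("{neg}{stem}/**"),
        ],
        // No trailing slash (or a bare "/"): pass the line through unchanged.
        _ => vec![line.to_string()],
    }
}
```

Here `dist/` expands to `dist/` and `dist/**`, while `!keep/` expands to `!keep/` and `!keep/**` with a single `!` on each.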
Wires `kebab_parse_code::BUILTIN_BLACKLIST` (6 patterns: node_modules,
target, __pycache__, .venv, venv, env) into `build_overrides()` so the
walker automatically excludes these directories even when the user has
no `.kebabignore`. TDD cycle: 2 failing tests added first, then the
pattern-add loop inserted after the existing kbignore block.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
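As a rough picture of what the builtin blacklist does, the sketch below checks path components against the six names listed in the commit above. It is a simplification: the real walker feeds these names into `build_overrides()` as patterns, and `is_builtin_excluded` is an assumed helper that would also reject a plain file named `env`.

```rust
use std::path::{Component, Path};

/// The six directory names listed in the commit message.
const BUILTIN_BLACKLIST: [&str; 6] =
    ["node_modules", "target", "__pycache__", ".venv", "venv", "env"];

/// Simplified check: true when any normal path component equals a
/// blacklisted name. Does not distinguish files from directories.
fn is_builtin_excluded(path: &Path) -> bool {
    path.components().any(|c| match c {
        Component::Normal(name) => name
            .to_str()
            .map_or(false, |n| BUILTIN_BLACKLIST.contains(&n)),
        _ => false,
    })
}
```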
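Dogfooding item 3: the accepted formats for `workspace.root` were never
documented, so users were left asking "if the path is relative, relative
to what?". Absolute, tilde, env, and relative paths are now all supported,
and the base for relative paths is consistently **the directory that
contains config.toml itself** (independent of the user's cwd).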
## Key changes
- **`kebab_config::expand_path_with_base(raw, data_dir, base_dir)`**
  added. It layers relative-path resolution on top of the existing
  `expand_path` (tilde + env only):
  - tilde / absolute / `${VAR}` inputs ignore base_dir (already absolute)
  - only relative inputs are made absolute via `base_dir.join(...)`
- **`Config.source_dir: Option<PathBuf>`** added (`#[serde(skip)]`).
  `Config::from_file` / `load` stamp it from `path.parent()`. Defaults
  leave it as None (cwd fallback).
- **`Config::resolve_workspace_root()`** helper: resolves against
  source_dir when it is set, otherwise against cwd.
- **Call-site cleanup**:
  - 3 call sites in `kebab-app::lib.rs`:
    `expand_tilde(&app.config.workspace.root)` →
    `app.config.resolve_workspace_root()`
  - `kebab-app::init_workspace`: same change
  - `kebab-source-fs::FsSourceConnector::new`: same change
  - removed the forked local `expand_tilde` + `dirs_home` helpers from
    kebab-source-fs (kebab-config is canonical)
- **`kebab init`** now prepends a path-policy header comment to the
  generated `config.toml` (absolute/tilde/env/relative + relative base =
  config dir).
One `expand_tilde` call remains in kebab-app/lib.rs (storage.data_dir). It
is deferred because the spec marks it out of scope ("expand_tilde
unification P+").
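The resolution rule above can be sketched as follows. This is a simplified, hypothetical helper rather than the real `expand_path_with_base`: it takes `home` as an explicit argument instead of `data_dir`, omits `${VAR}` expansion entirely, and assumes Unix-style paths.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical resolver: tilde and absolute inputs ignore `base_dir`;
/// only relative inputs are joined onto it. `${VAR}` handling is omitted.
fn resolve_with_base(raw: &str, home: &Path, base_dir: &Path) -> PathBuf {
    if raw == "~" {
        return home.to_path_buf();
    }
    if let Some(rest) = raw.strip_prefix("~/") {
        return home.join(rest);
    }
    let path = Path::new(raw);
    if path.is_absolute() {
        return path.to_path_buf();
    }
    // Relative input: anchor it at the directory that holds config.toml.
    base_dir.join(path)
}

/// Mirrors the `path.parent()` stamping: the config file's directory, if any.
fn source_dir_of(config_path: &Path) -> Option<PathBuf> {
    config_path.parent().map(Path::to_path_buf)
}
```

Under these assumptions, a config at `/tmp/cfg.toml` with `root = "kb"` resolves to `/tmp/kb` no matter where the command is run from.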
## Tests
- 4 new unit tests for `expand_path_with_base` (relative→base, absolute
  ignores base, tilde ignores base, ${XDG} ignores base)
- The existing 27 kebab-config tests and the full workspace
  (`cargo test --workspace --no-fail-fast -j 1` exit 0) all pass
- `cargo clippy --workspace --all-targets -- -D warnings` clean
## Docs
- README Configuration section: added one line on the workspace.root
  formats and the relative-base rule
- HANDOFF: 2026-05-03 entry
- spec status planned → in_progress
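## Impact
Existing users: no impact (the default `~/KnowledgeBase` is tilde-rooted
and never takes the relative-path branch). A new user who combines
`--config /tmp/cfg.toml` with `root = "kb"` now gets `/tmp/kb` as the
workspace regardless of cwd. Previously this case resolved against cwd,
which was an invisible foot-gun.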
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>