Token-efficient code analysis for LLMs.
Modern codebases are massive. Even when a model's context window is large enough, dumping raw source buries signal under noise. Distil extracts structure instead of text, reducing context by ~95% while preserving what matters for accurate reasoning.
Instead of feeding raw source files into an LLM context, Distil produces structured analysis at five layers of depth. Each layer adds more detail, so you request only what the task needs:
Raw source (10,000 tokens)
|
v
L1: AST (500 tokens) "What functions exist?"
L2: Calls (800 tokens) "Who calls what?"
L3: CFG (200 tokens) "How complex is this function?"
L4: DFG (300 tokens) "Where does this value flow?"
L5: Slice (150 tokens) "What affects line 42?"
# From source (pnpm monorepo)
git clone https://github.com/joshuaboys/distil.git
cd distil
pnpm install && pnpm build
# Link globally
pnpm -F @distil/cli link --global

# Get the lay of the land
distil tree .
# Understand a specific file's structure
distil extract src/auth.ts

# Who calls this function? What might break?
distil impact validateToken .
# What data flows through it?
distil dfg src/auth.ts validateToken

# Build the full call graph to see dependencies
distil calls .
# Check complexity before deciding what to simplify
distil cfg src/auth.ts validateToken

# What code affects line 42? (backward slice)
distil slice src/auth.ts validateToken 42
# What does line 42 affect? (forward slice)
distil slice src/auth.ts validateToken 42 --forward

| Command | Layer | Description |
|---|---|---|
| `distil tree [path]` | - | File tree structure |
| `distil extract <file>` | L1 | Functions, classes, imports, signatures |
| `distil calls [path]` | L2 | Build project call graph |
| `distil impact <func> [path]` | L2 | Find all callers of a function |
| `distil cfg <file> <func>` | L3 | Control flow graph with complexity |
| `distil dfg <file> <func>` | L4 | Data flow graph with def-use chains |
| `distil slice <file> <func> <line>` | L5 | Program slice (backward/forward) |
All commands support --json for programmatic use. Function names use fuzzy matching.
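As a rough illustration of what the `impact` query computes, the sketch below inverts a call graph and walks it upward from a target function. The `CallGraph` shape, the `findCallers` helper, and the example function names are hypothetical and are not Distil's actual data model or API. Whether Distil reports transitive callers is not stated here; this sketch assumes it does.

```typescript
// Hypothetical call-graph shape: caller name -> names of the functions it calls.
// This is an illustration only, not Distil's actual data model.
type CallGraph = Record<string, string[]>;

// Collect every function that directly or transitively calls `target`.
function findCallers(graph: CallGraph, target: string): string[] {
  // Invert the graph: callee -> direct callers.
  const callersOf = new Map<string, string[]>();
  for (const [caller, callees] of Object.entries(graph)) {
    for (const callee of callees) {
      const list = callersOf.get(callee) ?? [];
      list.push(caller);
      callersOf.set(callee, list);
    }
  }

  // Breadth-first walk up the inverted edges.
  const seen = new Set<string>();
  const queue: string[] = [target];
  while (queue.length > 0) {
    const current = queue.shift() as string;
    for (const caller of callersOf.get(current) ?? []) {
      if (!seen.has(caller)) {
        seen.add(caller);
        queue.push(caller);
      }
    }
  }
  return Array.from(seen).sort();
}

// Made-up project: router -> handleLogin -> validateToken, and so on.
const exampleGraph: CallGraph = {
  handleLogin: ["validateToken", "loadUser"],
  refreshSession: ["validateToken"],
  router: ["handleLogin"],
  loadUser: [],
};
const impacted = findCallers(exampleGraph, "validateToken");
```

Here `impacted` contains `handleLogin`, `refreshSession`, and `router`: the functions that could be affected by a change to `validateToken`.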
L1-L3 (AST, Call Graph, CFG) are structurally exact — they reflect the parse tree and control flow as written.
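The text does not say which complexity metric the L3 output reports. Assuming it is McCabe's cyclomatic complexity, the number can be derived directly from the graph as edges minus nodes plus two. The `Cfg` shape below is a hypothetical stand-in, not Distil's output format.

```typescript
// Hypothetical CFG shape for illustration; not Distil's output format.
interface Cfg {
  nodes: string[];
  edges: Array<[string, string]>;
}

// McCabe cyclomatic complexity for a single connected CFG: E - N + 2.
function cyclomaticComplexity(cfg: Cfg): number {
  return cfg.edges.length - cfg.nodes.length + 2;
}

// CFG of a function with one if/else:
//   if (token.expired) { refresh(); } else { accept(); }
//   return;
const ifElseCfg: Cfg = {
  nodes: ["entry", "cond", "then", "else", "exit"],
  edges: [
    ["entry", "cond"],
    ["cond", "then"],
    ["cond", "else"],
    ["then", "exit"],
    ["else", "exit"],
  ],
};
const ifElseComplexity = cyclomaticComplexity(ifElseCfg);
```

With five edges and five nodes, `ifElseComplexity` is 2, matching the two independent paths through the branch.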
L4 (DFG) approximates reaching definitions rather than computing them exactly. For each variable use, Distil connects it to the most recent definition by source line number instead of running a full control-flow-aware reaching-definitions analysis. This means:
- Definitions in mutually exclusive branches (e.g. if/else) may not be distinguished
- Loop-carried dependencies use the nearest prior definition heuristic
- Multiple reaching definitions are marked `isMayReach: true`
This approximation can introduce both false positives (spurious def-use edges) and false negatives (missing valid edges), especially in the presence of complex control flow such as branching and loops.
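To make the line-order heuristic concrete, here is a minimal sketch of linking each use to the nearest earlier definition of the same variable. The `Def`, `Use`, and `DefUseEdge` types and the `linkByNearestPriorDef` function are hypothetical, the sketch leaves out the `isMayReach` marking, and it is not Distil's implementation.

```typescript
// Hypothetical records for illustration; not Distil's internal types.
interface Def { variable: string; line: number; }
interface Use { variable: string; line: number; }
interface DefUseEdge { variable: string; defLine: number; useLine: number; }

// Link each use to the closest definition of the same variable that
// appears on an earlier source line, ignoring control flow entirely.
function linkByNearestPriorDef(defs: Def[], uses: Use[]): DefUseEdge[] {
  const result: DefUseEdge[] = [];
  for (const use of uses) {
    let best: Def | undefined;
    for (const def of defs) {
      const isEarlier = def.variable === use.variable && def.line < use.line;
      if (isEarlier && (best === undefined || def.line > best.line)) {
        best = def;
      }
    }
    if (best !== undefined) {
      result.push({ variable: use.variable, defLine: best.line, useLine: use.line });
    }
  }
  return result;
}

// Source being analysed (line numbers on the left):
// 1: let role;
// 2: if (isAdmin) {
// 3:   role = "admin";
// 4: } else {
// 5:   role = "guest";
// 6: }
// 7: return role;
const roleDefs: Def[] = [
  { variable: "role", line: 3 },
  { variable: "role", line: 5 },
];
const roleUses: Use[] = [{ variable: "role", line: 7 }];
const roleEdges = linkByNearestPriorDef(roleDefs, roleUses);
```

In this simplified form only the definition on line 5 is linked to the use on line 7. The definition on line 3 also reaches line 7 when `isAdmin` is true, so that edge is a false negative of the kind described above.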
L5 (PDG/Slicing) inherits L4's approximation: slices may include statements that are not strictly relevant and may miss some that are. Treat them as a practical aid rather than a fully sound program analysis.
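A backward slice can be pictured as a transitive walk over dependence edges, starting from the line of interest. The sketch below assumes a hypothetical `Dependence` edge list with made-up line numbers. It is not Distil's PDG representation, and it shows why any edge missed at L4 also drops everything reachable only through that edge.

```typescript
// Hypothetical dependence edge: the statement at `to` depends on the statement at `from`.
interface Dependence { from: number; to: number; }

// Backward slice: the criterion line plus every line it transitively depends on.
function backwardSlice(deps: Dependence[], criterion: number): number[] {
  const inSlice = new Set<number>([criterion]);
  const worklist: number[] = [criterion];
  while (worklist.length > 0) {
    const line = worklist.pop() as number;
    for (const dep of deps) {
      if (dep.to === line && !inSlice.has(dep.from)) {
        inSlice.add(dep.from);
        worklist.push(dep.from);
      }
    }
  }
  return Array.from(inSlice).sort((a, b) => a - b);
}

// Made-up dependence edges for a function containing line 42.
const sliceDeps: Dependence[] = [
  { from: 38, to: 40 },
  { from: 40, to: 42 },
  { from: 39, to: 41 }, // does not feed line 42
  { from: 36, to: 38 },
];
const sliceFor42 = backwardSlice(sliceDeps, 42);
```

Here `sliceFor42` is lines 36, 38, 40, and 42. A forward slice follows the same pattern with the edge direction reversed, collecting the lines that depend on the criterion.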
| Language | L1 | L2 | L3-L5 |
|---|---|---|---|
| TypeScript/JavaScript | yes | yes | yes |
| Python | planned | - | - |
| Rust | planned | - | - |
packages/
distil-core # Analysis engine (tree-sitter parsers, L1-L5 extractors)
distil-cli # Command-line interface (Commander.js)
distil-mcp # MCP server for editor/agent integration
Distil CLI / MCP Server
|
v
Distil Analysis Engine
L1 -> L2 -> L3 -> L4 -> L5
|
v
tree-sitter
(language-specific grammars)
Distil includes an MCP server for editor and agent integration. Start it with:
distil mcp

Or add to your editor's MCP settings:
{
"mcpServers": {
"distil": {
"command": "distil",
"args": ["mcp"]
}
}
}

Available MCP tools:
| Tool | Description |
|---|---|
| `distil_extract` | L1: Extract file structure (functions, classes) |
| `distil_calls` | L2: Build project call graph |
| `distil_impact` | L2: Find all callers of a function |
| `distil_cfg` | L3: Control flow graph with complexity metrics |
| `distil_dfg` | L4: Data flow graph with def-use chains |
| `distil_slice` | L5: Program slice (backward/forward) |
Workflow prompts: distil_before_editing, distil_debug_line, distil_refactor_impact
Planned features:
- Semantic search -- natural language code search via embeddings
- Index warming -- pre-build all analysis layers for fast queries
- Monorepo support -- per-package analysis with cross-package call graphs
Roadmap details and module specs are in plans/ using APS format. Start at plans/index.aps.md.
pnpm install # Install dependencies
pnpm build # Build all packages
pnpm test # Run all tests
pnpm typecheck # Type check

Apache 2.0