Skip to content

Commit 687f650

Browse files
shivasuryaclaude
andcommitted
feat: add 10 CVE-derived security rules from advisory research
Add 10 new Python security rules derived from analyzing 53 real CVEs across the GitHub Advisory Database. Each rule includes full meta.yaml, positive/negative test cases, and CVE references. New rules: - SEC-110: Unsafe tarfile.extractall() path traversal (CVE-2007-4559, GHSA-fhff) - SEC-113: HTTP_HOST header used for access control (GHSA-q485) - SEC-123: Jinja2 SSTI via from_string/Template (GHSA-pxrr) - SEC-133: RSA PKCS1v15 deprecated padding (GHSA-7432, Bleichenbacher) - SEC-136: Dynamic module import from user input (GHSA-cwxj, CWE-470) - SEC-138: Cypher/graph query injection (GHSA-gg5m, CWE-943) - SEC-139: Unsafe msgpack deserialization (GHSA-g48c, CWE-502) - SEC-144: CORS wildcard origin with credentials (GHSA-9jfm, CWE-942) - SEC-157: ZipFile extract path traversal / zip-slip (GHSA-8rrh) - SEC-162: Symlink following arbitrary file access (GHSA-g925, CWE-59) All rules verified: positive tests detect, negative tests produce 0 FP. SEC-138 tightened from broad calls("*.run") to Neo4jModule.method() + targeted py2neo patterns after verification found excessive FP risk. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent 4dffef2 commit 687f650

41 files changed

Lines changed: 3127 additions & 0 deletions

File tree

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.
Lines changed: 347 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,347 @@
1+
# Bug: Relative Import FQN Mismatch in Deep Attribute Chain Method Lookup
2+
3+
**Component**: `sast-engine/graph/callgraph/resolution/attribute.go` + `extraction/attributes.go`
4+
**Function**: `resolveMethodOnType()` (line 170) — fails due to FQN constructed from attribute registry not matching callgraph key
5+
**Introduced**: Commit `e601234` (deep chain resolution) — chain walk works, final step fails on relative imports
6+
**Severity**: HIGH — blocks L1 type-inferred source matching for projects using relative imports
7+
**Reproduces on**: pyload (confirmed), any project using `from .submodule import Class` pattern
8+
**Does NOT reproduce on**: flat multi-file projects, projects using absolute imports
9+
10+
---
11+
12+
## Summary
13+
14+
After commit `e601234`, deep attribute chains like `self.pyload.config.get()` correctly resolve the intermediate types through the chain walk (Step 2). The chain `self → pyload → config` resolves to type `pyload.config.parser.ConfigParser`. However, the final method lookup (Step 3) fails because `resolveMethodOnType` constructs `pyload.config.parser.ConfigParser.get` and checks `callGraph.Functions`, which stores the key as `pyload.core.config.parser.ConfigParser.get`.
15+
16+
The root cause is a **relative import FQN resolution mismatch**:
17+
- **Attribute registry** stores type as `pyload.config.parser.ConfigParser` (resolved from relative import `from .config.parser import ConfigParser` — the `.` is resolved relative to the importing module's parent, dropping the `core` segment)
18+
- **Callgraph** stores function as `pyload.core.config.parser.ConfigParser.get` (derived from the actual file path `pyload/core/config/parser.py`)
19+
20+
This does NOT reproduce on flat projects or absolute imports because both systems agree on the FQN. It only triggers when the class is imported via Python relative imports (`from .x.y import Z`).
21+
22+
---
23+
24+
## Reproduction
25+
26+
**NOTE**: This bug does NOT reproduce on flat multi-file projects with absolute imports.
27+
It ONLY reproduces when relative imports are used (`from .submodule import Class`).
28+
29+
### Scenario A: Flat project — WORKS (no bug)
30+
31+
```python
32+
# /tmp/test-flat/config.py
33+
class Config:
34+
def __init__(self):
35+
self.data = {}
36+
def get(self, section, key):
37+
return self.data.get(section, {}).get(key, "")
38+
39+
# /tmp/test-flat/core.py
40+
from config import Config
41+
class Core:
42+
def __init__(self):
43+
self.config = Config()
44+
45+
# /tmp/test-flat/manager.py
46+
import subprocess
47+
from core import Core
48+
class Manager:
49+
def __init__(self):
50+
self.core = Core()
51+
def run_command(self):
52+
cmd = self.core.config.get("commands", "startup")
53+
subprocess.run(cmd)
54+
```
55+
56+
```bash
57+
pathfinder scan -r RULE -p /tmp/test-flat/ --skip-tests=false
58+
# Result: DETECTED — self.core.config.get() resolves to config.Config.get ✅
59+
# Both attribute registry and callgraph use "config.Config.get"
60+
```
61+
62+
### Scenario B: Package with relative imports — FAILS (the bug)
63+
64+
```python
65+
# /tmp/test-pkg/myapp/__init__.py
66+
(empty)
67+
68+
# /tmp/test-pkg/myapp/config/parser.py
69+
class ConfigParser:
70+
def __init__(self, userdir):
71+
self.config = {}
72+
def get(self, section, option):
73+
return self.config[section][option]["value"]
74+
75+
# /tmp/test-pkg/myapp/core/__init__.py
76+
from ..config.parser import ConfigParser # ← RELATIVE IMPORT
77+
class Core:
78+
def __init__(self, userdir):
79+
self.config = ConfigParser(userdir)
80+
81+
# /tmp/test-pkg/myapp/managers/thread_manager.py
82+
import subprocess
83+
class ThreadManager:
84+
def __init__(self, core):
85+
self.core = core
86+
def reconnect(self):
87+
script = self.core.config.get("reconnect", "script")
88+
subprocess.run(script)
89+
```
90+
91+
```bash
92+
pathfinder scan -r RULE -p /tmp/test-pkg/ --skip-tests=false --debug
93+
# Result: 0 findings ❌
94+
# Debug shows:
95+
# method get not found on type config.parser.ConfigParser
96+
#
97+
# Attribute registry FQN: config.parser.ConfigParser (from relative import resolution)
98+
# Callgraph function FQN: myapp.config.parser.ConfigParser.get (from file path)
99+
# MISMATCH: "config.parser.ConfigParser.get" != "myapp.config.parser.ConfigParser.get"
100+
```
101+
102+
### Scenario C: Real-world CVE (pyload) — CONFIRMED BUG
103+
104+
```bash
105+
git clone --depth 1 https://github.com/pyload/pyload.git /tmp/pyload
106+
pathfinder scan -r RULE -p /tmp/pyload/src/ --skip-tests=false --debug
107+
```
108+
109+
The relevant pyload code:
110+
111+
```python
112+
# pyload/core/__init__.py:117
113+
from .config.parser import ConfigParser # ← RELATIVE IMPORT (dot prefix)
114+
115+
# pyload/core/__init__.py:124
116+
self.config = ConfigParser(self.userdir)
117+
118+
# pyload/core/managers/thread_manager.py:176
119+
reconnect_script = self.pyload.config.get("reconnect", "script")
120+
# ↑ 3-level chain: self → pyload → config → get()
121+
122+
# pyload/core/managers/thread_manager.py:199
123+
subprocess.run(reconnect_script) # ← CVE GHSA-r7mc-x6x7-cqxx
124+
```
125+
126+
Debug output confirms:
127+
```
128+
Deep chains (3+ levels): 0 (0.0%) ← chains ARE resolving (was 756 before fix)
129+
130+
Custom class samples:
131+
- method get not found on type pyload.config.parser.ConfigParser
132+
- method set not found on type pyload.config.parser.ConfigParser
133+
- method save not found on type pyload.config.parser.ConfigParser
134+
```
135+
136+
**The mismatch**:
137+
- Attribute registry resolves `self.config = ConfigParser(...)` where `ConfigParser` was imported via `from .config.parser import ConfigParser`
138+
- The `.config.parser` relative import resolves to `pyload.config.parser` (parent of `core/__init__.py` is `pyload/`, then append `config.parser`)
139+
- But the actual file is at `pyload/core/config/parser.py`, so the callgraph stores it as `pyload.core.config.parser.ConfigParser.get`
140+
- **Missing `core` segment**: `pyload.config.parser` vs `pyload.core.config.parser`
141+
142+
---
143+
144+
## Root Cause Analysis
145+
146+
### The failing code path
147+
148+
**File**: `sast-engine/graph/callgraph/resolution/attribute.go`, lines 195-209
149+
150+
```go
151+
func resolveMethodOnType(
152+
typeFQN string, // "pyload.config.parser.ConfigParser"
153+
methodName string, // "get"
154+
...
155+
) (string, bool, *core.TypeInfo) {
156+
// Lines 177-193: Builtin check — skipped (not builtins.*)
157+
158+
// Line 196: Construct method FQN
159+
methodFQN := typeFQN + "." + methodName
160+
// → "pyload.config.parser.ConfigParser.get"
161+
162+
// Lines 198-209: Check callgraph
163+
if callGraph != nil {
164+
if node := callGraph.Functions[methodFQN]; node != nil {
165+
// → node is NIL — method not found
166+
return methodFQN, true, &core.TypeInfo{...}
167+
}
168+
}
169+
170+
// Line 213: Falls through to failure
171+
attributeFailureStats.CustomClassUnsupported++
172+
// → "method get not found on type pyload.config.parser.ConfigParser"
173+
return "", false, nil
174+
}
175+
```
176+
177+
### The FQN mismatch (CONFIRMED)
178+
179+
Verified on pyload's codebase:
180+
181+
| System | FQN for ConfigParser | Source |
182+
|--------|---------------------|--------|
183+
| **Attribute registry** | `pyload.config.parser.ConfigParser` | Resolved from relative import `from .config.parser import ConfigParser` in `pyload/core/__init__.py` |
184+
| **Callgraph** | `pyload.core.config.parser.ConfigParser` | Derived from file path `pyload/core/config/parser.py` |
185+
186+
The relative import `from .config.parser import ConfigParser`:
187+
- Is in file `pyload/core/__init__.py`
188+
- The `.` resolves relative to the **parent package** of `core/`, which is `pyload/`
189+
- So `.config.parser``pyload.config.parser` (going up from `core/` to `pyload/`, then down to `config/parser`)
190+
- But the actual file is `pyload/core/config/parser.py` → FQN should be `pyload.core.config.parser`
191+
- **The `core` segment is lost during relative import resolution**
192+
193+
This is a bug in how the attribute extractor resolves relative import paths to FQNs. It resolves `.config.parser` as `{parent_package}.config.parser` but the parent package is computed incorrectly — it goes one level too high.
194+
195+
### How to diagnose
196+
197+
Add temporary debug logging to `resolveMethodOnType` to print:
198+
1. The constructed `methodFQN`
199+
2. All keys in `callGraph.Functions` that contain the class name
200+
201+
```go
202+
// Temporary debug — add before line 198
203+
if strings.Contains(typeFQN, "Config") {
204+
log.Printf("[DEBUG] Looking for method %q on type %q", methodName, typeFQN)
205+
log.Printf("[DEBUG] Constructed FQN: %q", methodFQN)
206+
for key := range callGraph.Functions {
207+
if strings.Contains(key, "ConfigParser") && strings.Contains(key, "get") {
208+
log.Printf("[DEBUG] Callgraph has: %q", key)
209+
}
210+
}
211+
}
212+
```
213+
214+
This will immediately reveal the FQN format stored in the callgraph vs what's constructed.
215+
216+
---
217+
218+
## Likely Fixes
219+
220+
### Fix A (RECOMMENDED): Fix relative import FQN resolution in attribute extractor
221+
222+
The root cause is in how `resolveClassNameForChain()` or the attribute extraction resolves `from .config.parser import ConfigParser`. The `.` prefix should resolve relative to the **current package** (`pyload.core`), producing `pyload.core.config.parser.ConfigParser`. Instead, it resolves relative to the **parent package** (`pyload`), producing `pyload.config.parser.ConfigParser`.
223+
224+
The fix should align relative import resolution with Python's actual semantics:
225+
- `from .x import Y` in `pyload/core/__init__.py``pyload.core.x.Y` (same package)
226+
- `from ..x import Y` in `pyload/core/__init__.py``pyload.x.Y` (parent package)
227+
228+
**Where to look**:
229+
- `extraction/attributes.go` — where `class:ConfigParser` placeholder is created
230+
- `resolution/attribute.go:resolveClassNameForChain()` — where placeholder is resolved to FQN
231+
- The ImportMap construction — how `from .config.parser import ConfigParser` is recorded
232+
233+
**Pros**: Correct fix at the source, all downstream consumers get correct FQNs
234+
**Cons**: Need to understand how ImportMap handles relative imports
235+
236+
### Fix B: Suffix-based fallback in resolveMethodOnType
237+
238+
If the exact `callGraph.Functions[methodFQN]` lookup fails, try suffix matching:
239+
240+
```go
241+
// After exact lookup fails (line 209), try suffix matching
242+
if callGraph != nil {
243+
// Extract "ConfigParser.get" from "pyload.config.parser.ConfigParser.get"
244+
suffix := extractClassAndMethod(typeFQN, methodName)
245+
for fqn, node := range callGraph.Functions {
246+
if strings.HasSuffix(fqn, "."+suffix) &&
247+
(node.Type == "method" || node.Type == "function_definition") {
248+
return fqn, true, &core.TypeInfo{
249+
TypeFQN: typeFQN,
250+
Confidence: float32(attrConfidence * 0.85), // slight confidence penalty
251+
Source: "self_attribute_custom_class_fuzzy",
252+
}
253+
}
254+
}
255+
}
256+
```
257+
258+
**Pros**: Quick fix, handles any FQN mismatch without fixing root cause
259+
**Cons**: O(n) scan of callgraph on every miss, could match wrong class if names collide (e.g., `app.Config.get` vs `lib.Config.get`). Better as a temporary workaround.
260+
261+
### Fix C: Build a class→method index keyed by short class name
262+
263+
During callgraph construction, build a reverse index:
264+
```go
265+
type ClassMethodIndex map[string]map[string]string
266+
// shortClassName → methodName → full function FQN
267+
// "ConfigParser" → "get" → "pyload.core.config.parser.ConfigParser.get"
268+
269+
// During callgraph build:
270+
for fqn, node := range callGraph.Functions {
271+
if node.Type == "method" {
272+
className := extractShortClassName(fqn) // "ConfigParser"
273+
methodName := extractMethodName(fqn) // "get"
274+
index[className][methodName] = fqn
275+
}
276+
}
277+
```
278+
279+
Then `resolveMethodOnType` extracts the short class name from the type FQN and uses this index.
280+
281+
**Pros**: O(1) lookup, tolerates any FQN format difference
282+
**Cons**: Ambiguous if two classes share the same short name (need disambiguation by module proximity)
283+
284+
---
285+
286+
## Verification
287+
288+
### Test 1: Simple project (above)
289+
```bash
290+
pathfinder scan -r /tmp/test-rule -p /tmp/test-project/ --skip-tests=false
291+
# Expected after fix: 1 finding at manager.py:12
292+
```
293+
294+
### Test 2: Pyload CVE (GHSA-r7mc-x6x7-cqxx)
295+
```bash
296+
# L1 rule with ConfigParserType.method("get") as source
297+
pathfinder scan -r /tmp/test-rule-pyload -p /tmp/pyload/src/ --skip-tests=false
298+
# Expected after fix: 1 finding at thread_manager.py:199
299+
# Currently: 0 findings (method lookup fails)
300+
```
301+
302+
### Test 3: Existing tests must pass
303+
```bash
304+
cd sast-engine && go test ./...
305+
# All existing tests must still pass
306+
```
307+
308+
### Test 4: Verify no regressions on builtin types
309+
```bash
310+
# self.data.get() where data is builtins.dict should still resolve
311+
# self.session.execute() where session is sqlite3.Connection should still resolve
312+
```
313+
314+
---
315+
316+
## Impact
317+
318+
This is the **last blocker for L1 type-inferred source matching** on project-internal classes. The chain walk (Step 2) is fixed. The method lookup (Step 3) is the remaining issue.
319+
320+
| What | Status |
321+
|------|--------|
322+
| Deep chain walk (self.a.b.c) | Fixed in `e601234` |
323+
| Intermediate type resolution | Working (resolves to correct class FQN) |
324+
| Method lookup on builtins | Working (dict.get, list.append etc.) |
325+
| Method lookup on stdlib | Working (via CDN registry) |
326+
| **Method lookup on project classes** | **BROKEN** (FQN mismatch) |
327+
328+
Once fixed, the following CVE rules would achieve **L1** (both source and sink type-inferred):
329+
330+
| Rule | Source | Sink | Current | After Fix |
331+
|------|--------|------|---------|-----------|
332+
| SEC-111 | `ConfigParser.method("get")` | `subprocess.method("run")` | L2 | **L1** |
333+
| SEC-138 | Graph query sources | `Neo4jDriver.method("run")` | L4 | **L2+** |
334+
| Any rule with project-internal class sources ||| Blocked | **Unblocked** |
335+
336+
---
337+
338+
## Files to Inspect
339+
340+
| File | What to check |
341+
|------|--------------|
342+
| `resolution/attribute.go:196-209` | `resolveMethodOnType` — the failing lookup |
343+
| `extraction/attributes.go` | How `ClassFQN` is set in the attribute registry |
344+
| `builder/builder.go` | How function FQNs are set in `callGraph.Functions` |
345+
| `core/callgraph.go` | `Functions` map key format |
346+
347+
The fix is likely small — it's a string format mismatch between two subsystems that were built independently and now need to agree on FQN conventions.

0 commit comments

Comments
 (0)