|
| 1 | +# Bug: Relative Import FQN Mismatch in Deep Attribute Chain Method Lookup |
| 2 | + |
| 3 | +**Component**: `sast-engine/graph/callgraph/resolution/attribute.go` + `extraction/attributes.go` |
| 4 | +**Function**: `resolveMethodOnType()` (line 170) — fails due to FQN constructed from attribute registry not matching callgraph key |
| 5 | +**Introduced**: Commit `e601234` (deep chain resolution) — chain walk works, final step fails on relative imports |
| 6 | +**Severity**: HIGH — blocks L1 type-inferred source matching for projects using relative imports |
| 7 | +**Reproduces on**: pyload (confirmed), any project using `from .submodule import Class` pattern |
| 8 | +**Does NOT reproduce on**: flat multi-file projects, projects using absolute imports |
| 9 | + |
| 10 | +--- |
| 11 | + |
| 12 | +## Summary |
| 13 | + |
| 14 | +After commit `e601234`, deep attribute chains like `self.pyload.config.get()` correctly resolve the intermediate types through the chain walk (Step 2). The chain `self → pyload → config` resolves to type `pyload.config.parser.ConfigParser`. However, the final method lookup (Step 3) fails because `resolveMethodOnType` constructs `pyload.config.parser.ConfigParser.get` and checks `callGraph.Functions`, which stores the key as `pyload.core.config.parser.ConfigParser.get`. |
| 15 | + |
| 16 | +The root cause is a **relative import FQN resolution mismatch**: |
| 17 | +- **Attribute registry** stores type as `pyload.config.parser.ConfigParser` (resolved from relative import `from .config.parser import ConfigParser` — the `.` is resolved relative to the importing module's parent, dropping the `core` segment) |
| 18 | +- **Callgraph** stores function as `pyload.core.config.parser.ConfigParser.get` (derived from the actual file path `pyload/core/config/parser.py`) |
| 19 | + |
| 20 | +This does NOT reproduce on flat projects or absolute imports because both systems agree on the FQN. It only triggers when the class is imported via Python relative imports (`from .x.y import Z`). |
| 21 | + |
| 22 | +--- |
| 23 | + |
| 24 | +## Reproduction |
| 25 | + |
| 26 | +**NOTE**: This bug does NOT reproduce on flat multi-file projects with absolute imports. |
| 27 | +It ONLY reproduces when relative imports are used (`from .submodule import Class`). |
| 28 | + |
| 29 | +### Scenario A: Flat project — WORKS (no bug) |
| 30 | + |
| 31 | +```python |
| 32 | +# /tmp/test-flat/config.py |
| 33 | +class Config: |
| 34 | + def __init__(self): |
| 35 | + self.data = {} |
| 36 | + def get(self, section, key): |
| 37 | + return self.data.get(section, {}).get(key, "") |
| 38 | + |
| 39 | +# /tmp/test-flat/core.py |
| 40 | +from config import Config |
| 41 | +class Core: |
| 42 | + def __init__(self): |
| 43 | + self.config = Config() |
| 44 | + |
| 45 | +# /tmp/test-flat/manager.py |
| 46 | +import subprocess |
| 47 | +from core import Core |
| 48 | +class Manager: |
| 49 | + def __init__(self): |
| 50 | + self.core = Core() |
| 51 | + def run_command(self): |
| 52 | + cmd = self.core.config.get("commands", "startup") |
| 53 | + subprocess.run(cmd) |
| 54 | +``` |
| 55 | + |
| 56 | +```bash |
| 57 | +pathfinder scan -r RULE -p /tmp/test-flat/ --skip-tests=false |
| 58 | +# Result: DETECTED — self.core.config.get() resolves to config.Config.get ✅ |
| 59 | +# Both attribute registry and callgraph use "config.Config.get" |
| 60 | +``` |
| 61 | + |
| 62 | +### Scenario B: Package with relative imports — FAILS (the bug) |
| 63 | + |
| 64 | +```python |
| 65 | +# /tmp/test-pkg/myapp/__init__.py |
| 66 | +(empty) |
| 67 | + |
| 68 | +# /tmp/test-pkg/myapp/config/parser.py |
| 69 | +class ConfigParser: |
| 70 | + def __init__(self, userdir): |
| 71 | + self.config = {} |
| 72 | + def get(self, section, option): |
| 73 | + return self.config[section][option]["value"] |
| 74 | + |
| 75 | +# /tmp/test-pkg/myapp/core/__init__.py |
| 76 | +from ..config.parser import ConfigParser # ← RELATIVE IMPORT |
| 77 | +class Core: |
| 78 | + def __init__(self, userdir): |
| 79 | + self.config = ConfigParser(userdir) |
| 80 | + |
| 81 | +# /tmp/test-pkg/myapp/managers/thread_manager.py |
| 82 | +import subprocess |
| 83 | +class ThreadManager: |
| 84 | + def __init__(self, core): |
| 85 | + self.core = core |
| 86 | + def reconnect(self): |
| 87 | + script = self.core.config.get("reconnect", "script") |
| 88 | + subprocess.run(script) |
| 89 | +``` |
| 90 | + |
| 91 | +```bash |
| 92 | +pathfinder scan -r RULE -p /tmp/test-pkg/ --skip-tests=false --debug |
| 93 | +# Result: 0 findings ❌ |
| 94 | +# Debug shows: |
| 95 | +# method get not found on type config.parser.ConfigParser |
| 96 | +# |
| 97 | +# Attribute registry FQN: config.parser.ConfigParser (from relative import resolution) |
| 98 | +# Callgraph function FQN: myapp.config.parser.ConfigParser.get (from file path) |
| 99 | +# MISMATCH: "config.parser.ConfigParser.get" != "myapp.config.parser.ConfigParser.get" |
| 100 | +``` |
| 101 | + |
| 102 | +### Scenario C: Real-world CVE (pyload) — CONFIRMED BUG |
| 103 | + |
| 104 | +```bash |
| 105 | +git clone --depth 1 https://github.com/pyload/pyload.git /tmp/pyload |
| 106 | +pathfinder scan -r RULE -p /tmp/pyload/src/ --skip-tests=false --debug |
| 107 | +``` |
| 108 | + |
| 109 | +The relevant pyload code: |
| 110 | + |
| 111 | +```python |
| 112 | +# pyload/core/__init__.py:117 |
| 113 | +from .config.parser import ConfigParser # ← RELATIVE IMPORT (dot prefix) |
| 114 | + |
| 115 | +# pyload/core/__init__.py:124 |
| 116 | +self.config = ConfigParser(self.userdir) |
| 117 | + |
| 118 | +# pyload/core/managers/thread_manager.py:176 |
| 119 | +reconnect_script = self.pyload.config.get("reconnect", "script") |
| 120 | +# ↑ 3-level chain: self → pyload → config → get() |
| 121 | + |
| 122 | +# pyload/core/managers/thread_manager.py:199 |
| 123 | +subprocess.run(reconnect_script) # ← CVE GHSA-r7mc-x6x7-cqxx |
| 124 | +``` |
| 125 | + |
| 126 | +Debug output confirms: |
| 127 | +``` |
| 128 | +Deep chains (3+ levels): 0 (0.0%) ← chains ARE resolving (was 756 before fix) |
| 129 | +
|
| 130 | +Custom class samples: |
| 131 | + - method get not found on type pyload.config.parser.ConfigParser |
| 132 | + - method set not found on type pyload.config.parser.ConfigParser |
| 133 | + - method save not found on type pyload.config.parser.ConfigParser |
| 134 | +``` |
| 135 | + |
| 136 | +**The mismatch**: |
| 137 | +- Attribute registry resolves `self.config = ConfigParser(...)` where `ConfigParser` was imported via `from .config.parser import ConfigParser` |
| 138 | +- The `.config.parser` relative import resolves to `pyload.config.parser` (parent of `core/__init__.py` is `pyload/`, then append `config.parser`) |
| 139 | +- But the actual file is at `pyload/core/config/parser.py`, so the callgraph stores it as `pyload.core.config.parser.ConfigParser.get` |
| 140 | +- **Missing `core` segment**: `pyload.config.parser` vs `pyload.core.config.parser` |
| 141 | + |
| 142 | +--- |
| 143 | + |
| 144 | +## Root Cause Analysis |
| 145 | + |
| 146 | +### The failing code path |
| 147 | + |
| 148 | +**File**: `sast-engine/graph/callgraph/resolution/attribute.go`, lines 195-209 |
| 149 | + |
| 150 | +```go |
| 151 | +func resolveMethodOnType( |
| 152 | + typeFQN string, // "pyload.config.parser.ConfigParser" |
| 153 | + methodName string, // "get" |
| 154 | + ... |
| 155 | +) (string, bool, *core.TypeInfo) { |
| 156 | + // Lines 177-193: Builtin check — skipped (not builtins.*) |
| 157 | + |
| 158 | + // Line 196: Construct method FQN |
| 159 | + methodFQN := typeFQN + "." + methodName |
| 160 | + // → "pyload.config.parser.ConfigParser.get" |
| 161 | + |
| 162 | + // Lines 198-209: Check callgraph |
| 163 | + if callGraph != nil { |
| 164 | + if node := callGraph.Functions[methodFQN]; node != nil { |
| 165 | + // → node is NIL — method not found |
| 166 | + return methodFQN, true, &core.TypeInfo{...} |
| 167 | + } |
| 168 | + } |
| 169 | + |
| 170 | + // Line 213: Falls through to failure |
| 171 | + attributeFailureStats.CustomClassUnsupported++ |
| 172 | + // → "method get not found on type pyload.config.parser.ConfigParser" |
| 173 | + return "", false, nil |
| 174 | +} |
| 175 | +``` |
| 176 | + |
| 177 | +### The FQN mismatch (CONFIRMED) |
| 178 | + |
| 179 | +Verified on pyload's codebase: |
| 180 | + |
| 181 | +| System | FQN for ConfigParser | Source | |
| 182 | +|--------|---------------------|--------| |
| 183 | +| **Attribute registry** | `pyload.config.parser.ConfigParser` | Resolved from relative import `from .config.parser import ConfigParser` in `pyload/core/__init__.py` | |
| 184 | +| **Callgraph** | `pyload.core.config.parser.ConfigParser` | Derived from file path `pyload/core/config/parser.py` | |
| 185 | + |
| 186 | +The relative import `from .config.parser import ConfigParser`: |
| 187 | +- Is in file `pyload/core/__init__.py` |
| 188 | +- The `.` resolves relative to the **parent package** of `core/`, which is `pyload/` |
| 189 | +- So `.config.parser` → `pyload.config.parser` (going up from `core/` to `pyload/`, then down to `config/parser`) |
| 190 | +- But the actual file is `pyload/core/config/parser.py` → FQN should be `pyload.core.config.parser` |
| 191 | +- **The `core` segment is lost during relative import resolution** |
| 192 | + |
| 193 | +This is a bug in how the attribute extractor resolves relative import paths to FQNs. It resolves `.config.parser` as `{parent_package}.config.parser` but the parent package is computed incorrectly — it goes one level too high. |
| 194 | + |
| 195 | +### How to diagnose |
| 196 | + |
| 197 | +Add temporary debug logging to `resolveMethodOnType` to print: |
| 198 | +1. The constructed `methodFQN` |
| 199 | +2. All keys in `callGraph.Functions` that contain the class name |
| 200 | + |
| 201 | +```go |
| 202 | +// Temporary debug — add before line 198 |
| 203 | +if strings.Contains(typeFQN, "Config") { |
| 204 | + log.Printf("[DEBUG] Looking for method %q on type %q", methodName, typeFQN) |
| 205 | + log.Printf("[DEBUG] Constructed FQN: %q", methodFQN) |
| 206 | + for key := range callGraph.Functions { |
| 207 | + if strings.Contains(key, "ConfigParser") && strings.Contains(key, "get") { |
| 208 | + log.Printf("[DEBUG] Callgraph has: %q", key) |
| 209 | + } |
| 210 | + } |
| 211 | +} |
| 212 | +``` |
| 213 | + |
| 214 | +This will immediately reveal the FQN format stored in the callgraph vs what's constructed. |
| 215 | + |
| 216 | +--- |
| 217 | + |
| 218 | +## Likely Fixes |
| 219 | + |
| 220 | +### Fix A (RECOMMENDED): Fix relative import FQN resolution in attribute extractor |
| 221 | + |
| 222 | +The root cause is in how `resolveClassNameForChain()` or the attribute extraction resolves `from .config.parser import ConfigParser`. The `.` prefix should resolve relative to the **current package** (`pyload.core`), producing `pyload.core.config.parser.ConfigParser`. Instead, it resolves relative to the **parent package** (`pyload`), producing `pyload.config.parser.ConfigParser`. |
| 223 | + |
| 224 | +The fix should align relative import resolution with Python's actual semantics: |
| 225 | +- `from .x import Y` in `pyload/core/__init__.py` → `pyload.core.x.Y` (same package) |
| 226 | +- `from ..x import Y` in `pyload/core/__init__.py` → `pyload.x.Y` (parent package) |
| 227 | + |
| 228 | +**Where to look**: |
| 229 | +- `extraction/attributes.go` — where `class:ConfigParser` placeholder is created |
| 230 | +- `resolution/attribute.go:resolveClassNameForChain()` — where placeholder is resolved to FQN |
| 231 | +- The ImportMap construction — how `from .config.parser import ConfigParser` is recorded |
| 232 | + |
| 233 | +**Pros**: Correct fix at the source, all downstream consumers get correct FQNs |
| 234 | +**Cons**: Need to understand how ImportMap handles relative imports |
| 235 | + |
| 236 | +### Fix B: Suffix-based fallback in resolveMethodOnType |
| 237 | + |
| 238 | +If the exact `callGraph.Functions[methodFQN]` lookup fails, try suffix matching: |
| 239 | + |
| 240 | +```go |
| 241 | +// After exact lookup fails (line 209), try suffix matching |
| 242 | +if callGraph != nil { |
| 243 | + // Extract "ConfigParser.get" from "pyload.config.parser.ConfigParser.get" |
| 244 | + suffix := extractClassAndMethod(typeFQN, methodName) |
| 245 | + for fqn, node := range callGraph.Functions { |
| 246 | + if strings.HasSuffix(fqn, "."+suffix) && |
| 247 | + (node.Type == "method" || node.Type == "function_definition") { |
| 248 | + return fqn, true, &core.TypeInfo{ |
| 249 | + TypeFQN: typeFQN, |
| 250 | + Confidence: float32(attrConfidence * 0.85), // slight confidence penalty |
| 251 | + Source: "self_attribute_custom_class_fuzzy", |
| 252 | + } |
| 253 | + } |
| 254 | + } |
| 255 | +} |
| 256 | +``` |
| 257 | + |
| 258 | +**Pros**: Quick fix, handles any FQN mismatch without fixing root cause |
| 259 | +**Cons**: O(n) scan of callgraph on every miss, could match wrong class if names collide (e.g., `app.Config.get` vs `lib.Config.get`). Better as a temporary workaround. |
| 260 | + |
| 261 | +### Fix C: Build a class→method index keyed by short class name |
| 262 | + |
| 263 | +During callgraph construction, build a reverse index: |
| 264 | +```go |
| 265 | +type ClassMethodIndex map[string]map[string]string |
| 266 | +// shortClassName → methodName → full function FQN |
| 267 | +// "ConfigParser" → "get" → "pyload.core.config.parser.ConfigParser.get" |
| 268 | + |
| 269 | +// During callgraph build: |
| 270 | +for fqn, node := range callGraph.Functions { |
| 271 | + if node.Type == "method" { |
| 272 | + className := extractShortClassName(fqn) // "ConfigParser" |
| 273 | + methodName := extractMethodName(fqn) // "get" |
| 274 | + index[className][methodName] = fqn |
| 275 | + } |
| 276 | +} |
| 277 | +``` |
| 278 | + |
| 279 | +Then `resolveMethodOnType` extracts the short class name from the type FQN and uses this index. |
| 280 | + |
| 281 | +**Pros**: O(1) lookup, tolerates any FQN format difference |
| 282 | +**Cons**: Ambiguous if two classes share the same short name (need disambiguation by module proximity) |
| 283 | + |
| 284 | +--- |
| 285 | + |
| 286 | +## Verification |
| 287 | + |
| 288 | +### Test 1: Simple project (above) |
| 289 | +```bash |
| 290 | +pathfinder scan -r /tmp/test-rule -p /tmp/test-project/ --skip-tests=false |
| 291 | +# Expected after fix: 1 finding at manager.py:12 |
| 292 | +``` |
| 293 | + |
| 294 | +### Test 2: Pyload CVE (GHSA-r7mc-x6x7-cqxx) |
| 295 | +```bash |
| 296 | +# L1 rule with ConfigParserType.method("get") as source |
| 297 | +pathfinder scan -r /tmp/test-rule-pyload -p /tmp/pyload/src/ --skip-tests=false |
| 298 | +# Expected after fix: 1 finding at thread_manager.py:199 |
| 299 | +# Currently: 0 findings (method lookup fails) |
| 300 | +``` |
| 301 | + |
| 302 | +### Test 3: Existing tests must pass |
| 303 | +```bash |
| 304 | +cd sast-engine && go test ./... |
| 305 | +# All existing tests must still pass |
| 306 | +``` |
| 307 | + |
| 308 | +### Test 4: Verify no regressions on builtin types |
| 309 | +```bash |
| 310 | +# self.data.get() where data is builtins.dict should still resolve |
| 311 | +# self.session.execute() where session is sqlite3.Connection should still resolve |
| 312 | +``` |
| 313 | + |
| 314 | +--- |
| 315 | + |
| 316 | +## Impact |
| 317 | + |
| 318 | +This is the **last blocker for L1 type-inferred source matching** on project-internal classes. The chain walk (Step 2) is fixed. The method lookup (Step 3) is the remaining issue. |
| 319 | + |
| 320 | +| What | Status | |
| 321 | +|------|--------| |
| 322 | +| Deep chain walk (self.a.b.c) | Fixed in `e601234` | |
| 323 | +| Intermediate type resolution | Working (resolves to correct class FQN) | |
| 324 | +| Method lookup on builtins | Working (dict.get, list.append etc.) | |
| 325 | +| Method lookup on stdlib | Working (via CDN registry) | |
| 326 | +| **Method lookup on project classes** | **BROKEN** (FQN mismatch) | |
| 327 | + |
| 328 | +Once fixed, the following CVE rules would achieve **L1** (both source and sink type-inferred): |
| 329 | + |
| 330 | +| Rule | Source | Sink | Current | After Fix | |
| 331 | +|------|--------|------|---------|-----------| |
| 332 | +| SEC-111 | `ConfigParser.method("get")` | `subprocess.method("run")` | L2 | **L1** | |
| 333 | +| SEC-138 | Graph query sources | `Neo4jDriver.method("run")` | L4 | **L2+** | |
| 334 | +| Any rule with project-internal class sources | — | — | Blocked | **Unblocked** | |
| 335 | + |
| 336 | +--- |
| 337 | + |
| 338 | +## Files to Inspect |
| 339 | + |
| 340 | +| File | What to check | |
| 341 | +|------|--------------| |
| 342 | +| `resolution/attribute.go:196-209` | `resolveMethodOnType` — the failing lookup | |
| 343 | +| `extraction/attributes.go` | How `ClassFQN` is set in the attribute registry | |
| 344 | +| `builder/builder.go` | How function FQNs are set in `callGraph.Functions` | |
| 345 | +| `core/callgraph.go` | `Functions` map key format | |
| 346 | + |
| 347 | +The fix is likely small — it's a string format mismatch between two subsystems that were built independently and now need to agree on FQN conventions. |
0 commit comments