Commit 75a4b10

shivasurya and claude authored
feat: deep attribute chain resolution (3+ levels) for self.attr patterns (#617)
* feat: support deep attribute chain resolution (3+ levels) for self.attr patterns

Replace the hard 2-level limit in ResolveSelfAttributeCall with an iterative chain walker that resolves self.obj.attr.method() patterns up to 6 levels deep. This unblocks type-inferred source resolution for ~30% of self-attribute calls that were previously rejected (e.g., self.pyload.config.get in pyload CVE).

Key changes:
- Iterative chain walk through AttributeRegistry at each level
- Cycle detection via visited set to prevent infinite loops
- Inline class: placeholder resolution during chain walking
- Extracted resolveMethodOnType helper for builtin + custom class dispatch
- Updated builder.go parent-class fallback to handle deeper chains

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* fix: suffix fallback in resolveMethodOnType for relative import FQN mismatch

When Python relative imports (from .submodule import Class) are used, the attribute registry and callgraph can disagree on the full module path prefix. For example, the attribute registry may store "config.parser.ConfigParser" while the callgraph stores "myapp.config.parser.ConfigParser".

Add a ClassName.method suffix fallback in resolveMethodOnType that fires after exact FQN lookup fails. This bridges the gap without modifying the relative import resolution subsystem. Slight confidence penalty (0.85x) signals the fuzzy match.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* test: 100% coverage for deep chain resolution and suffix fallback helpers

Add tests for uncovered paths:
- attr.Type nil during chain walk
- extractClassMethodSuffix with bare class name (no dots)
- resolveClassNameForChain: nil typeEngine, callgraph lookup, registry lookup, not found
- isCallableNode for all node types including nil
- suffix fallback with no-dot typeFQN (bare class name)

All changed/added functions now at 100% coverage.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* feat: resolve deep chain terminal methods via stdlib/thirdparty registry

When the terminal type in a deep attribute chain is a stdlib or third-party type (e.g., sqlite3.Connection, flask.Flask), resolveMethodOnType now consults the CDN-backed stdlib and third-party registries before giving up.

This handles patterns like self.db.conn.execute() where conn is sqlite3.Connection: the chain walker resolves the intermediate types through the attribute registry, and the final method lookup now falls through to the stdlib registry when the callgraph doesn't have the method.

Note: end-to-end resolution still requires the attribute extractor to resolve stdlib function return types into the attribute registry (e.g., sqlite3.connect() → sqlite3.Connection). This is a separate gap in the variable binding → attribute type pipeline.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <[email protected]>
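The chain walk described in the commit message can be illustrated with a minimal sketch. The names `attrTypes` and `walkChain` are hypothetical and exist only for this example; the commit itself looks attributes up through `typeEngine.Attributes.GetAttribute` and also resolves `class:` placeholders, which this sketch leaves out.

```go
package main

import "fmt"

// maxChainDepth mirrors the 6-level limit named in the commit message.
const maxChainDepth = 6

// attrTypes is a hypothetical stand-in for the attribute registry:
// class FQN -> attribute name -> FQN of the attribute's type.
type attrTypes map[string]map[string]string

// walkChain starts at classFQN and follows each attribute in chain to its
// type. It reports false when the chain is too deep, an attribute is
// unknown, or a type is visited twice (treated as a cycle).
func walkChain(reg attrTypes, classFQN string, chain []string) (string, bool) {
	if len(chain) > maxChainDepth {
		return "", false
	}
	current := classFQN
	visited := make(map[string]bool)
	for _, attr := range chain {
		if visited[current] {
			return "", false // type already seen on this walk
		}
		visited[current] = true
		next, ok := reg[current][attr] // indexing a nil inner map is safe
		if !ok {
			return "", false
		}
		current = next
	}
	return current, true
}

func main() {
	reg := attrTypes{
		"app.Plugin": {"core": "app.Core"},
		"app.Core":   {"config": "app.Config"},
		"app.Node":   {"next": "app.Node"},
	}
	// self.core.config.get from a method of app.Plugin.
	fmt.Println(walkChain(reg, "app.Plugin", []string{"core", "config"}))
	// self.next.next.value revisits app.Node, so the walk is rejected.
	fmt.Println(walkChain(reg, "app.Node", []string{"next", "next"}))
}
```

One consequence of keying the visited set on type FQNs is that self-referential structures such as linked-list nodes are rejected after the first hop, even though the chain itself is finite.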
1 parent 975305a commit 75a4b10

3 files changed

Lines changed: 1124 additions & 59 deletions

sast-engine/graph/callgraph/builder/builder.go

Lines changed: 3 additions & 3 deletions
Original file line number | Diff line number | Diff line change
@@ -406,7 +406,7 @@ func BuildCallGraph(codeGraph *graph.CodeGraph, registry *core.ModuleRegistry, p
406406
}
407407

408408
// Resolve the call target to a fully qualified name
409-
targetFQN, resolved, typeInfo := resolveCallTarget(callSite.Target, importMap, registry, job.modulePath, codeGraph, typeEngine, callerFQN, callGraph, logger)
409+
targetFQN, resolved, typeInfo := resolveCallTarget(callSite.Target, importMap, registry, job.modulePath, codeGraph, typeEngine, callerFQN, callGraph, logger)
410410

411411
// Update call site with resolution information
412412
callSite.TargetFQN = targetFQN
@@ -968,9 +968,9 @@ func resolveCallTarget(target string, importMap *core.ImportMap, registry *core.
968968
// For self.attr.method where attr isn't in child class, try parent classes
969969
if typeEngine != nil && typeEngine.ThirdPartyRemote != nil && codeGraph != nil {
970970
attrParts := strings.Split(target, ".")
971-
if len(attrParts) == 3 {
971+
if len(attrParts) >= 3 {
972972
attrName := attrParts[1]
973-
methodOnAttr := attrParts[2]
973+
methodOnAttr := attrParts[len(attrParts)-1]
974974
callerParts := strings.Split(callerFQN, ".")
975975
if len(callerParts) >= 3 {
976976
callerClassName := callerParts[len(callerParts)-2]
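The hunk above stops assuming exactly three segments and takes the method name from the last segment instead. A minimal sketch of that split, assuming a hypothetical helper named `splitSelfTarget` that is not part of the commit:

```go
package main

import (
	"fmt"
	"strings"
)

// splitSelfTarget (hypothetical) splits "self.a.b.method" into the
// attribute chain ["a", "b"] and the method name "method". It reports
// false for targets that do not start with "self" or that have no
// attribute between "self" and the method.
func splitSelfTarget(target string) (chain []string, method string, ok bool) {
	parts := strings.Split(target, ".")
	if len(parts) < 3 || parts[0] != "self" {
		return nil, "", false
	}
	// parts[0] is "self"; the last segment is always the method.
	return parts[1 : len(parts)-1], parts[len(parts)-1], true
}

func main() {
	chain, method, ok := splitSelfTarget("self.pyload.config.get")
	fmt.Println(chain, method, ok)
}
```

Indexing with `len(parts)-1` rather than a fixed `2` is what lets the same code path serve 2-level and deeper chains.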

sast-engine/graph/callgraph/resolution/attribute.go

Lines changed: 250 additions & 47 deletions
Original file line number | Diff line number | Diff line change
@@ -32,25 +32,26 @@ var attributeFailureStats = &FailureStats{
3232
CustomClassSamples: make([]string, 0, 20),
3333
}
3434

35-
// ResolveSelfAttributeCall resolves self.attribute.method() patterns
36-
// This is the core of Phase 3 Task 12 - using extracted attributes to resolve calls.
35+
// maxChainDepth limits the number of intermediate attributes in a chain walk.
36+
// Real-world Python rarely exceeds 4 levels (self.app.db.session.execute).
37+
// This prevents pathological chains from causing excessive work.
38+
const maxChainDepth = 6
39+
40+
// ResolveSelfAttributeCall resolves self.attribute.method() patterns with
41+
// support for arbitrary chain depth (e.g., self.obj.attr.method()).
3742
//
3843
// Algorithm:
3944
// 1. Detect pattern: target starts with "self." and has 2+ dots
40-
// 2. Parse: self.attr.method → attr="attr", method="method"
45+
// 2. Parse: self.attr₁.attr₂...attrN.method → chain=[attr₁..attrN], method
4146
// 3. Find containing class from callerFQN
42-
// 4. Lookup attribute type in AttributeRegistry
43-
// 5. Resolve method on inferred type
47+
// 4. Walk the chain: for each attribute, look up its type and advance
48+
// 5. Resolve the final method on the terminal type
4449
//
45-
// Example:
50+
// Examples:
4651
//
47-
// Input: self.value.upper (caller: test_chaining.StringBuilder.process)
48-
// Steps:
49-
// 1. Parse → attr="value", method="upper"
50-
// 2. Extract class → test_chaining.StringBuilder
51-
// 3. Lookup value type → builtins.str
52-
// 4. Resolve upper on str → builtins.str.upper
53-
// Output: (builtins.str.upper, true, TypeInfo{builtins.str, 1.0, "self_attribute"})
52+
// 2-level: self.value.upper → chain=["value"], method="upper"
53+
// 3-level: self.core.config.get → chain=["core","config"], method="get"
54+
// 4-level: self.app.db.session.execute → chain=["app","db","session"], method="execute"
5455
//
5556
// Parameters:
5657
// - target: call target string (e.g., "self.value.upper")
@@ -84,57 +85,107 @@ func ResolveSelfAttributeCall(
8485
return "", false, nil
8586
}
8687

87-
// Parse the pattern: self.attr.method or self.attr.subattr.method
88+
// Parse the pattern: self.attr₁[.attr₂...].method
8889
parts := strings.Split(target, ".")
8990
if len(parts) < 3 {
9091
return "", false, nil
9192
}
9293

93-
// For now, handle simple case: self.attr.method (2 levels)
94-
// TODO: Handle deep chains like self.obj.attr.method
95-
if len(parts) > 3 {
94+
// Extract attribute chain and final method name.
95+
// parts[0] = "self", parts[1..n-1] = attribute chain, parts[n] = method
96+
attrChain := parts[1 : len(parts)-1] // e.g., ["core", "config"]
97+
methodName := parts[len(parts)-1] // e.g., "get"
98+
99+
// Enforce depth limit to prevent pathological chains
100+
if len(attrChain) > maxChainDepth {
96101
attributeFailureStats.DeepChains++
97102
if len(attributeFailureStats.DeepChainSamples) < 20 {
98103
attributeFailureStats.DeepChainSamples = append(attributeFailureStats.DeepChainSamples, target)
99104
}
100105
return "", false, nil
101106
}
102107

103-
attrName := parts[1]
104-
methodName := parts[2]
105-
106108
// Step 1: Find the containing class by checking which classes have this method
107109
classFQN := findClassContainingMethod(callerFQN, typeEngine.Attributes)
108110
if classFQN == "" {
109111
attributeFailureStats.ClassNotFound++
110112
return "", false, nil
111113
}
112114

113-
// Step 2: Lookup attribute in AttributeRegistry
114-
attr := typeEngine.Attributes.GetAttribute(classFQN, attrName)
115-
if attr == nil {
116-
attributeFailureStats.AttributeNotFound++
117-
if len(attributeFailureStats.AttributeNotFoundSamples) < 20 {
118-
attributeFailureStats.AttributeNotFoundSamples = append(
119-
attributeFailureStats.AttributeNotFoundSamples,
120-
fmt.Sprintf("%s (in class %s)", target, classFQN))
115+
// Step 2: Walk the attribute chain iteratively.
116+
// Start from the containing class and resolve each attribute's type.
117+
currentTypeFQN := classFQN
118+
var lastAttrConfidence float64
119+
visited := make(map[string]bool) // Cycle detection
120+
121+
for _, attrName := range attrChain {
122+
// Cycle detection: if we've seen this type before, stop
123+
if visited[currentTypeFQN] {
124+
attributeFailureStats.AttributeNotFound++
125+
if len(attributeFailureStats.AttributeNotFoundSamples) < 20 {
126+
attributeFailureStats.AttributeNotFoundSamples = append(
127+
attributeFailureStats.AttributeNotFoundSamples,
128+
fmt.Sprintf("%s (circular ref at type %s)", target, currentTypeFQN))
129+
}
130+
return "", false, nil
131+
}
132+
visited[currentTypeFQN] = true
133+
134+
attr := typeEngine.Attributes.GetAttribute(currentTypeFQN, attrName)
135+
if attr == nil {
136+
attributeFailureStats.AttributeNotFound++
137+
if len(attributeFailureStats.AttributeNotFoundSamples) < 20 {
138+
attributeFailureStats.AttributeNotFoundSamples = append(
139+
attributeFailureStats.AttributeNotFoundSamples,
140+
fmt.Sprintf("%s (attr %q not found in %s)", target, attrName, currentTypeFQN))
141+
}
142+
return "", false, nil
143+
}
144+
145+
if attr.Type == nil {
146+
attributeFailureStats.AttributeNotFound++
147+
return "", false, nil
148+
}
149+
150+
lastAttrConfidence = attr.Confidence
151+
currentTypeFQN = attr.Type.TypeFQN
152+
153+
// Resolve placeholder types like "class:Config" inline
154+
if strings.HasPrefix(currentTypeFQN, "class:") {
155+
className := strings.TrimPrefix(currentTypeFQN, "class:")
156+
resolved := resolveClassNameForChain(className, classFQN, typeEngine, callGraph)
157+
if resolved != "" {
158+
currentTypeFQN = resolved
159+
}
160+
// If unresolved, continue with the placeholder — it may still match
121161
}
122-
return "", false, nil
123162
}
124163

125-
// Step 3: Resolve method on the attribute's type
126-
attributeTypeFQN := attr.Type.TypeFQN
164+
// Step 3: Resolve the final method on the terminal type
165+
return resolveMethodOnType(currentTypeFQN, methodName, lastAttrConfidence, builtins, callGraph, typeEngine)
166+
}
127167

168+
// resolveMethodOnType resolves a method call on a given type FQN.
169+
// Checks builtin registry first, then custom class methods in the call graph,
170+
// then stdlib/third-party registries for known external types.
171+
func resolveMethodOnType(
172+
typeFQN string,
173+
methodName string,
174+
attrConfidence float64,
175+
builtins *registry.BuiltinRegistry,
176+
callGraph *core.CallGraph,
177+
typeEngine *TypeInferenceEngine,
178+
) (string, bool, *core.TypeInfo) {
128179
// Check if it's a builtin type
129-
if strings.HasPrefix(attributeTypeFQN, "builtins.") {
130-
methodFQN := attributeTypeFQN + "." + methodName
180+
if strings.HasPrefix(typeFQN, "builtins.") {
181+
methodFQN := typeFQN + "." + methodName
131182

132183
// Verify method exists in builtin registry
133-
method := builtins.GetMethod(attributeTypeFQN, methodName)
184+
method := builtins.GetMethod(typeFQN, methodName)
134185
if method != nil && method.ReturnType != nil {
135186
return methodFQN, true, &core.TypeInfo{
136187
TypeFQN: method.ReturnType.TypeFQN,
137-
Confidence: float32(attr.Confidence), // Inherit attribute confidence
188+
Confidence: float32(attrConfidence),
138189
Source: "self_attribute",
139190
}
140191
}
@@ -144,36 +195,188 @@ func ResolveSelfAttributeCall(
144195
}
145196

146197
// Handle custom class types (user-defined classes).
147-
// The attribute type is already resolved (e.g., "module.Controller")
148-
// from variable extraction. Now we need to resolve the method call on that type.
149-
methodFQN := attributeTypeFQN + "." + methodName
198+
methodFQN := typeFQN + "." + methodName
150199

151-
// Check if method exists in CallGraph.Functions map.
152200
if callGraph != nil {
201+
// Exact lookup first
153202
if node := callGraph.Functions[methodFQN]; node != nil {
154-
// Verify it's actually a callable (method, function, constructor, etc.).
155-
if node.Type == "method" || node.Type == "function_definition" ||
156-
node.Type == "constructor" || node.Type == "property" ||
157-
node.Type == "special_method" {
203+
if isCallableNode(node) {
158204
return methodFQN, true, &core.TypeInfo{
159-
TypeFQN: attributeTypeFQN,
160-
Confidence: float32(attr.Confidence),
205+
TypeFQN: typeFQN,
206+
Confidence: float32(attrConfidence),
161207
Source: "self_attribute_custom_class",
162208
}
163209
}
164210
}
211+
212+
// Suffix fallback: handles FQN mismatches from relative imports.
213+
// The attribute registry may store "config.parser.ConfigParser" while the
214+
// callgraph stores "myapp.config.parser.ConfigParser" (with full module prefix).
215+
// We match on "ClassName.method" suffix to bridge this gap.
216+
suffix := extractClassMethodSuffix(typeFQN, methodName)
217+
if suffix != "" {
218+
for fqn, node := range callGraph.Functions {
219+
if strings.HasSuffix(fqn, "."+suffix) && isCallableNode(node) {
220+
// Use the callgraph's FQN (the authoritative one)
221+
resolvedTypeFQN := strings.TrimSuffix(fqn, "."+methodName)
222+
return fqn, true, &core.TypeInfo{
223+
TypeFQN: resolvedTypeFQN,
224+
Confidence: float32(attrConfidence * 0.85), // slight penalty for fuzzy match
225+
Source: "self_attribute_custom_class",
226+
}
227+
}
228+
}
229+
}
230+
}
231+
232+
// Check stdlib/third-party registry for external types (e.g., sqlite3.Connection.execute).
233+
if fqn, resolved, typeInfo := resolveMethodViaStdlibRegistry(typeFQN, methodName, attrConfidence, typeEngine); resolved {
234+
return fqn, true, typeInfo
165235
}
166236

167-
// Method not found in call graph - collect stats and return unresolved.
237+
// Method not found: collect stats
168238
attributeFailureStats.CustomClassUnsupported++
169239
if len(attributeFailureStats.CustomClassSamples) < 20 {
170240
attributeFailureStats.CustomClassSamples = append(
171241
attributeFailureStats.CustomClassSamples,
172-
fmt.Sprintf("%s (type: %s, method not found: %s)", target, attributeTypeFQN, methodFQN))
242+
fmt.Sprintf("method %s not found on type %s", methodName, typeFQN))
243+
}
244+
return "", false, nil
245+
}
246+
247+
// resolveMethodViaStdlibRegistry checks the stdlib and third-party registries
248+
// for a method on an external type (e.g., sqlite3.Connection.execute).
249+
// The typeFQN is split into module + class name, then looked up via GetClassMethod.
250+
func resolveMethodViaStdlibRegistry(
251+
typeFQN string,
252+
methodName string,
253+
attrConfidence float64,
254+
typeEngine *TypeInferenceEngine,
255+
) (string, bool, *core.TypeInfo) {
256+
if typeEngine == nil {
257+
return "", false, nil
258+
}
259+
260+
// Split typeFQN into module and class.
261+
// e.g., "sqlite3.Connection" → module="sqlite3", class="Connection"
262+
// e.g., "http.client.HTTPConnection" → try "http.client" + "HTTPConnection", then "http" + "client"
263+
parts := strings.Split(typeFQN, ".")
264+
if len(parts) < 2 {
265+
return "", false, nil
266+
}
267+
268+
// Try splitting at each dot position, from rightmost to leftmost.
269+
// This handles nested modules like "http.client.HTTPConnection".
270+
for i := len(parts) - 1; i >= 1; i-- {
271+
moduleName := strings.Join(parts[:i], ".")
272+
className := parts[i]
273+
274+
// Only proceed if the remaining parts after className are empty
275+
// (i.e., className is the last segment).
276+
if i != len(parts)-1 {
277+
continue
278+
}
279+
280+
// Check stdlib registry
281+
if typeEngine.StdlibRemote != nil {
282+
if stdlibLoader, ok := typeEngine.StdlibRemote.(*registry.StdlibRegistryRemote); ok {
283+
if stdlibLoader.HasModule(moduleName) {
284+
method := stdlibLoader.GetClassMethod(moduleName, className, methodName, nil)
285+
if method != nil {
286+
methodFQN := typeFQN + "." + methodName
287+
returnType := ""
288+
if method.ReturnType != "" && method.ReturnType != "unknown" {
289+
returnType = method.ReturnType
290+
}
291+
return methodFQN, true, &core.TypeInfo{
292+
TypeFQN: returnType,
293+
Confidence: float32(attrConfidence) * method.Confidence * 0.9,
294+
Source: "self_attribute_stdlib",
295+
}
296+
}
297+
}
298+
}
299+
}
300+
301+
// Check third-party registry
302+
if typeEngine.ThirdPartyRemote != nil {
303+
if tpLoader, ok := typeEngine.ThirdPartyRemote.(*registry.ThirdPartyRegistryRemote); ok {
304+
if tpLoader.HasModule(moduleName) {
305+
method := tpLoader.GetClassMethod(moduleName, className, methodName, nil)
306+
if method != nil {
307+
methodFQN := typeFQN + "." + methodName
308+
returnType := ""
309+
if method.ReturnType != "" && method.ReturnType != "unknown" {
310+
returnType = method.ReturnType
311+
}
312+
return methodFQN, true, &core.TypeInfo{
313+
TypeFQN: returnType,
314+
Confidence: float32(attrConfidence) * method.Confidence * 0.9,
315+
Source: "self_attribute_thirdparty",
316+
}
317+
}
318+
}
319+
}
320+
}
173321
}
322+
174323
return "", false, nil
175324
}
176325

326+
// isCallableNode checks if a graph node represents a callable symbol.
327+
func isCallableNode(node *graph.Node) bool {
328+
return node != nil && (node.Type == "method" || node.Type == "function_definition" ||
329+
node.Type == "constructor" || node.Type == "property" ||
330+
node.Type == "special_method")
331+
}
332+
333+
// extractClassMethodSuffix extracts "ClassName.method" from a full type FQN.
334+
// e.g., "config.parser.ConfigParser" + "get" → "ConfigParser.get".
335+
func extractClassMethodSuffix(typeFQN, methodName string) string {
336+
lastDot := strings.LastIndex(typeFQN, ".")
337+
if lastDot == -1 {
338+
// typeFQN is just a class name (no module prefix)
339+
return typeFQN + "." + methodName
340+
}
341+
className := typeFQN[lastDot+1:]
342+
return className + "." + methodName
343+
}
344+
345+
// resolveClassNameForChain resolves a "class:ClassName" placeholder during chain walking.
346+
// Uses ImportMap, same-module lookup, and module registry (same as ResolveAttributePlaceholders).
347+
func resolveClassNameForChain(
348+
className string,
349+
contextClassFQN string,
350+
typeEngine *TypeInferenceEngine,
351+
callGraph *core.CallGraph,
352+
) string {
353+
if typeEngine == nil {
354+
return ""
355+
}
356+
357+
// Try resolving via the existing resolveClassName (uses ImportMap, same-module, short names)
358+
modulePath := getModuleFromClassFQN(contextClassFQN)
359+
candidateFQN := modulePath + "." + className
360+
361+
// Check call graph for the class — if any function key starts with candidateFQN+".",
362+
// the class exists in the codebase.
363+
if callGraph != nil {
364+
prefix := candidateFQN + "."
365+
for fqn := range callGraph.Functions {
366+
if strings.HasPrefix(fqn, prefix) {
367+
return candidateFQN
368+
}
369+
}
370+
}
371+
372+
// Check attribute registry — if the class has registered attributes, it exists
373+
if typeEngine.Attributes != nil && typeEngine.Attributes.HasClass(candidateFQN) {
374+
return candidateFQN
375+
}
376+
377+
return ""
378+
}
379+
177380
// PrintAttributeFailureStats prints detailed statistics about attribute chain failures.
178381
// Only prints if debug mode is enabled via the provided logger.
179382
func PrintAttributeFailureStats(logger interface{ IsDebug() bool }) {
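The suffix fallback added in resolveMethodOnType can be sketched in isolation. This is a simplified illustration, not the committed code: `findBySuffix` is a hypothetical name, the call graph is reduced to a set of FQN strings, and the keys are sorted before scanning, which the commit does not do.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// findBySuffix (hypothetical) looks for a call-graph key that ends in
// ".ClassName.method", where ClassName is the last segment of typeFQN.
// On a match it returns the call graph's FQN and the confidence scaled
// by 0.85, the penalty the commit applies to fuzzy matches.
func findBySuffix(functions map[string]bool, typeFQN, method string, conf float64) (string, float64, bool) {
	className := typeFQN
	if i := strings.LastIndex(typeFQN, "."); i != -1 {
		className = typeFQN[i+1:]
	}
	suffix := "." + className + "." + method

	// Sort keys so the result does not depend on map iteration order.
	keys := make([]string, 0, len(functions))
	for k := range functions {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	for _, fqn := range keys {
		if strings.HasSuffix(fqn, suffix) {
			return fqn, conf * 0.85, true
		}
	}
	return "", 0, false
}

func main() {
	functions := map[string]bool{
		"myapp.config.parser.ConfigParser.get": true,
		"myapp.other.Thing.get":                true,
	}
	fmt.Println(findBySuffix(functions, "config.parser.ConfigParser", "get", 1.0))
}
```

Go does not define an iteration order for maps, so when two modules define a class with the same name and method, a plain `range` over the call graph may return either one. Sorting the keys, as in this sketch, is one way to make the choice repeatable; it does not make it more correct.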
