do not cache global functions during initialization anymore#2741

Merged
jurgenvinju merged 3 commits into main from fix/no-cache-during-loading
Apr 1, 2026

@jurgenvinju
Member

@jurgenvinju jurgenvinju commented Mar 31, 2026

This looks ok, but:

This PR partially fixes this situation:

module X

default int f(int _) = 42;
int x = f(0);
int f(0) = 0;

test bool testF() = f(0) == 0 && x == 0;

When the initializer of x runs, the second overload of f has not been declared yet, so f(0) evaluates to 42 at that point. While the test is running, both x and f(0) then yield the wrong number (42 instead of 0).

The goal of this PR is to fix the value of f(0). The fix for x in the test is in #2740.
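
To make the scenario concrete, here is a minimal Java sketch of a toy module environment with a function-lookup cache. It is not the code in ModuleEnvironment.java: the class `OverloadCacheSketch`, the `ToyModule` type and the `cacheDuringInit` switch are hypothetical names for illustration, and overload resolution is reduced to "non-default overloads first, then default ones".

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.IntPredicate;
import java.util.function.IntUnaryOperator;

public class OverloadCacheSketch {
    /** One overload of a function: a match predicate plus a body. */
    static final class Overload {
        final boolean isDefault;
        final IntPredicate matches;
        final IntUnaryOperator body;

        Overload(boolean isDefault, IntPredicate matches, IntUnaryOperator body) {
            this.isDefault = isDefault;
            this.matches = matches;
            this.body = body;
        }
    }

    /** Toy module environment (hypothetical, not the Rascal interpreter's). */
    static final class ToyModule {
        private final Map<String, List<Overload>> declared = new HashMap<>();
        private final Map<String, List<Overload>> cache = new HashMap<>();
        private final boolean cacheDuringInit;
        private boolean initializing = true;

        ToyModule(boolean cacheDuringInit) {
            this.cacheDuringInit = cacheDuringInit;
        }

        void declare(String name, Overload overload) {
            declared.computeIfAbsent(name, k -> new ArrayList<>()).add(overload);
        }

        void endInitialization() {
            initializing = false;
        }

        /** Returns the overloads visible now; caches the snapshot when allowed. */
        List<Overload> lookup(String name) {
            List<Overload> hit = cache.get(name);
            if (hit != null) {
                return hit;
            }
            List<Overload> snapshot = new ArrayList<>(declared.getOrDefault(name, List.of()));
            if (!initializing || cacheDuringInit) {
                cache.put(name, snapshot);
            }
            return snapshot;
        }

        /** Tries non-default overloads first, then default ones. */
        int call(String name, int arg) {
            List<Overload> candidates = lookup(name);
            for (Overload o : candidates) {
                if (!o.isDefault && o.matches.test(arg)) return o.body.applyAsInt(arg);
            }
            for (Overload o : candidates) {
                if (o.isDefault && o.matches.test(arg)) return o.body.applyAsInt(arg);
            }
            throw new IllegalStateException("no overload of " + name + " matches " + arg);
        }
    }

    /** Replays module X in declaration order; returns {x, f(0) after initialization}. */
    static int[] runModuleX(boolean cacheDuringInit) {
        ToyModule m = new ToyModule(cacheDuringInit);
        m.declare("f", new Overload(true, a -> true, a -> 42));    // default int f(int _) = 42;
        int x = m.call("f", 0);                                     // int x = f(0);
        m.declare("f", new Overload(false, a -> a == 0, a -> 0));   // int f(0) = 0;
        m.endInitialization();
        return new int[] { x, m.call("f", 0) };
    }

    public static void main(String[] args) {
        int[] cached = runModuleX(true);
        int[] uncached = runModuleX(false);
        System.out.println("cache on during init:  x=" + cached[0] + ", f(0)=" + cached[1]);
        System.out.println("cache off during init: x=" + uncached[0] + ", f(0)=" + uncached[1]);
    }
}
```

In this toy model, turning the cache off during initialization repairs f(0) but leaves x at 42, which matches the split between this PR and #2740.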

@jurgenvinju jurgenvinju marked this pull request as draft March 31, 2026 12:34
@jurgenvinju jurgenvinju requested a review from DavyLandman March 31, 2026 12:35
@codecov

codecov bot commented Mar 31, 2026

Codecov Report

❌ Patch coverage is 90.00000% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 46%. Comparing base (d11f490) to head (071ab6d).
⚠️ Report is 4 commits behind head on main.

Files with missing lines Patch % Lines
...g/rascalmpl/interpreter/env/ModuleEnvironment.java 90% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##              main   #2741   +/-   ##
=======================================
- Coverage       46%     46%   -1%     
- Complexity    6720    6725    +5     
=======================================
  Files          794     794           
  Lines        65910   65912    +2     
  Branches      9887    9888    +1     
=======================================
- Hits         30826   30822    -4     
  Misses       32699   32699           
- Partials      2385    2391    +6     

@jurgenvinju jurgenvinju self-assigned this Apr 1, 2026
@jurgenvinju
Member Author

Instrumented the slow path with a print. This is a small snippet of the result. It happens thousands of times during mvn test:

ALERT! looking up newGenerate in $parsergenerator$ before the end of initialization.
ALERT! looking up label in lang::rascal::grammar::definition::Symbols before the end of initialization.
ALERT! looking up label in lang::rascal::grammar::definition::Symbols before the end of initialization.
ALERT! looking up conditional in lang::rascal::grammar::definition::Symbols before the end of initialization.
ALERT! looking up conditional in lang::rascal::grammar::definition::Symbols before the end of initialization.
ALERT! looking up seq in lang::rascal::grammar::definition::Symbols before the end of initialization.
ALERT! looking up seq in lang::rascal::grammar::definition::Symbols before the end of initialization.
ALERT! looking up alt in lang::rascal::grammar::definition::Symbols before the end of initialization.
ALERT! looking up alt in lang::rascal::grammar::definition::Symbols before the end of initialization.
ALERT! looking up conditional in lang::rascal::grammar::definition::Symbols before the end of initialization.
ALERT! looking up conditional in lang::rascal::grammar::definition::Symbols before the end of initialization.
ALERT! looking up label in lang::rascal::grammar::definition::Symbols before the end of initialization.
ALERT! looking up label in lang::rascal::grammar::definition::Symbols before the end of initialization.
ALERT! looking up conditional in lang::rascal::grammar::definition::Symbols before the end of initialization.
ALERT! looking up conditional in lang::rascal::grammar::definition::Symbols before the end of initialization.
ALERT! looking up seq in lang::rascal::grammar::definition::Symbols before the end of initialization.
ALERT! looking up seq in lang::rascal::grammar::definition::Symbols before the end of initialization.
ALERT! looking up alt in lang::rascal::grammar::definition::Symbols before the end of initialization.
ALERT! looking up alt in lang::rascal::grammar::definition::Symbols before the end of initialization.
ALERT! looking up conditional in lang::rascal::grammar::definition::Symbols before the end of initialization.
ALERT! looking up conditional in lang::rascal::grammar::definition::Symbols before the end of initialization.
ALERT! looking up modules2grammar in $parsergenerator$ before the end of initialization.
ALERT! looking up modules2grammar in $parsergenerator$ before the end of initialization.
ALERT! looking up newGenerate in $parsergenerator$ before the end of initialization.
ALERT! looking up modules2grammar in $parsergenerator$ before the end of initialization.
ALERT! looking up newGenerate in $parsergenerator$ before the end of initialization.
ALERT! looking up modules2grammar in $parsergenerator$ before the end of initialization.
ALERT! looking up modules2grammar in $parsergenerator$ before the end of initialization.
ALERT! looking up modules2grammar in $parsergenerator$ before the end of initialization.

When the names being looked up are constructor names, the cache is not a problem. When they are function names (possibly overloaded), it can be an issue.

@jurgenvinju
Member Author

jurgenvinju commented Apr 1, 2026

This is the very first print:

Job: loading modules
ALERT! looking up callCount in lang::rascal::tests::basic::Memoization before the end of initialization.

That is a global variable, but lookup always first tries to resolve a variable name as a function name. If the result is empty, a lookup in the variable environment follows. If the function cache is on, the empty result is cached anyway. This is tricky, but it does not seem to produce real issues (only if there is an erroneous double declaration of a global variable and a module-level, possibly overloaded, function).
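
The lookup order described here can be sketched in Java as follows. This is a toy model under stated assumptions, not the interpreter's code: `NegativeCacheSketch`, `ToyEnv` and the signature string `int callCount()` are hypothetical, and functions are represented only by their signatures.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class NegativeCacheSketch {
    /** Toy environment: a name is tried as a function first, then as a variable. */
    static final class ToyEnv {
        private final Map<String, List<String>> functions = new HashMap<>();
        private final Map<String, Integer> variables = new HashMap<>();
        private final Map<String, List<String>> functionCache = new HashMap<>();

        void declareVariable(String name, int value) {
            variables.put(name, value);
        }

        void declareFunction(String name, String signature) {
            functions.computeIfAbsent(name, k -> new ArrayList<>()).add(signature);
        }

        /** Cached function lookup; an empty result is cached as well. */
        List<String> lookupFunctions(String name) {
            return functionCache.computeIfAbsent(
                name, n -> List.copyOf(functions.getOrDefault(n, List.of())));
        }

        String resolve(String name) {
            if (!lookupFunctions(name).isEmpty()) return "function";
            if (variables.containsKey(name)) return "variable";
            return "undeclared";
        }
    }

    /** Returns how callCount resolves before and after an erroneous double declaration. */
    static String[] replay() {
        ToyEnv env = new ToyEnv();
        env.declareVariable("callCount", 0);
        String before = env.resolve("callCount");             // caches the empty function result
        env.declareFunction("callCount", "int callCount()");  // hypothetical erroneous redeclaration
        String after = env.resolve("callCount");              // the stale empty entry hides the function
        return new String[] { before, after };
    }

    public static void main(String[] args) {
        String[] result = replay();
        System.out.println("before: " + result[0] + ", after: " + result[1]);
    }
}
```

In the sketch the cached empty result only changes behavior once a function with the same name as the variable is declared later, which is the erroneous double declaration mentioned above.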

@jurgenvinju
Member Author

jurgenvinju commented Apr 1, 2026

The second print is this:

ALERT! looking up int in lang::rascal::tests::basic::Matching before the end of initialization

Triggered by this code:

bool fT1(T1::\int()) = true;

The pattern matching initialization code has to look up the constructors. Because constructors are declared first during pre-loading, this lookup always succeeds. If any later overloads are added, the pattern matcher does not care since it only matches on constructors. This is not an issue for the cache.

@jurgenvinju
Member Author

Another instance that is not about constructors:

private loc testLibraryLoc = |memory://myTestLibrary-<uuid().authority>/|;

Here uuid() comes from an imported module, which has been initialized by induction, so this is not an issue. It would have been an issue if uuid() were overloaded in the current module.

So the case of a global initializer that calls possibly overloaded functions declared in the current module is the most pressing one we have found so far. See also #2740, which addresses this issue on a different level. Both fixes are required to solve this completely.

@jurgenvinju
Member Author

This is an interesting case with globals:

int singleChar(str s) = charAt(s,0);

list[int] makeValidSchemeChars() = [singleChar("a")..singleChar("z")] + [singleChar("A")..singleChar("Z")]
	+ [singleChar("0")..singleChar("9")] + [singleChar("+"), singleChar("-"), singleChar(".")]
	;
...
list[int] validHostChars = (validSchemeChars - [singleChar("+"), singleChar(".")]);
  • The function declaration for makeValidSchemeChars is not an issue, because singleChar is only called after the entire module has been loaded.
  • The initializer of the global validHostChars, however, calls singleChar during initialization, so that function needs to be declared earlier.
  • If singleChar is not declared first, this lookup fails, and if other overloads from imported/extended modules are present, those are bound instead.
  • This is still not a breaking case, because the initializations happen to be in the "right" order. But it is easy to see how this can break.
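
The difference between a function body (evaluated late) and a global initializer (evaluated at declaration time) can be sketched in Java. Everything here is a simplified assumption for illustration: `InitOrderSketch`, `ToyModule`, the imported stand-in for singleChar that returns -1, the global `plus` and the function `makePlus` are hypothetical, and resolution is reduced to "local declaration if present, otherwise the imported one".

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

public class InitOrderSketch {
    /** Toy module: functions are resolved by name at call time. */
    static final class ToyModule {
        private final Map<String, Function<String, Integer>> imported = new HashMap<>();
        private final Map<String, Function<String, Integer>> local = new HashMap<>();
        private final Map<String, Integer> globals = new HashMap<>();

        void importFunction(String name, Function<String, Integer> f) {
            imported.put(name, f);
        }

        void declareFunction(String name, Function<String, Integer> f) {
            local.put(name, f);
        }

        /** Global initializers run immediately, while the module is still loading. */
        void declareGlobal(String name, Function<ToyModule, Integer> initializer) {
            globals.put(name, initializer.apply(this));
        }

        int global(String name) {
            return globals.get(name);
        }

        /** Simplified resolution: the local declaration wins, else the imported one. */
        int call(String name, String arg) {
            Function<String, Integer> f = local.containsKey(name) ? local.get(name) : imported.get(name);
            if (f == null) {
                throw new IllegalStateException("undeclared function: " + name);
            }
            return f.apply(arg);
        }
    }

    /** Returns {value of the global, result of the late function call}. */
    static int[] load(boolean singleCharDeclaredFirst) {
        ToyModule m = new ToyModule();
        m.importFunction("singleChar", s -> -1); // hypothetical overload from an imported module

        if (singleCharDeclaredFirst) {
            m.declareFunction("singleChar", s -> (int) s.charAt(0));
        }
        // A function body only runs when called, so declaration order does not matter for it.
        m.declareFunction("makePlus", s -> m.call("singleChar", "+"));
        // A global initializer runs right here and sees only what is declared so far.
        m.declareGlobal("plus", mod -> mod.call("singleChar", "+"));
        if (!singleCharDeclaredFirst) {
            m.declareFunction("singleChar", s -> (int) s.charAt(0));
        }

        return new int[] { m.global("plus"), m.call("makePlus", "") };
    }

    public static void main(String[] args) {
        int[] right = load(true);
        int[] wrong = load(false);
        System.out.println("singleChar first: global=" + right[0] + ", late call=" + right[1]);
        System.out.println("singleChar later: global=" + wrong[0] + ", late call=" + wrong[1]);
    }
}
```

In this sketch the late call always reaches the local singleChar, while the global picks up the imported stand-in whenever singleChar is declared after it.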

@jurgenvinju
Member Author

Finally, we have:

ALERT! looking up priority in ParseTree before the end of initialization.

and

@synopsis{Nested priority is flattened.}
Production priority(Symbol s, [*Production a, priority(Symbol _, list[Production] b), *Production c])
  = priority(s,a+b+c);

In this case overloading interacts with the cache. Using priority to match a constructor is not problematic. However, every call to priority that is resolved before this overload has been declared will never see the overload if the cache is on during initialization. This breaks the parser generator for certain locations of the # reification operator (in global initialization expressions, for example).

@jurgenvinju jurgenvinju merged commit 827298e into main Apr 1, 2026
9 checks passed
@jurgenvinju jurgenvinju deleted the fix/no-cache-during-loading branch April 1, 2026 11:03