feat(tesseract): Support separate pre-aggregations for different multi-stage subqueries by waralexrom · Pull Request #10684 · cube-js/cube

waralexrom · 2026-04-14T12:56:11Z

Check List

Tests have been run in packages where changes have been made if available
Linter has been run for changed code
Tests for the changes have been added if not covered yet
Docs have been added / updated if required

…i-stage subqueries

claude · 2026-04-14T12:56:30Z

Claude finished @waralexrom's task in 5m 17s

Code Review: PR #10684 — Support separate pre-aggregations for different multi-stage subqueries

Gather context and read PR diff
Review Rust changes (Tesseract planner)
Review TypeScript changes (schema-compiler, query-orchestrator)
Review test coverage
Post comprehensive review

Summary

This PR enables Tesseract's multi-stage query engine to use different pre-aggregations for different subqueries within the same top-level query. Previously, all multi-stage leaf measures had to match a single pre-aggregation, or the entire optimization was skipped. Now each leaf measure can independently match its own pre-aggregation, and the system generates unique table placeholders (__usage_N) that get resolved to the appropriate physical tables at query execution time.

The change spans three layers:

Rust planner — optimizer tracks per-usage pre-aggregation metadata, generates __usage_N suffixed placeholders, extracts per-usage date ranges (including time-shift awareness)
Schema compiler (TS) — handles the new array-based multi-usage response format from Rust, maps each usage to pre-aggregation descriptions with merged date ranges
Query orchestrator (TS) — resolves usage-specific placeholders to physical partition table names
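
To make the hand-off between the layers more concrete, here is a minimal sketch of how per-usage metadata might be grouped by pre-aggregation with a merged date range. The `Usage` and `MergedUsage` shapes, their field names, and `mergeUsages` are hypothetical and not taken from the PR. Treating a missing date range as unbounded is also an assumption.

```typescript
// Hypothetical shapes: the PR's actual response format is not shown in this review.
type DateRange = [string, string]; // ISO-8601 strings in one format, inclusive [from, to]

interface Usage {
  index: number; // N in the __usage_N placeholder suffix
  preAggregationId: string; // illustrative identifier, e.g. "orders.a"
  dateRange: DateRange | null; // null: the subquery has no date range filter
}

interface MergedUsage {
  preAggregationId: string;
  usageIndices: number[];
  dateRange: DateRange | null; // assumption: null means "unbounded"
}

// Smallest range covering both inputs (plain string comparison works for ISO-8601).
function cover(a: DateRange, b: DateRange): DateRange {
  return [a[0] < b[0] ? a[0] : b[0], a[1] > b[1] ? a[1] : b[1]];
}

function mergeUsages(usages: Usage[]): MergedUsage[] {
  const byId = new Map<string, MergedUsage>();
  for (const u of usages) {
    const existing = byId.get(u.preAggregationId);
    if (existing === undefined) {
      byId.set(u.preAggregationId, {
        preAggregationId: u.preAggregationId,
        usageIndices: [u.index],
        dateRange: u.dateRange,
      });
      continue;
    }
    existing.usageIndices.push(u.index);
    const current = existing.dateRange;
    // One unbounded usage makes the merged range unbounded.
    existing.dateRange =
      current === null || u.dateRange === null ? null : cover(current, u.dateRange);
  }
  // Sort for deterministic output.
  return Array.from(byId.values()).sort((a, b) =>
    a.preAggregationId < b.preAggregationId ? -1 : a.preAggregationId > b.preAggregationId ? 1 : 0
  );
}
```

With this shape, a time-shifted usage and a non-shifted usage of the same pre-aggregation would produce one entry whose range spans both periods.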

Code Quality

Strengths:

The rollback mechanism in optimizer.rs (saving and restoring usages.len() and usage_counter) makes matching atomic: either all subqueries match pre-aggregations or none are used
Backward compatibility is well-preserved: single-usage queries strip the __usage_N suffix and return the old-style pre-aggregation object format
Clean separation of concerns between the Rust planner (generating placeholders) and TS orchestrator (resolving them)
The replacement ordering in QueryCache.ts (usage-specific before base) correctly prevents substring collision
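
The substring collision mentioned in the last point can be illustrated with a small sketch. It is not the code in QueryCache.ts: `resolvePlaceholders` and the table names are hypothetical, and the single-pass regex with longest-first alternation is a swapped-in technique that yields the same "usage-specific before base" precedence.

```typescript
// Escape regex metacharacters so table names are matched literally.
function escapeRegExp(s: string): string {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: maps placeholder table names to physical table names.
function resolvePlaceholders(sql: string, tables: Map<string, string>): string {
  // Longest placeholders first, so "t__usage_1" wins over its prefix "t"
  // (and "t__usage_10" wins over "t__usage_1").
  const placeholders = Array.from(tables.keys()).sort((a, b) => b.length - a.length);
  if (placeholders.length === 0) {
    return sql;
  }
  const pattern = new RegExp(placeholders.map(escapeRegExp).join("|"), "g");
  // Single pass: already-substituted physical names are never rescanned.
  return sql.replace(pattern, (match: string) => tables.get(match) ?? match);
}

const exampleTables = new Map<string, string>([
  ["pre_aggs.orders_main", "pre_aggs.orders_main_base"],
  ["pre_aggs.orders_main__usage_0", "pre_aggs.orders_main_p2024"],
  ["pre_aggs.orders_main__usage_1", "pre_aggs.orders_main_p2023"],
]);

const exampleSql =
  "SELECT * FROM pre_aggs.orders_main__usage_0 AS a, pre_aggs.orders_main__usage_1 AS b";

const resolvedSql = resolvePlaceholders(exampleSql, exampleTables);
```

If the base placeholder were replaced first, `pre_aggs.orders_main__usage_0` would become `pre_aggs.orders_main_base__usage_0`, and the usage-specific placeholder would no longer be found.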

Concerns:

See inline comments for specific issues around the Join variant in source_with_usage_index, the extract_date_range first-match behavior, and the redundant condition in base_query.rs

Security

No security concerns identified. The changes are internal to the query planning pipeline and don't introduce any new external inputs, SQL injection vectors, or authentication/authorization changes.

Performance

The String::replace in base_query.rs:97-98, used for single-usage backward compatibility, scans the full SQL string, but the cost is negligible because it only runs for single-usage queries
The partition filtering loop in PreAggregationPartitionRangeLoader.ts for per-usage date ranges is O(partitions × usages), which is acceptable given typical cardinalities
group_usages in base_query.rs uses a HashMap and sorts the output — also fine for expected sizes

No performance bottlenecks identified.
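
For reference, the O(partitions × usages) filtering described above has roughly the following shape. This is a sketch under assumptions, not the code in PreAggregationPartitionRangeLoader.ts: `Partition`, `filterPartitions`, and the convention that a `null` usage range means "no date filter, keep everything" are all hypothetical.

```typescript
type DateRange = [string, string]; // ISO-8601 strings in one format, inclusive [from, to]

interface Partition {
  tableName: string;
  range: DateRange;
}

// Two inclusive ranges overlap when each starts before the other ends.
function overlaps(a: DateRange, b: DateRange): boolean {
  return a[0] <= b[1] && b[0] <= a[1];
}

// Keep a partition if at least one usage needs it.
// Nested iteration: up to partitions.length * usageRanges.length comparisons.
function filterPartitions(
  partitions: Partition[],
  usageRanges: Array<DateRange | null>
): Partition[] {
  return partitions.filter((p) =>
    // Assumption: a null range is an unbounded usage that needs every partition.
    usageRanges.some((r) => r === null || overlaps(p.range, r))
  );
}

const monthlyPartitions: Partition[] = [
  { tableName: "p_jan", range: ["2024-01-01T00:00:00.000", "2024-01-31T23:59:59.999"] },
  { tableName: "p_feb", range: ["2024-02-01T00:00:00.000", "2024-02-29T23:59:59.999"] },
  { tableName: "p_mar", range: ["2024-03-01T00:00:00.000", "2024-03-31T23:59:59.999"] },
];

const neededPartitions = filterPartitions(monthlyPartitions, [
  ["2024-01-10T00:00:00.000", "2024-01-20T23:59:59.999"],
  ["2024-03-01T00:00:00.000", "2024-03-05T23:59:59.999"],
]);
```

With a handful of usages per query and partition counts in the tens or hundreds, the nested loop stays small, which matches the assessment above.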

Testing

Coverage is good for the happy path:

2 Rust integration tests: separate pre-aggs (same cube, different measures), and time-shift with different date ranges
2 TypeScript integration tests: two multi-stage measures with different pre-aggs, and time-shift partition loading
Snapshot tests verify correct SQL output
Test schemas are well-constructed to exercise the feature

Potential gaps:

Rollback path: No explicit test where one subquery matches a pre-aggregation but another doesn't, verifying the optimizer correctly rolls back to no pre-aggregations at all
Join pre-aggregations in multi-stage: source_with_usage_index has a Join branch that doesn't set usage_index, and no test covers this case
Edge case — same pre-aggregation used by multiple subqueries: Both subqueries matching the same pre-aggregation would create two usages with different indices for the same physical table. This scenario works in the grouping logic (group_usages), but an explicit test would build confidence
Empty date range handling: When extract_date_range finds no date range filters, None is propagated — worth testing that partition loading handles this correctly for multi-usage scenarios
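
The rollback gap listed first could be covered by a test along the following lines. The sketch models the all-or-nothing behavior attributed to optimizer.rs, but in TypeScript rather than Rust and with invented names: `UsageTracker`, `tryMatchAll`, and the matcher callback are hypothetical and do not mirror the planner's real API.

```typescript
interface Usage {
  index: number; // would become the N in __usage_N
  preAggregation: string;
}

class UsageTracker {
  readonly usages: Usage[] = [];
  private counter = 0;

  // Either records a usage for every subquery or leaves the tracker untouched.
  tryMatchAll(subqueries: string[], matcher: (subquery: string) => string | null): boolean {
    const savedLength = this.usages.length;
    const savedCounter = this.counter;
    for (const subquery of subqueries) {
      const match = matcher(subquery);
      if (match === null) {
        // Roll back: drop usages added in this attempt and restore the counter.
        this.usages.length = savedLength;
        this.counter = savedCounter;
        return false;
      }
      this.usages.push({ index: this.counter++, preAggregation: match });
    }
    return true;
  }
}

// Hypothetical matcher: only "revenue" and "cost" have a matching pre-aggregation.
const knownRollups = new Map<string, string>([
  ["revenue", "orders.revenue_rollup"],
  ["cost", "orders.cost_rollup"],
]);
const matchRollup = (subquery: string): string | null => knownRollups.get(subquery) ?? null;

const tracker = new UsageTracker();
const partialAttempt = tracker.tryMatchAll(["revenue", "count"], matchRollup);
const usagesAfterPartial = tracker.usages.length;
const fullAttempt = tracker.tryMatchAll(["revenue", "cost"], matchRollup);
```

The point of the second call is that indices start again from 0 after a failed attempt, so a rolled-back usage never leaks into the placeholder numbering.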

Documentation

The PR doesn't include documentation updates. Since this is an internal planner optimization that doesn't change any user-facing API or configuration, that's acceptable. The code comments explaining the rollback logic and backward-compat choices are helpful.

Overall Assessment

This is a well-structured feature that enables an important optimization for multi-stage queries. The architecture (usage indices, placeholder suffixes, grouped usage info) is clean and extensible. The backward-compatibility strategy is sound.

The main areas for improvement are: (1) adding tests for the rollback/failure path, (2) clarifying the Join variant behavior in source_with_usage_index, and (3) minor code cleanup noted in inline comments. See the 8 inline comments for specific details.

rust/cubesqlplanner/cubesqlplanner/src/logical_plan/optimizers/pre_aggregation/optimizer.rs

rust/cubesqlplanner/cubesqlplanner/src/planner/base_query.rs

packages/cubejs-schema-compiler/test/integration/utils/BaseDbRunner.ts

packages/cubejs-query-orchestrator/src/orchestrator/QueryCache.ts

packages/cubejs-schema-compiler/src/adapter/BaseQuery.js

packages/cubejs-query-orchestrator/src/orchestrator/PreAggregationPartitionRangeLoader.ts

rust/cubesqlplanner/cubesqlplanner/src/planner/top_level_planner.rs

codecov · 2026-04-14T13:07:17Z

Codecov Report

❌ Patch coverage is 51.66667% with 29 lines in your changes missing coverage. Please review.
✅ Project coverage is 78.79%. Comparing base (6f79820) to head (2b4fad1).
⚠️ Report is 20 commits behind head on master.

Files with missing lines	Patch %	Lines
...orchestrator/PreAggregationPartitionRangeLoader.ts	0.00%	23 Missing ⚠️
...-query-orchestrator/src/orchestrator/QueryCache.ts	0.00%	4 Missing ⚠️
...ejs-schema-compiler/src/adapter/PreAggregations.ts	90.90%	1 Missing and 1 partial ⚠️

Additional details and impacted files

@@             Coverage Diff             @@
##           master   #10684       +/-   ##
===========================================
+ Coverage   57.84%   78.79%   +20.95%     
===========================================
  Files         215      465      +250     
  Lines       16609    91962    +75353     
  Branches     3336     3363       +27     
===========================================
+ Hits         9607    72461    +62854     
- Misses       6514    19010    +12496     
- Partials      488      491        +3

Flag	Coverage Δ
cube-backend	`57.99% <51.66%> (+0.14%)`	⬆️
cubesql	`83.41% <ø> (?)`

waralexrom added 13 commits April 9, 2026 16:47

feat(tesseract): Support separate pre-aggregations for different mult…

4f80d07

…i-stage subqueries

in work

6b69875

in work

e7db7e3

in work

01ef908

in work

42cf36f

in work

f81988e

in work

2c07309

in work

2bf4b01

in work

93becb3

fmt

14ce271

fix

0f72f20

fix

f838708

fix

1f4cfe8

waralexrom requested review from a team as code owners April 14, 2026 12:56

github-actions bot added rust and javascript labels Apr 14, 2026

vercel bot deployed to Preview April 14, 2026 12:57