Skip to content

Add OnnxDiscrepancyCheck speedup metric with default timing updates#2502

Open
xadupre wants to merge 8 commits into
mainfrom
xadupre/speedup
Open

Add OnnxDiscrepancyCheck speedup metric with default timing updates#2502
xadupre wants to merge 8 commits into
mainfrom
xadupre/speedup

Conversation

@xadupre

@xadupre xadupre commented Jun 5, 2026

Copy link
Copy Markdown
Member

Describe your changes

Added speedup measurement for OnnxDiscrepancyCheck and updated behavior based on review feedback:

  • Changed timing_iterations default from 10 to 5.
  • If timing_iterations is set to 0, speedup measurement is skipped.
  • Added unit tests to validate the new default and the skip behavior.
  • Fixed test mocks to properly configure device attribute for compare_generation tests.

Checklist before requesting a review

  • Add unit tests for this change.
  • Make sure all tests can pass.
  • Update documents if necessary.
  • Lint and apply fixes to your code by running lintrunner -a
  • Is this a user-facing change? If yes, give a description of this change to be included in the release notes.

Comment thread olive/passes/onnx/discrepancy_check.py
Copilot AI changed the title add speedup to OnnxDiscrepancyCheck Add OnnxDiscrepancyCheck speedup metric with default timing updates Jun 5, 2026
@xadupre xadupre marked this pull request as ready for review June 5, 2026 17:28
Copilot AI review requested due to automatic review settings June 5, 2026 17:28
Comment thread test/passes/onnx/test_discrepancy_check.py Fixed
Comment thread test/passes/onnx/test_discrepancy_check.py Fixed

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR enhances the OnnxDiscrepancyCheck pass by adding an inference speedup measurement (ONNX vs PyTorch) and introducing configurable warmup/timing iteration settings, with updated defaults and tests to validate the new behavior.

Changes:

  • Added warmup_iterations and timing_iterations config parameters (defaulting timing_iterations to 5) and implemented speedup measurement with an option to skip when timing_iterations=0.
  • Updated session/device setup to target the configured accelerator (with CPU fallback) and run PyTorch on the matched torch device.
  • Added unit tests to validate the new default and the skip behavior.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File Description
olive/passes/onnx/discrepancy_check.py Adds speedup timing, new config params/defaults, and device-aware session/model handling.
test/passes/onnx/test_discrepancy_check.py Adds tests for the new timing_iterations default and the “skip when 0” behavior.

Comment thread olive/passes/onnx/discrepancy_check.py
Comment on lines +369 to 371
return speedup

def compare_generation(self, config: type[BasePassConfig], ref_model) -> int:

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is already addressed in commit 595bffc. input_ids is moved to ref_model.device (line 389) and the output is moved back to CPU via .cpu() before .tolist() (line 398).

Copilot AI requested a review from xiaoyu-work June 5, 2026 20:47
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
@xadupre

xadupre commented Jun 11, 2026

Copy link
Copy Markdown
Member Author

@copilot test/passes/onnx/test_discrepancy_check.py:105: in test_compare_generation_returns_common_prefix_length
result = pass_instance.compare_generation(config, mock_ref_model)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
olive/passes/onnx/discrepancy_check.py:389: in compare_generation
input_ids = input_ids.to(ref_model.device)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
E TypeError: to() received an invalid combination of arguments - got (MagicMock), but expected one of:
E * (torch.device device = None, torch.dtype dtype = None, bool non_blocking = False, bool copy = False, *, torch.memory_format memory_format = None)
E * (torch.dtype dtype, bool non_blocking = False, bool copy = False, *, torch.memory_format memory_format = None)
E * (Tensor tensor, bool non_blocking = False, bool copy = False, *, torch.memory_format memory_format = None)

Copilot AI commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

@copilot test/passes/onnx/test_discrepancy_check.py:105: in test_compare_generation_returns_common_prefix_length
result = pass_instance.compare_generation(config, mock_ref_model)
^^...

Fixed in 25c6ab4. Added mock_ref_model.device = torch.device("cpu") to both compare_generation test cases so that the mock returns a proper torch.device object instead of a MagicMock.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants