feat(torch): add generated operator bases #622
Open
voltjia wants to merge 1 commit into
Summary
- Adds generated operator base headers under `src/base/`, regenerated from the current PyTorch codegen output merged in PR #619 (feat(torch): expose optional codegen parameters).
- Changes are limited to `src/base/`; no generator, test, wrapper, build-system, or CI files are changed in this PR.

Motivation
The PyTorch codegen work needs a checked-in operator-base layer that matches the current generator behavior, including optional-parameter overload support from PR #619. This PR contains only the generated public base headers, making the downstream base layer reviewable separately from the generator changes.
Closes # N/A — no dedicated issue.
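A reviewer who wants to confirm that the checked-in headers match local generator output could compare the two directories by content. The following is a minimal sketch under assumptions: the function name `diff_generated_headers` and both directory arguments are hypothetical and not part of this repository, and the sketch relies only on Python's standard `filecmp` and `pathlib` modules.

```python
import filecmp
from pathlib import Path


def diff_generated_headers(checked_in: Path, regenerated: Path) -> dict[str, list[str]]:
    """Compare the *.h files of two flat directories by content.

    Both directories are assumptions for illustration: `checked_in` stands
    for a checkout's src/base/, `regenerated` for local codegen output.
    """
    checked = {p.name for p in checked_in.glob("*.h")}
    fresh = {p.name for p in regenerated.glob("*.h")}
    common = sorted(checked & fresh)

    # shallow=False forces a byte-by-byte comparison instead of trusting
    # file size and modification time alone.
    _match, mismatch, errors = filecmp.cmpfiles(
        checked_in, regenerated, common, shallow=False
    )

    return {
        "only_checked_in": sorted(checked - fresh),
        "only_regenerated": sorted(fresh - checked),
        "changed": sorted(mismatch + errors),
    }
```

Headers that a local PyTorch build generates but that are not checked in would appear under `only_regenerated`; whether such entries are a problem depends on which bases are intentionally left out of the checked-in set.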
Type of Change
- `feat`: new feature / new operator / new platform.
- `fix`: bug fix.
- `perf`: performance improvement (no behavioral change).
- `refactor`: code restructuring without behavior change.
- `test`: adding or fixing tests only.
- `docs`: documentation only.
- `build` / `ci`: build system or CI configuration.
- `chore`: tooling, formatting, or other non-code changes.

Platforms Affected
- `WITH_CPU`
- `WITH_NVIDIA`
- `WITH_ILUVATAR`
- `WITH_METAX`
- `WITH_CAMBRICON`
- `WITH_MOORE`
- `WITH_ASCEND`
- `WITH_TORCH`

Test Results on Supported Platforms
All rows used a full bare `python3 -m pytest -v` run, without `tests/`, `--devices`, or `-n`. Each build regenerated PyTorch operator sources first, installed with `WITH_TORCH=ON`, and smoke-checked representative generated PyTorch operators after install. Build times are from the `pip install` phase recorded by the local validation runner; pytest times are from the timed pytest command; total time is `build + pytest`.

| `pytest` Result |
| --- |
| 9279 passed, 8565 skipped |
| 7777 passed, 8549 skipped |
| 8771 passed, 7555 skipped |
| 5974 passed, 9968 skipped |
| 8537 passed, 7807 skipped |
| 7454 passed, 8830 skipped |

Full `pytest` output (optional)
The test counts are expected to match the PyTorch codegen coverage from PR #619 because this PR only checks in generated base headers from that generator. The only observed difference from the latest PR #619 table is on Ascend: one `tests/test_torch_ops.py` `inner` case is skipped instead of passed.

The generated `inner` base, binding metadata, and PyTorch backend source are identical between PR #619's generated output and this PR's checked-in base. Ascend still builds successfully, smoke checks show the PyTorch slot active for Ascend, and the full pytest run exits successfully. This is recorded as a non-blocking skip-count drift rather than a build or execution regression.

Benchmark / Performance Impact
N/A — this PR checks in generated base headers only. The table above records build and test wall time for each platform.
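The bookkeeping described in the test results (comparing pass/skip counts against an earlier run, and `total = build + pytest`) can be sketched as follows. This is an illustrative sketch only: the helper names `parse_summary`, `count_drift`, and `total_seconds` are hypothetical and do not come from the local validation runner, and the numbers in the usage lines are made up for the example rather than taken from any platform's run.

```python
import re

# Matches fragments such as "9279 passed" or "8565 skipped".
_COUNT = re.compile(r"(\d+) (passed|skipped|failed)")


def parse_summary(text: str) -> dict[str, int]:
    """Extract per-category counts from a pytest-style result string."""
    return {kind: int(num) for num, kind in _COUNT.findall(text)}


def count_drift(baseline: str, current: str) -> dict[str, int]:
    """Return current-minus-baseline for each category whose count changed."""
    old, new = parse_summary(baseline), parse_summary(current)
    drift = {}
    for kind in sorted(set(old) | set(new)):
        delta = new.get(kind, 0) - old.get(kind, 0)
        if delta != 0:
            drift[kind] = delta
    return drift


def total_seconds(build_s: float, pytest_s: float) -> float:
    """Total wall time as defined above: build phase plus pytest phase."""
    return build_s + pytest_s


# Illustrative values only: one case moving from passed to skipped.
print(count_drift("10 passed, 3 skipped", "9 passed, 4 skipped"))
# {'passed': -1, 'skipped': 1}
print(total_seconds(120.0, 45.5))
# 165.5
```

A drift of one `passed` to one `skipped` with no `failed` entries is the shape of difference the Ascend note describes; a non-empty `failed` delta would point to an execution regression instead.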
Notes for Reviewers
This PR is rebased on the latest `master`, after PR #619 was merged. The generated base files are intentionally checked in as generator output. File paths are kept flat under `src/base/`.

The generated bases intentionally omit `src/base/all.h`, `src/base/any.h`, and `src/base/internal_scaled_mm.h` in this PR because their ATen schemas vary across installed PyTorch builds; those forms are better regenerated by the local codegen environment instead of frozen as stable public bases.

Checklist
Title, Branch, and Commits
- … `feat(nvidia): …`, `fix(cuda/gemm): …`).
- … `<type>/xxx-yyyy-zzzz` where `<type>` matches the PR title's Conventional Commits type and words are joined with hyphens (see `CONTRIBUTING.md` §Branches).
- … `CONTRIBUTING.md` §Pull Requests).
- … `master`; the branch is rebased cleanly on top of current `master`.
- … `fixup!` / `squash!` / `wip` commits remain.

Scope and Design
- … `CONTRIBUTING.md` §Code/General).
- … `printf` / `std::cout` / `print(...)` left behind, or `TODO` without an owner and issue link.

General Code Hygiene
- … `CONTRIBUTING.md` §Code/General).
- … `CONTRIBUTING.md` §Code/General).
- … the `seqlens_k` tensor) (`CONTRIBUTING.md` §Code/General).
- … `CONTRIBUTING.md` §Code/General).
- … `CONTRIBUTING.md` §Code/General; §Python).

C++ Specific
- `clang-format --dry-run --Werror` passes on all modified `src/base/*.h` files.
- `clang-tidy` concerns (per `.clang-tidy`) have been reviewed; no new warnings beyond the existing baseline.
- … `CONTRIBUTING.md` §C++).
- … `assert` with messages that include at least `__FILE__`, `__LINE__`, and `__func__` (`CONTRIBUTING.md` §C++).
- … `CONTRIBUTING.md` §C++).
- … `CONTRIBUTING.md` §C++).
- … `CONTRIBUTING.md` §C++).
- … `CONTRIBUTING.md` §C++).
- … `src/base/<op>.h` (inheriting `Operator<Op>`) with generated PyTorch backends provided by PR #619 (feat(torch): expose optional codegen parameters) (`CONTRIBUTING.md` §Adding an Operator).
- … `new` / `delete`; RAII / smart pointers / existing allocators are used.

Python Specific
Testing
- `pytest` was run locally on every supported platform that this PR can affect, and the results are recorded in the "Test Results" table above (`CONTRIBUTING.md` §Pull Requests).

Build, CI, and Tooling
- `compile_commands.json` still regenerates through the existing CMake/scikit-build configuration path.
- `CMakeLists.txt` is not broken.
- … `clang-format.yml`, `ruff.yml`) are expected to remain green; this PR only changes generated C++ headers under `src/base/`.

Documentation
Security and Safety