feat(torch): expose optional codegen parameters #619
Generated source/header archive for review:
Additional compatibility validation for the latest commit (
Remaining skipped existing base headers in the overlay are schema/name compatibility warnings rather than build failures:
Rebased onto current Additional validation after rebase:
The rebase exposed a wrapper dispatch generation issue for overloads that reuse the same optional parameter names across scalar and Tensor variants (
Latest operator-bases overlay validation for
Remaining generation warnings are existing base/schema drift cases that are skipped rather than emitted as broken code:
@crapromer, please do the initial review; @Ziminli, please do the final review.
Summary
- … `src/base/<op>.h` overloads when available, forwarding omitted optional/default ATen parameters as typed defaults.
- … `std::optional<T>` support to operator cache hashing and update the generated torch-op test harness for optional arguments and known vendor-specific PyTorch crashes/divergences.

Motivation
The PyTorch code generator previously hid optional ATen schema parameters and always forwarded typed `nullopt` values. That made generated APIs unable to exercise non-default optional behavior and caused drift against operator base headers that intentionally expose optional parameters. This PR makes optional schema handling explicit while keeping existing hand-written bases as the public API source of truth when they are present.

Closes # N/A: this is follow-up work from the PyTorch codegen/base drift discussion.
Type of Change
- `feat`: new feature / new operator / new platform.
- `fix`: bug fix.
- `perf`: performance improvement (no behavioral change).
- `refactor`: code restructuring without behavior change.
- `test`: adding or fixing tests only.
- `docs`: documentation only.
- `build` / `ci`: build system or CI configuration.
- `chore`: tooling, formatting, or other non-code changes.

Platforms Affected
- `WITH_CPU`
- `WITH_NVIDIA`
- `WITH_ILUVATAR`
- `WITH_METAX`
- `WITH_CAMBRICON`
- `WITH_MOORE`
- `WITH_ASCEND`
- `WITH_TORCH`

Test Results on Supported Platforms
All runs were rebased on current `master`, generated PyTorch operator sources before build, installed with `WITH_TORCH=ON`, and ran full verbose pytest as `python3 -m pytest -v` without `tests/`, `--devices`, or `-n`.

| `pytest` Result |
| --- |
| 9279 passed, 8565 skipped |
| 7777 passed, 8549 skipped |
| 8771 passed, 7555 skipped |
| 5974 passed, 9968 skipped |
| 8537 passed, 7807 skipped |
| 7455 passed, 8829 skipped |

Validation details

The test counts differ from earlier PR-body snapshots because this branch was rebased after the generated operator-base stack and now runs full bare pytest with `WITH_TORCH=ON`. PyTorch-backed tests are collected and executed on every platform.

All checks passed on the rebased branch.
Benchmark / Performance Impact
N/A — this PR changes generated API/backend plumbing and tests. The table above records build and test wall time for each platform to support follow-up compile-time optimization work.
Notes for Reviewers
- Downstream PR "feat(torch): add generated operator bases" #622 was regenerated from this PR after the public C++ parameter-name fix (`self` remains an ATen schema name internally, while generated public C++ signatures use `input`) and passed full-platform validation with `WITH_TORCH=ON`. Those results are recorded on PR #622 to avoid mixing downstream generated-base changes into this PR's own table.
- Existing `src/base/<op>.h` overloads are treated as the public API when present. The generator binds compatible overloads to ATen schema parameters and fills omitted optional/default schema parameters at the ATen call site.
- Generated fresh bases now expose supported optional types as `std::optional<...>`. PyTorch-internal optional types without stable InfiniOps representations remain hidden and are forwarded as typed empty optionals.
- The generator reads the locally installed PyTorch `torchgen` packaged `native_functions.yaml`, so generated op availability follows the PyTorch schema available in the build environment.
- The test harness skips only known vendor-kernel crashes/divergences that otherwise terminate the Python process or compare mismatched vendor paths; PyTorch-backed tests are still collected and executed on every platform.
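
The binding and optional-type rules in the notes above can be sketched roughly as follows. This is a minimal illustration, not the generator's actual code: `SchemaParam`, `ATEN_TO_CPP`, `PUBLIC_OPTIONAL`, `public_name`, `public_type`, and `call_site_args` are hypothetical names, and which optional element types count as "supported" is an assumption made only for the example.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical ATen-type -> C++ type table; the generator's real table is not
# shown in this PR description.
ATEN_TO_CPP = {
    "Tensor": "at::Tensor",
    "Scalar": "at::Scalar",
    "int": "int64_t",
    "float": "double",
    "bool": "bool",
    "Generator": "at::Generator",
}

# Assumed set of optional element types with a stable public representation.
PUBLIC_OPTIONAL = {"int", "float", "bool", "Scalar"}


@dataclass(frozen=True)
class SchemaParam:
    name: str                      # ATen schema name, e.g. "self"
    aten_type: str                 # e.g. "Tensor", "Scalar?", "Generator?"
    default: Optional[str] = None  # C++ literal for a schema default, if any


def public_name(param):
    # Public C++ signatures use "input"; "self" stays internal to the schema.
    return "input" if param.name == "self" else param.name


def public_type(param):
    """Return the exposed C++ type, or None if the parameter stays hidden."""
    if not param.aten_type.endswith("?"):
        return ATEN_TO_CPP[param.aten_type]
    inner = param.aten_type[:-1]
    if inner in PUBLIC_OPTIONAL:
        return f"std::optional<{ATEN_TO_CPP[inner]}>"
    return None


def call_site_args(schema, overload_names):
    """Build ATen call arguments for an overload exposing `overload_names`."""
    args = []
    for param in schema:
        name = public_name(param)
        if name in overload_names:
            args.append(name)  # forwarded from the public signature
        elif param.default is not None:
            args.append(param.default)  # omitted parameter with a schema default
        elif param.aten_type.endswith("?"):
            inner = ATEN_TO_CPP[param.aten_type[:-1]]
            args.append(f"std::optional<{inner}>{{}}")  # typed empty optional
        else:
            raise ValueError(f"overload does not supply required '{name}'")
    return args


# Example: clamp(Tensor self, Scalar? min=None, Scalar? max=None) bound to a
# hand-written overload that only exposes (input, min).
clamp_schema = [
    SchemaParam("self", "Tensor"),
    SchemaParam("min", "Scalar?"),
    SchemaParam("max", "Scalar?"),
]
clamp_args = call_site_args(clamp_schema, {"input", "min"})
```

With these assumptions, `clamp_args` is `['input', 'min', 'std::optional<at::Scalar>{}']`: the exposed parameters are forwarded by their public names and the omitted `max` becomes a typed empty optional at the call site. An overload that omits a required, non-defaulted parameter is rejected instead of being bound.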
Checklist
Title, Branch, and Commits
- … `feat(nvidia): …`, `fix(cuda/gemm): …`).
- … `<type>/xxx-yyyy-zzzz` where `<type>` matches the PR title's Conventional Commits type and words are joined with hyphens (see `CONTRIBUTING.md` §Branches).
- … `master`: the branch is rebased cleanly on top of the current `master`.
- … `fixup!` / `squash!` / `wip` commits remain.

Scope and Design
- … `printf` / `std::cout` / `print(...)` left behind, or `TODO` without an owner and issue link.

General Code Hygiene
- … the `seqlens_k` tensor) (`CONTRIBUTING.md` §Code/General).

C++ Specific
- `clang-format --dry-run --Werror src/hash.h` passes.
- `clang-tidy` was not run; no kernel or algorithm implementation path is added.
- … `src/base/<op>.h` or platform implementation directories.
- … `new` / `delete`; RAII / smart pointers / existing allocators are used.

Python Specific
- `ruff check` passes cleanly.
- `ruff format --check` passes cleanly.
- … `if`, `for`, and similar control-flow statements (`CONTRIBUTING.md` §Python).
- … `return`, except when it directly follows a control-flow statement like `if` or `for` (`CONTRIBUTING.md` §Python).

Testing
- … `WITH_TORCH=ON`.
- … `tests/`.
- … `pytest.mark.parametrize` correctly.
- `pytest.mark.auto_act_and_assert` is not used by the generator unit tests or generated torch-op harness touched here.
- … `dtype` / `device` parameterization is relied on, or overridden with an explicit `pytest.mark.parametrize` when necessary.

Build, CI, and Tooling
- `compile_commands.json` still regenerates through the existing CMake/scikit-build configuration path.
- `CMakeLists.txt` is not changed.
- `ruff` and `clang-format` checks are green.
- … `pyproject.toml`'s `[project.optional-dependencies]`.

Documentation
Security and Safety