feat(torch): add generated operator bases by voltjia · Pull Request #622 · InfiniTensor/InfiniOps

voltjia · 2026-05-27T17:43:18Z

Summary

Add generated C++ operator base headers under src/base/, regenerated from the current PyTorch codegen output merged in PR #619 (feat(torch): expose optional codegen parameters).
Keep the generated base files flat under src/base/; no generator, test, wrapper, build-system, or CI files are changed in this PR.
Omit generated bases whose ATen schemas are known to vary across installed PyTorch builds, so they can remain generated by the local codegen environment instead of being frozen as stable public bases.

Motivation

The PyTorch codegen work needs a checked-in operator-base layer that matches the current generator behavior, including optional-parameter overload support from PR #619. This PR contains only the generated public base headers, making the downstream base layer reviewable separately from the generator changes.

Closes # N/A — no dedicated issue.

Type of Change

feat — new feature / new operator / new platform.
N/A — fix — bug fix.
N/A — perf — performance improvement (no behavioral change).
N/A — refactor — code restructuring without behavior change.
N/A — test — adding or fixing tests only.
N/A — docs — documentation only.
N/A — build / ci — build system or CI configuration.
N/A — chore — tooling, formatting, or other non-code changes.
N/A — Breaking change.

Platforms Affected

CPU (WITH_CPU)
NVIDIA (WITH_NVIDIA)
Iluvatar (WITH_ILUVATAR)
MetaX (WITH_METAX)
Cambricon (WITH_CAMBRICON)
Moore (WITH_MOORE)
Ascend (WITH_ASCEND)
PyTorch C++ bindings (WITH_TORCH)
N/A — Build system / CMake / CI; no build-system or CI files are changed.
Python bindings / user-facing API

Test Results on Supported Platforms

All rows used a full bare python3 -m pytest -v run, without tests/, --devices, or -n. Each build regenerated PyTorch operator sources first, installed with WITH_TORCH=ON, and smoke-checked representative generated PyTorch operators after install. Build times are from the pip install phase recorded by the local validation runner; pytest times are from the timed pytest command; total time is build + pytest.

Platform	Built	`pytest` Result	Build Time	Pytest Time	Total Time	Notes / Hardware
NVIDIA	Yes	`9279 passed, 8565 skipped`	1126s	396s	1522s	Full bare pytest. PyTorch backend compiled and generated torch-op tests were included.
Iluvatar	Yes	`7777 passed, 8549 skipped`	821s	639s	1460s	Full bare pytest. PyTorch backend compiled and generated torch-op tests were included.
MetaX	Yes	`8771 passed, 7555 skipped`	1448s	436s	1884s	Full bare pytest. PyTorch backend compiled and generated torch-op tests were included.
Cambricon	Yes	`5974 passed, 9968 skipped`	2308s	999s	3307s	Full bare pytest. PyTorch backend compiled and generated torch-op tests were included.
Moore	Yes	`8537 passed, 7807 skipped`	2322s	671s	2993s	Full bare pytest. PyTorch backend compiled and generated torch-op tests were included.
Ascend	Yes	`7454 passed, 8830 skipped`	1125s	615s	1740s	Full bare pytest. PyTorch backend compiled and generated torch-op tests were included.
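
The stated relationship (total time is build + pytest) can be checked against the rows above. This is a minimal sketch with the figures copied from the table; the `ROWS` name and the tuple layout are illustrative only and are not part of the validation runner.

```python
# Rows copied from the table above: (platform, build_s, pytest_s, total_s).
ROWS = [
    ("NVIDIA", 1126, 396, 1522),
    ("Iluvatar", 821, 639, 1460),
    ("MetaX", 1448, 436, 1884),
    ("Cambricon", 2308, 999, 3307),
    ("Moore", 2322, 671, 2993),
    ("Ascend", 1125, 615, 1740),
]


def inconsistent_rows(rows):
    """Return the platforms whose total is not build + pytest."""
    return [name for name, build, test, total in rows if build + test != total]


print(inconsistent_rows(ROWS))  # prints [] because every row adds up
```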

Full `pytest` output (optional)

NVIDIA:    9279 passed, 8565 skipped in 390.66s (0:06:30)
Iluvatar:  7777 passed, 8549 skipped in 635.02s (0:10:35)
MetaX:     8771 passed, 7555 skipped in 418.39s (0:06:58)
Cambricon: 5974 passed, 9968 skipped in 990.75s (0:16:30)
Moore:     8537 passed, 7807 skipped in 662.66s (0:11:02)
Ascend:    7454 passed, 8830 skipped in 597.06s (0:09:57)

The test counts are expected to match the PyTorch codegen coverage from PR #619 because this PR only checks in generated base headers from that generator. The only observed difference from the latest PR #619 table is on Ascend: one tests/test_torch_ops.py inner case is skipped instead of passed:

tests/test_torch_ops.py::test_op[npu-dtype1-0.01-0.01-13x4-inner]

The generated inner base, binding metadata, and PyTorch backend source are identical between PR #619's generated output and this PR's checked-in base. Ascend still builds successfully, smoke checks show the PyTorch slot active for Ascend, and the full pytest run exits successfully. This is recorded as a non-blocking skip-count drift rather than a build or execution regression.
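
One way such a drift could be spotted is by diffing per-test outcomes between two runs. The sketch below is only an illustration: how outcomes are collected is not specified in this PR, the two dictionaries are hand-written stand-ins, and `OTHER` is a hypothetical test ID added to show an unchanged case.

```python
def outcome_drift(before, after):
    """Map each test ID whose outcome changed to its (before, after) pair."""
    return {
        test_id: (before[test_id], after[test_id])
        for test_id in sorted(before.keys() & after.keys())
        if before[test_id] != after[test_id]
    }


# Illustrative inputs: only the Ascend case named above changes outcome.
INNER = "tests/test_torch_ops.py::test_op[npu-dtype1-0.01-0.01-13x4-inner]"
OTHER = "tests/test_example.py::test_unchanged"  # hypothetical test ID
pr_619 = {INNER: "passed", OTHER: "passed"}
this_pr = {INNER: "skipped", OTHER: "passed"}

print(outcome_drift(pr_619, this_pr))
```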

Benchmark / Performance Impact

N/A — this PR checks in generated base headers only. The table above records build and test wall time for each platform.

Notes for Reviewers

This PR is rebased on the latest master, after PR #619 was merged. The generated base files are intentionally checked in as generator output. File paths are kept flat under src/base/.

The generated bases intentionally omit src/base/all.h, src/base/any.h, and src/base/internal_scaled_mm.h in this PR because their ATen schemas vary across installed PyTorch builds; those headers are better regenerated by the local codegen environment than frozen as stable public bases.
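
A reviewer who wants to confirm the omission locally could run something like the following. It is a sketch, not part of this PR: `unexpectedly_present` is a hypothetical helper, and the demo uses a temporary directory with a placeholder `add.h` file standing in for src/base/.

```python
import tempfile
from pathlib import Path

# Header names taken from the note above.
OMITTED = ("all.h", "any.h", "internal_scaled_mm.h")


def unexpectedly_present(base_dir):
    """Return the omitted header names that exist directly under base_dir."""
    base = Path(base_dir)
    return [name for name in OMITTED if (base / name).is_file()]


# Demo on a throwaway directory standing in for src/base/.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "add.h").write_text("// placeholder\n")  # hypothetical header
    print(unexpectedly_present(tmp))  # prints []
```

Pointing the helper at a real checkout's src/base/ directory should return an empty list if the three headers are absent as described.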

Checklist

Title, Branch, and Commits

PR title follows Conventional Commits (e.g. feat(nvidia): …, fix(cuda/gemm): …).
Branch name follows <type>/xxx-yyyy-zzzz where <type> matches the PR title's Conventional Commits type and words are joined with hyphens (see CONTRIBUTING.md §Branches).
Each commit message follows Conventional Commits.
Small PR is a single squashable commit; or, for a large PR, every commit is meaningful, well-formed, and independently reviewable (see CONTRIBUTING.md §Pull Requests).
No stray merge commits from master — the branch is rebased cleanly on top of current master.
No fixup! / squash! / wip commits remain.
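
The title and branch rules above can be approximated with two regular expressions. This is a rough sketch under stated assumptions: the patterns are a guess at the convention (lowercase type, optional scope, hyphen-joined lowercase words) and may be stricter or looser than what CONTRIBUTING.md actually requires.

```python
import re

# Conventional Commits prefix: type, optional (scope), optional "!", then ": ".
TITLE_RE = re.compile(r"^(?P<type>[a-z]+)(\([^)]+\))?!?: \S")
# Branch: <type>/ followed by lowercase words joined with hyphens.
BRANCH_RE = re.compile(r"^(?P<type>[a-z]+)/[a-z0-9]+(-[a-z0-9]+)*$")


def branch_matches_title(title, branch):
    """True if both parse and share the same Conventional Commits type."""
    t = TITLE_RE.match(title)
    b = BRANCH_RE.match(branch)
    return bool(t and b and t.group("type") == b.group("type"))


print(branch_matches_title(
    "feat(torch): add generated operator bases", "feat/torch-operator-bases"
))  # prints True
```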

Scope and Design

Changes are minimal — nothing unrelated to the stated motivation was added (CONTRIBUTING.md §Code/General).
No dead code, commented-out blocks, debug prints, printf/std::cout/print(...) left behind, or TODO without an owner and issue link.
No unrelated formatting churn that would obscure the diff.
Public API changes are intentional, documented in this PR, and reflected in affected callers/tests.

General Code Hygiene

The code is self-explanatory; comments were added only where the why is non-obvious (CONTRIBUTING.md §Code/General).
Every modified or added file ends with a single trailing newline (CONTRIBUTING.md §Code/General).
No trailing whitespace, tab/space mixing, or stray BOMs.
Identifiers in comments and error messages are wrapped in backticks (e.g. the `seqlens_k` tensor) (CONTRIBUTING.md §Code/General).
All comments and error messages are in English (CONTRIBUTING.md §Code/General).
Comments and error messages are complete sentences — capitalized first letter, terminal punctuation — unless the language/framework convention says otherwise (CONTRIBUTING.md §Code/General; §Python).
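
The whitespace items in this list lend themselves to a mechanical check. The function below is a minimal sketch, not the project's tooling: `hygiene_problems` is a hypothetical helper that covers only the BOM, trailing-newline, and trailing-whitespace items, and it does not detect tab/space mixing.

```python
def hygiene_problems(text):
    """Return a list of whitespace problems found in one file's text."""
    problems = []
    if text.startswith("\ufeff"):
        problems.append("starts with a BOM")
    if not text.endswith("\n") or text.endswith("\n\n"):
        problems.append("does not end with exactly one newline")
    for number, line in enumerate(text.splitlines(), start=1):
        if line != line.rstrip():
            problems.append(f"line {number}: trailing whitespace")
    return problems


print(hygiene_problems("#pragma once\n"))  # prints []
print(hygiene_problems("int x;  \nint y;\n\n"))
```

The second call reports both the extra blank line at the end of the text and the trailing spaces on line 1.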

C++ Specific

Python Specific

N/A — This PR does not modify Python files.

Testing

pytest was run locally on every supported platform that this PR can affect, and the results are recorded in the "Test Results" table above (CONTRIBUTING.md §Pull Requests).
N/A — Every supported platform was tested.
New functionality is covered by the generated PyTorch operator test harness from PR #619 (feat(torch): expose optional codegen parameters) and by the all-platform full pytest runs recorded above.
N/A — This PR does not add Python tests.
N/A — This PR does not add flaky parallel-only tests.
N/A — This is not a bug-fix-only PR.

Build, CI, and Tooling

The project builds cleanly from a fresh directory on every supported platform listed above.
compile_commands.json still regenerates through the existing CMake/scikit-build configuration path.
N/A — No new backend / device was added.
Only one CUDA-like GPU backend is selectable at a time — the existing mutual-exclusion check in CMakeLists.txt is not broken.
Both CI workflows (clang-format.yml, ruff.yml) are expected to remain green; this PR only changes generated C++ headers under src/base/.
N/A — No new runtime dependency was added.

Documentation

N/A — No user workflow, build flag, or developer workflow documentation changed.
New generated operator bases are documented through their checked-in header signatures.
N/A — No user-visible breaking change is introduced.

Security and Safety

No secrets, access tokens, internal URLs, customer data, IP addresses, or personal hardware identifiers have been committed or included in this PR description.
N/A — No third-party code was added.
No unsafe pointer arithmetic, uninitialized reads, or missing bounds checks were introduced.

voltjia force-pushed the feat/torch-codegen-optional-overloads branch from 9444f9c to 9864ff2 on May 27, 2026 19:51

voltjia force-pushed the feat/torch-operator-bases branch from 33e537c to ffc3d68 on May 27, 2026 19:54

voltjia force-pushed the feat/torch-codegen-optional-overloads branch from 9864ff2 to c0db647 on May 27, 2026 20:27

voltjia force-pushed the feat/torch-operator-bases branch from ffc3d68 to d89ce8e on May 27, 2026 20:28

voltjia force-pushed the feat/torch-codegen-optional-overloads branch from c0db647 to 3e3e319 on May 27, 2026 21:15

voltjia force-pushed the feat/torch-operator-bases branch 2 times, most recently from fe50963 to c5a3a38, on May 27, 2026 21:51

voltjia mentioned this pull request on May 27, 2026 in feat(torch): expose optional codegen parameters #619 (merged)

voltjia force-pushed the feat/torch-operator-bases branch from c5a3a38 to 312cd42 on May 27, 2026 22:25

voltjia force-pushed the feat/torch-codegen-optional-overloads branch from 3e3e319 to 2a5d6af on May 27, 2026 23:33

voltjia force-pushed the feat/torch-operator-bases branch 2 times, most recently from 34db70e to f5f6a15, on May 28, 2026 03:39

voltjia force-pushed the feat/torch-codegen-optional-overloads branch from d41f01d to 9f591db on May 28, 2026 03:55

voltjia force-pushed the feat/torch-operator-bases branch from f5f6a15 to ee42c3c on May 28, 2026 03:56

voltjia force-pushed the feat/torch-codegen-optional-overloads branch from 9f591db to 70094a1 on May 28, 2026 07:41

voltjia force-pushed the feat/torch-operator-bases branch from ee42c3c to 9299ffb on May 28, 2026 07:44

voltjia force-pushed the feat/torch-codegen-optional-overloads branch from 70094a1 to 87e86ab on May 28, 2026 08:02

voltjia force-pushed the feat/torch-operator-bases branch from 9299ffb to 1c61728 on May 28, 2026 08:04

voltjia force-pushed the feat/torch-codegen-optional-overloads branch from 87e86ab to e62e2b2 on June 1, 2026 07:55

voltjia force-pushed the feat/torch-operator-bases branch from 1c61728 to 63a85dc on June 1, 2026 07:55

voltjia force-pushed the feat/torch-codegen-optional-overloads branch from e62e2b2 to 48a3f2c on June 1, 2026 08:17

voltjia force-pushed the feat/torch-operator-bases branch from 63a85dc to 846a477 on June 1, 2026 08:17

voltjia force-pushed the feat/torch-codegen-optional-overloads branch from 48a3f2c to 5582e8a on June 1, 2026 10:53

voltjia force-pushed the feat/torch-operator-bases branch from 846a477 to 19cb477 on June 1, 2026 11:02

voltjia force-pushed the feat/torch-codegen-optional-overloads branch from 5582e8a to 4e3cd58 on June 2, 2026 08:22

voltjia force-pushed the feat/torch-operator-bases branch 2 times, most recently from 4b857e1 to d343d02, on June 2, 2026 09:21

Commit e0e57a9: feat(torch): add generated operator bases

voltjia force-pushed the feat/torch-operator-bases branch from d343d02 to e0e57a9 on June 2, 2026 14:01

voltjia changed the base branch from feat/torch-codegen-optional-overloads to master on June 2, 2026 14:03

voltjia requested a review from a team on June 2, 2026 14:03

voltjia wants to merge 1 commit into master from feat/torch-operator-bases
