Skip to content
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
44 changes: 42 additions & 2 deletions build_tools/_therock_utils/py_packaging.py
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,7 @@

"""Utilities for producing Python packages."""

from typing import Callable, Sequence
from typing import Callable, Mapping, Sequence

import importlib.util
import re
Expand Down Expand Up @@ -486,9 +486,19 @@ def populate_devel_files(
*,
addl_artifact_names: Sequence[str] = (),
exclude_components: Sequence[str] = (),
component_include_overrides: Mapping[str, Sequence[str]] = {},
tarball_compression: bool = True,
Comment on lines 488 to 490
Copy link

Copilot AI Apr 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

component_include_overrides uses a mutable default ({}) in the function signature. Even if not mutated today, this is a Python footgun and can lead to surprising cross-call behavior if anything ever writes to it. Prefer defaulting to None and normalizing to an empty dict inside the function body.

Copilot uses AI. Check for mistakes.
):
"""Populates all files that have not yet been materialized and symlink the rest."""
"""Populates all files that have not yet been materialized and symlink the rest.

``component_include_overrides`` re-admits narrow file patterns from
components listed in ``exclude_components``. Keys are component names;
values are glob patterns matched against relpaths. Used when an
excluded component is the canonical home for files that are still
needed in devel (e.g. the BLAS fortran interop shims live in ``test``
because the test binary links against them at runtime, yet devel
consumers also need them resolved through exported CMake targets).
"""
package_path = self.platform_dir
# Set up for what artifacts to include in the devel package, including
# any emitted runtime artifacts plus additional requested.
Expand Down Expand Up @@ -521,6 +531,36 @@ def _devel_artifact_filter(an: ArtifactName) -> bool:
dir_entry,
)

for component, include_globs in component_include_overrides.items():
if component not in excluded:
continue

def _override_filter(an: ArtifactName, _c=component) -> bool:
if an.name not in devel_artifact_names:
return False
if an.component != _c:
return False
if (
an.target_family != "generic"
and an.target_family != self.target_family
):
return False
return True

override_artifacts = self.params.filter_artifacts(
_override_filter, includes=tuple(include_globs)
)
log(f"::: Re-admitting {component} files matching {list(include_globs)}")
for an, an_path in override_artifacts.artifact_basedirs:
log(f" + {an}: {an_path}")
for relpath, dir_entry in override_artifacts.pm.matches():
dest_path = package_path / relpath
self._populate_devel_file(
relpath,
dest_path,
dir_entry,
)
Comment on lines +550 to +562
Copy link

Copilot AI Apr 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The override pass filters matches via includes=..., which usually means directory entries (e.g. lib/) will not be yielded. _populate_devel_file() assumes parent directories already exist for some branches (notably the soname-alias and symlink transcription paths), so this override flow can fail when the matched file is a symlink/alias under a directory that wasn't created by the main pass. Consider ensuring parent dirs exist before calling _populate_devel_file for override matches (or make _populate_devel_file create parents in all branches), and also consider skipping directory entries or tolerating pre-existing directories to avoid exist_ok=False failures if an override glob ever matches directories.

Copilot uses AI. Check for mistakes.
Comment on lines +534 to +562
Copy link

Copilot AI Apr 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New behavior (component_include_overrides) materially changes what lands in the devel package and is easy to regress. There are existing Python packaging integration tests in build_tools/tests/py_packaging_test.py, but none exercise devel population or the override path. Please add a test that builds a minimal artifact tree with an excluded component (e.g. test) containing both an allowed file and a disallowed file, runs populate_devel_files(..., exclude_components=[...], component_include_overrides={...}), and asserts only the allowed file is present in the devel output.

Copilot uses AI. Check for mistakes.

# For packaging, the devel platform/ contents are not wheel safe, so we
# store them into their own tarball and dynamically decompress at runtime.
# The tarball will contain as its first path component the top level
Expand Down
9 changes: 9 additions & 0 deletions build_tools/build_python_packages.py
Original file line number Diff line number Diff line change
Expand Up @@ -149,6 +149,15 @@ def _run_kpack_split(
"nlohmann-json",
],
exclude_components=["test"],
# hipSOLVER's fortran interop shim lives in the `test` component
# alongside `hipsolver-test`, which links against it at runtime. Devel
# consumers also need it: `hipsolver-targets.cmake` exports
# `roc::hipsolver_fortran` with an IMPORTED_LOCATION pointing at the
# shim, and CMake's generated import-check fails the configure step
# of downstream projects (e.g. PyTorch) when the file is absent.
component_include_overrides={
"test": ["lib/libhipsolver_fortran.so*"],
},
Comment on lines +152 to +160
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is the right approach. If this library is used by downstream projects (thus outside of tests), it does not belong in a test artifact.

Why are we only spotting this with multi-arch?

  • Devel package excludes test component in kpack-split mode since
    generic test binaries cannot run without device code

Okay... but are there other issues, or just this one?

Moving ther shim into the the lib artifact is not a viable solution as this would introduce a runtime dependency on libgfortran for every consumer. Moving to dev would break tests.

I'm not following the reasoning here. What's the difference between including in lib and what this code does with an include override?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think when we packages it only was used for testing, as there have been no PyTorch build done by us. Idk when / why PyTorch picks this up and it definitely needs a rework.

Devel package excludes test component in kpack-split mode since
generic test binaries cannot run without device code

Okay... but are there other issues, or just this one?

Actually, idk and would need to go deeper down the rabbit hole here.

Moving ther shim into the the lib artifact is not a viable solution as this would introduce a runtime >>dependency on libgfortran for every consumer. Moving to dev would break tests.

I'm not following the reasoning here. What's the difference between including in lib and what this code >does with an include override?

If you add it to lib, rocm-sdk test will fail if you haven't gfortran installed. This broke several times in the past, see #1418 and #1877.

I agree that this is not a viable long term fix but it mimics what is done in the legacy builds. Those actually include the lib in rocm-sdk-devel. I can file an issue to kick of some further analysis and to root cause why we have the CMake entry point and why PyTorch picks it up.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did some more digging, @stellaraccident can you help us figure out who / which team to route this to and make a decision on how to unbreak the pytorch build? I'd rather not introduce a new type of artifact filtering for python packaging.

cc a few people from ROCm/hipSOLVER#312 : @cgmb @tfalders @EdDAzevedo

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another way to mitigate this and an alternative to this PR is ROCm/rocm-libraries#6530.

tarball_compression=args.devel_tarball_compression,
)
if args.build_packages:
Expand Down
Loading