[WIP][EP] Make uccl.ep a self-contained Python subpackage (.so + Python code)#888
zhenhuang12 wants to merge 19 commits into main
There seems to be some conflict, cc @zhenhuang12
    else:
        try:
            path.unlink()
        except FileNotFoundError:
…hon code
Previously the uccl wheel only shipped the ep*.so native extension, while the
Python-level functionality (Buffer, EventOverlap, initialize_uccl, etc.) lived
in ep/deep_ep_wrapper and ep/bench/, outside the uccl package. Users had to
install deep_ep_wrapper separately to get the Python API.
This commit restructures ep into a proper uccl.ep subpackage:
1. Rename C++ NB_MODULE from 'ep' to '_ep_native' so it becomes
uccl.ep._ep_native (avoids shadowing the Python package).
2. Create uccl/ep/ with:
- __init__.py: re-exports all native symbols + Python wrappers
- buffer.py: the full-featured Buffer class (from ep/bench/buffer.py)
- utils.py: EventOverlap, initialize_uccl, destroy_uccl, etc.
3. Update build system:
- ep/setup.py: builds _ep_native, installs to site-packages/uccl/ep/
- ep/Makefile: same naming/path changes
- build_inner.sh: copies .so to uccl/ep/ instead of uccl/
- Root setup.py: declares uccl.ep package_data
- MANIFEST.in: includes uccl/ep/ files
4. Update deep_ep_wrapper: now purely re-exports from uccl.ep (no own logic).
5. Update ep/bench/ test scripts to use new import paths.
After this change, 'pip install uccl' provides the complete EP API:
from uccl.ep import Buffer, Config, EventHandle, initialize_uccl
No separate deep_ep_wrapper installation is needed.
Co-authored-by: zhenhuang12 <zhenhuang12@users.noreply.github.com>
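A quick way to sanity-check the claim that a single install exposes the full EP API is to import the package and look for the names listed in the commit message. The sketch below is only an illustration: `missing_names` and `check_installed` are hypothetical helpers, not part of uccl, and the demo at the bottom uses a stand-in module so it runs without uccl or a GPU.

```python
import importlib
from types import ModuleType
from typing import Iterable, List

# Public names taken from the commit message above.
EXPECTED_NAMES = ("Buffer", "Config", "EventHandle", "initialize_uccl")


def missing_names(module: ModuleType, names: Iterable[str]) -> List[str]:
    """Return the names that `module` does not expose as attributes."""
    return [name for name in names if not hasattr(module, name)]


def check_installed(module_name: str = "uccl.ep") -> List[str]:
    """Import the real package (requires an installed wheel) and check it."""
    module = importlib.import_module(module_name)
    return missing_names(module, EXPECTED_NAMES)


# Demo with a stand-in module that exposes only two of the expected names.
fake_ep = ModuleType("fake_ep")
fake_ep.Buffer = object
fake_ep.Config = object
demo_missing = missing_names(fake_ep, EXPECTED_NAMES)
```

On a machine with the wheel installed, `check_installed()` would be expected to return an empty list if the packaging works as described.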
ep/bench/buffer.py and ep/bench/utils.py were full copies of the
implementations now canonical in uccl/ep/buffer.py and uccl/ep/utils.py.
Having two copies means maintaining the same code in two places.
Replace them with thin shim modules that re-export everything from
uccl.ep.{buffer,utils}. The bench test scripts (test_low_latency.py,
test_intranode.py, etc.) continue to do 'from buffer import Buffer'
and 'from utils import ...' which resolves through the shims.
This establishes uccl/ep/ as the single source of truth.
Co-authored-by: zhenhuang12 <zhenhuang12@users.noreply.github.com>
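The shim pattern described in this commit can be sketched as follows. This is a toy reproduction under stated assumptions, not the actual uccl sources: `demo_pkg.ep.buffer` stands in for `uccl.ep.buffer`, and a temporary `bench/buffer.py` stands in for `ep/bench/buffer.py`. It shows that a bench-style `from buffer import Buffer` resolves to the very same class object as the canonical module.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def build_demo_tree(root: Path) -> None:
    """Write a toy canonical package plus a bench-style shim (stand-in names)."""
    pkg = root / "demo_pkg" / "ep"
    pkg.mkdir(parents=True)
    (root / "demo_pkg" / "__init__.py").write_text("")
    (pkg / "__init__.py").write_text("")
    # Canonical implementation: the single source of truth.
    (pkg / "buffer.py").write_text("class Buffer:\n    origin = 'canonical'\n")
    # Thin shim: contains no logic of its own, only a re-export.
    bench = root / "bench"
    bench.mkdir()
    (bench / "buffer.py").write_text("from demo_pkg.ep.buffer import Buffer\n")


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    build_demo_tree(root)
    # Mimic running a bench script: both the package root and bench/ are importable.
    added = [str(root), str(root / "bench")]
    sys.path[:0] = added
    importlib.invalidate_caches()
    try:
        shim = importlib.import_module("buffer")  # what 'from buffer import Buffer' loads
        canonical = importlib.import_module("demo_pkg.ep.buffer")
        same_class = shim.Buffer is canonical.Buffer
        shim_origin = shim.Buffer.origin
    finally:
        # Restore sys.path so the demo leaves no import-path side effects.
        for entry in added:
            sys.path.remove(entry)
```

Because the shim only re-exports, a fix made in the canonical module is visible through the shim without touching the bench directory.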
…of truth)
The previous approach duplicated buffer.py and utils.py in both
uccl/ep/ and ep/bench/. This commit eliminates the duplication by
establishing a clear convention:
- Source of truth: ep/python/uccl_ep/{__init__,buffer,utils}.py
(lives alongside the ep module's C++/CUDA sources)
- Installed as: uccl.ep (via setup.py package_dir mapping)
How it works:
1. Root setup.py uses package_dir={'uccl.ep': 'ep/python/uccl_ep'}
so setuptools reads .py files directly from the ep module tree.
2. build_inner.sh copies the compiled _ep_native*.so into
ep/python/uccl_ep/ before running 'python -m build', so both
.py and .so end up in the wheel.
3. ep/setup.py CustomInstall copies both .so and .py files to
site-packages/uccl/ep/ for direct-install workflows.
4. ep/Makefile install target does the same for make-based workflows.
5. ep/bench/{buffer,utils}.py remain thin shims that re-export
from uccl.ep, so bench scripts work unchanged.
The uccl/ep/ directory is no longer checked into the repository;
it only exists in the installed site-packages.
Co-authored-by: zhenhuang12 <zhenhuang12@users.noreply.github.com>
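To make step 1 concrete, here is a simplified sketch of how a `package_dir` mapping is resolved to a source directory. Only the mapping `{'uccl.ep': 'ep/python/uccl_ep'}` comes from the commit message; `resolve_package_dir` is a hypothetical helper written to imitate the lookup setuptools performs (longest matching package prefix first, then the root entry), not setuptools' own code.

```python
from typing import Dict

# Mapping quoted in the commit message above.
PACKAGE_DIR = {"uccl.ep": "ep/python/uccl_ep"}


def resolve_package_dir(package: str, package_dir: Dict[str, str]) -> str:
    """Return the source directory for a dotted package name.

    Tries the full name first, then progressively shorter prefixes,
    appending the unmatched tail as subdirectories. Falls back to the
    optional root entry (key "") or the plain package path.
    """
    parts = package.split(".")
    tail = []
    while parts:
        key = ".".join(parts)
        if key in package_dir:
            return "/".join([package_dir[key], *tail])
        tail.insert(0, parts.pop())
    root = package_dir.get("")
    return "/".join(([root] if root else []) + tail)


ep_src = resolve_package_dir("uccl.ep", PACKAGE_DIR)
uccl_src = resolve_package_dir("uccl", PACKAGE_DIR)
```

With this mapping, `uccl.ep` is read from the ep module tree while `uccl` itself still resolves to its default location, which is why the compiled .so has to be copied into `ep/python/uccl_ep/` before the wheel is built.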
_ep_native -> ep_cpp
@YangZhou1997 Thanks! I'm continuing to support this feature. The build changes are fairly large and I'd appreciate a review. I've moved the compilation of … cc @MaoZiming
@zhenhuang12 is it possible to avoid using a Makefile, which is usually less maintainable than a shell script like build_inner.sh?
@YangZhou1997 I think we could replace the Makefile with a … Do you think using build.sh instead of the Makefile is a better approach? I'd like to hear your opinion.
@YangZhou1997 If you think the current PR change is too large, I can revert it and only keep the "ep python self-contained" part. Support for …
Hi @zhenhuang12, the PR size is okay for me. I think it would be better if we could use a ShellExtension in setup.py, so that we can keep all building functions in the existing …
Description
Turns `uccl.ep` into a self-contained Python subpackage: the native extension (`ep_cpp.abi3.so`, renamed from the old top-level `ep.abi3.so` so it no longer collides with the `uccl.ep` package name) now lives alongside the Python helpers (`Buffer`, `EventOverlap`, `initialize_uccl`, `destroy_uccl`, …) under `uccl/ep/`. The Python source of truth moves to `ep/python/uccl_ep/` and is exposed via `package_dir={"uccl.ep": "ep/python/uccl_ep"}`. `ep/bench/{buffer,utils}.py` and `ep/deep_ep_wrapper/deep_ep/` are reduced to thin re-export shims over `uccl.ep`, so existing bench scripts and `deep_ep` imports keep working. After this change `pip install uccl` is enough to `import uccl.ep` and use the full Python API.
Important
`deep_ep_wrapper` no longer needs to be reinstalled when the `uccl.ep` native library changes.
Previously `deep_ep_wrapper` shipped its own full copy of `buffer.py` / `utils.py` that called into the native extension directly, so any change to the `uccl.ep` dynamic-library API (symbol rename, signature change, new helper, …) required rebuilding and reinstalling `deep_ep_wrapper` to stay in sync. With this PR, `deep_ep_wrapper` contains no implementation of its own; it simply re-exports from `uccl.ep` at import time. Upgrading `uccl` alone is sufficient; the existing `deep_ep_wrapper` install picks up the new API automatically as long as the public names it re-exports remain stable.
Not a behavior change
Only packaging / module layout is changing. The native C++/CUDA code, the Python `Buffer` class, `initialize_uccl` / `destroy_uccl`, benchmark scripts, and CI workflows all keep the same public semantics.
Fixes # (issue)
Type of Change
How Has This Been Tested?
Checklist
- `format.sh` to follow the style guidelines.
- `build.sh` to verify compilation.