Changes from 84 commits
Commits
188 commits
0cd109a
Add more cuda function to load
Feb 28, 2026
bbe25ab
Add _NBL_COMPILE_WITH_CUDA_ compile definition on CMakeLists.txt
Feb 28, 2026
d74349e
Move CCudaHandler constructor to cpp and query device info and attrib…
Feb 28, 2026
38ed6db
Add missing CFileView.h header in CCudaHandler.cpp
Feb 28, 2026
95338cd
Fix indentation of CCudaHandler.cpp
Feb 28, 2026
3e9dfd2
Add NBL_API2 to CCudaHandler
Feb 28, 2026
1ae7747
Fix fetching deviceUUID logic
Feb 28, 2026
a3150dc
Fix usage of CFileView
Feb 28, 2026
5018be7
Fix use after move of ptx cpuBuffer
Feb 28, 2026
5251b4d
Improve cpuBuffer initialization using params instead of aggregrate i…
Feb 28, 2026
d655b19
Fix indentation of CCudaHandler.cpp into tabs
Feb 28, 2026
454710b
Iterate m_availableDevices when creatingDevice
Feb 28, 2026
4645bc4
Implement context creation in CCUDADevice
Feb 28, 2026
3172ae7
Implement physical device getExternalMemoryProperties
Mar 12, 2026
f9b8b4f
Dedicated buffer and image
Mar 12, 2026
a2357e2
External Memory Feature flags should not be enum class
Mar 13, 2026
0d9c3d8
External Vulkan Buffer Creation
Mar 13, 2026
89f5ae5
Temporary enable compile with cuda flag
Mar 13, 2026
152830f
Update examples_tests submodule to vk_cuda interop demo branch
Mar 14, 2026
ea3b49b
External memory allocation
Mar 16, 2026
77b92ab
Fix indentation on CAssetConverter.cpp
Mar 16, 2026
68f740f
Update jitify submodule
Mar 16, 2026
1c93a91
External memory allocation cleanup
Mar 16, 2026
ae0e177
Implement proper CCUDADevice destructor.
Mar 16, 2026
c83942a
Implementation of Shared memory between vulkan and cuda
Mar 17, 2026
2e45702
Add NBL_API2 modifier to CCUDADevice
Mar 17, 2026
741252f
Implementation of Shared semaphore between Vulkan and CUDA
Mar 17, 2026
fe75ce0
Update to CUDA Toolkit version 13.0+
Mar 24, 2026
78fc0df
Fix external semaphore
Mar 24, 2026
5d19c5b
External image implementation
Mar 24, 2026
f23b30c
Remove unnecessary inline modifier
Mar 24, 2026
e50c85e
Remove unused code in CCUDADevice
Mar 25, 2026
a9c2d85
Fix importSemaphore for unix
Mar 25, 2026
e7ff325
Remove searching for old nvrtc version
Mar 25, 2026
c244b77
Fix filling dstQueueFamilyIndex
Mar 25, 2026
d24acf9
Update cuda toolkit requirement in cmake
Mar 25, 2026
ff82800
Improve external semaphore handle management
kevyuu Apr 15, 2026
7b48605
Improve win32HandleMetadata parameter so it is more readable
kevyuu Apr 15, 2026
24ba36e
Refactor CCUDASharedMemory to use ExternalHandleType
kevyuu Apr 16, 2026
5b4fc27
Refactor ExternalHandleType
kevyuu Apr 17, 2026
fb66f3a
Small fix to use CloseExternalHandle
kevyuu Apr 17, 2026
47ba7e4
Remove CCUDASharedMemory::exportAsImage
kevyuu Apr 22, 2026
d15d00c
Remove unused CCUDASharedMemory::exportAsBuffer
kevyuu Apr 22, 2026
ea36189
Refactor external memory allocation to store the external handle sepa…
kevyuu Apr 22, 2026
f04dcdb
Remove unused constructor parameter in CCUDASharedSemaphore
kevyuu Apr 22, 2026
cea9d9e
Implement CCUDAImportedMemory
kevyuu Apr 22, 2026
3ea3e9d
Rename CCUDASharedSemaphore into CCUDAImportedSemaphore
kevyuu Apr 22, 2026
130cd1e
Rename CCUDASharedMemory into CCUDAExportableMemory
kevyuu Apr 22, 2026
c624053
Remove unused member in CCUDAExportableMemory
kevyuu Apr 22, 2026
9127faa
Slight rename to CCUDADevice method
kevyuu Apr 22, 2026
059d1d5
Merge with master
kevyuu Apr 22, 2026
ff5a9cd
Merge branch 'master' into vk_cuda_interop
kevyuu Apr 22, 2026
2eb8fee
Add option for _NBL_COMPILE_WITH_CUDA_
kevyuu Apr 22, 2026
6605beb
Revert to correct state before merging with master
kevyuu Apr 23, 2026
af35f4f
Revert "Add option for _NBL_COMPILE_WITH_CUDA_"
kevyuu Apr 23, 2026
2479fb2
Slight fix
kevyuu Apr 23, 2026
f297cc2
Slight fix on linux handle
kevyuu Apr 23, 2026
3df125b
Fix typo
kevyuu Apr 23, 2026
2e2ca3f
Fix CCUDAImportedSemaphore constructor
kevyuu Apr 23, 2026
8c4c91e
Remove unused CCUDASharedSemaphore.cpp
kevyuu Apr 23, 2026
fcec268
Fix handle type for Linux
kevyuu Apr 23, 2026
ac18781
Add missing external handle type and make the constant consistent
kevyuu Apr 23, 2026
d73c851
Slight fix
kevyuu Apr 23, 2026
2c75ed8
Fix indentation and refactor to be more idiomatic
kevyuu Apr 23, 2026
3e905e9
Add some comment
kevyuu Apr 23, 2026
963a3d6
Fix typo
kevyuu Apr 23, 2026
0de37b0
Slight improvement
kevyuu Apr 23, 2026
d50d709
Remove unused variable
kevyuu Apr 23, 2026
763d173
Add include WIN32 include guard
kevyuu Apr 23, 2026
d71e52d
Remove unused class
kevyuu Apr 23, 2026
cfad816
Refactor CCUDADevice api to be more consistent with vulkan device api
kevyuu Apr 23, 2026
b22168e
Refactor constructor parameter naming
kevyuu Apr 23, 2026
5bd64ae
Idiomatic way to create core::smart_refctd_ptr
kevyuu Apr 23, 2026
bd0f8a2
Fix destruction and remove unnecessary SCUDACleaner
kevyuu Apr 23, 2026
6d47b90
CCUDAHandler construction more idiomatic
kevyuu Apr 24, 2026
0257d9a
Refactor magic number
kevyuu Apr 24, 2026
0999994
Remove releasing allocationHandle in destructor, since we already cal…
kevyuu Apr 24, 2026
6f4b889
Input validation and error logging
kevyuu Apr 24, 2026
129ceac
Revert 6605bebf changes in tgmath impl.hlsl
kevyuu Apr 30, 2026
2c08464
Fix indentation in IDeviceMemoryAllocator.h
kevyuu Apr 30, 2026
e993757
Turn off NBL_COMPILE_WITH_CUDA by default
kevyuu Apr 30, 2026
dcf0552
Move CCUDAHandler constructor from protected to public
kevyuu Apr 30, 2026
f6bf989
Fix crash due to dangling win32metadata
kevyuu Apr 30, 2026
0d237c0
Implement vk flag for HOST_NUMA and HOST_NUMA_CURRENT
kevyuu May 4, 2026
f4ce3dc
Move CUDA interop behind extension target
AnastaZIuk May 6, 2026
78845ae
Address CUDA interop review cleanup
AnastaZIuk May 6, 2026
ab9a7e5
Simplify CUDA interop smoke CMake
AnastaZIuk May 6, 2026
bf8eeb3
Clean CUDA interop smoke usage requirements
AnastaZIuk May 6, 2026
f701ac6
Export CUDA interop package target
AnastaZIuk May 6, 2026
a520d57
Use CUDAToolkit package targets
AnastaZIuk May 6, 2026
4bddc57
Require CUDA version via CMake
AnastaZIuk May 6, 2026
6f68e66
Split CUDA interop native surface
AnastaZIuk May 6, 2026
49bcb2c
Add native CUDA accessor overloads
AnastaZIuk May 6, 2026
d85657e
Document CUDA interop target split
AnastaZIuk May 6, 2026
6e8c4f9
Trim CUDA interop README wording
AnastaZIuk May 6, 2026
881e9b8
Move CUDA interop into Nabla
AnastaZIuk May 6, 2026
5dd1134
Document CUDA interop accessor model
AnastaZIuk May 7, 2026
e514df7
Inline CUDA interop stubs
AnastaZIuk May 7, 2026
e53c838
Refine CUDA interop boundary
AnastaZIuk May 7, 2026
1417905
Add CUDA interop runtime header discovery
AnastaZIuk May 7, 2026
045432e
Tighten CUDA interop native helpers
AnastaZIuk May 7, 2026
8a119dd
Hide CUDA interop native state construction
AnastaZIuk May 7, 2026
e018545
Clean up CUDA runtime header discovery
AnastaZIuk May 7, 2026
c6ef6ee
Move CUDA interop API back into video
AnastaZIuk May 7, 2026
d559a2c
Move smart pointer helpers into core
AnastaZIuk May 7, 2026
38705b9
Use CUDA interop accessors
AnastaZIuk May 7, 2026
23e6ef5
Use explicit CUDA compile log
AnastaZIuk May 7, 2026
a640183
Trim CUDA interop API surface
AnastaZIuk May 7, 2026
5bf0e2d
Keep CUDA SDK layouts private
AnastaZIuk May 7, 2026
d745421
Simplify CUDA interop helper
AnastaZIuk May 7, 2026
ffba3d4
Update CUDA interop examples pointer
AnastaZIuk May 7, 2026
745f1b9
Use opaque CUDA interop boundary
AnastaZIuk May 8, 2026
ec259cb
Clean CUDA interop boundary
AnastaZIuk May 9, 2026
fce838b
Polish CUDA interop cleanup
AnastaZIuk May 9, 2026
9f2d5fe
Simplify CUDA interop native boundary
AnastaZIuk May 9, 2026
ed8a1d6
Refine CUDA interop boundary
AnastaZIuk May 10, 2026
f2f62ce
Polish CUDA interop review feedback
AnastaZIuk May 10, 2026
9c504a1
Polish CUDA interop native header
AnastaZIuk May 10, 2026
0df7507
Use opaque CUDA interop handles
AnastaZIuk May 10, 2026
21d3b7c
Accept CUDA handler pointers in assert helper
AnastaZIuk May 10, 2026
dfca17e
Consolidate CUDA native handle declarations
AnastaZIuk May 10, 2026
525315e
Tighten CUDA native output bridges
AnastaZIuk May 10, 2026
d8d4c3b
Centralize CUDA output bridge
AnastaZIuk May 11, 2026
fe3fd66
Document CUDA interop handles
AnastaZIuk May 11, 2026
d5dfade
Make CUDA PTX compile log optional
AnastaZIuk May 11, 2026
2d53e9a
Enable CUDA in Windows CI
AnastaZIuk May 11, 2026
0243ed0
Fix CUDA cache path in CI
AnastaZIuk May 11, 2026
4ea20f7
Seed CUDA cache on Windows 2025
AnastaZIuk May 11, 2026
85fbf7f
Use Choco for CUDA cache seed
AnastaZIuk May 11, 2026
6008285
Update CUDA interop examples pointer
AnastaZIuk May 11, 2026
82d82a2
Update CUDA interop examples pointer
AnastaZIuk May 11, 2026
828211c
Retry CI image pull
AnastaZIuk May 11, 2026
e913518
Deduplicate CUDA CI setup
AnastaZIuk May 12, 2026
f74efe8
Simplify CUDA CI cache handling
AnastaZIuk May 12, 2026
920f2ef
Keep CUDA CI paths configurable
AnastaZIuk May 12, 2026
2744182
Temporarily add include path
kevyuu May 12, 2026
5b336ca
Merge branch 'master' into vk_cuda_interop
kevyuu May 12, 2026
29700f9
Merge branch 'cuInteropBS' into vk_cuda_interop
kevyuu May 12, 2026
f18a7c6
Add final keyword whenever appropriate
kevyuu May 12, 2026
5ed47a8
Add documentation for CCUDAExportableMemory::exportAsMemory
kevyuu May 12, 2026
6af5ab8
Remove external_handle from memory type iterator
kevyuu May 12, 2026
874177d
Rename externalHandle to either exportHandle or importHandle
kevyuu May 12, 2026
b3b3c77
Remove redundant comment
kevyuu May 12, 2026
85fcc1a
Put allocate arguments into SAllocateParams
kevyuu May 12, 2026
1911eb0
Move external_handle_t to its own file and to system namespace
kevyuu May 12, 2026
30e8e3f
Make some enum flag more compact
kevyuu May 13, 2026
d872e3a
Remove unnecessary friendship
kevyuu May 13, 2026
fe2d650
Remove const specifier on SCreationParams member
kevyuu May 13, 2026
8ed5f77
Log failure when closing externalHandle
kevyuu May 13, 2026
0fdaa3c
Create new api for createSemaphore
kevyuu May 13, 2026
4e1a697
Slight refactor
kevyuu May 13, 2026
f7de243
Add more ExternalHandleType
kevyuu May 13, 2026
3932dbe
Add more architecture flag
kevyuu May 14, 2026
02dea2d
Add more virtual architecture option
kevyuu May 14, 2026
395959b
Initial implementation of cached buffer and image properties
kevyuu May 14, 2026
3917859
Revert E_EXTERNAL_HANDLE_TYPE back to uint32_t
kevyuu May 15, 2026
3bff2f7
Merge createSemaphore implementation
kevyuu May 15, 2026
f960afe
Cached physicalDevice in CCUDADevice.cpp
kevyuu May 15, 2026
03b5a30
Add const specifier for SCreationParams
kevyuu May 15, 2026
e7f2c93
Use 8bit literal for 8bit enum
kevyuu May 15, 2026
2bfb534
Add static assertion for SExternalMemoryProperties size
kevyuu May 15, 2026
051cb01
Remove assert in destructor
kevyuu May 15, 2026
6f595ba
Move destructors to private
kevyuu May 15, 2026
3d2b603
Assign AllocatioNHandleType as constexpr
kevyuu May 15, 2026
23a6673
Slight rename for ExternalMemoryHandleType
kevyuu May 15, 2026
d60c2aa
Reorder SImageFormatInfo's member and assert its size
kevyuu May 15, 2026
5afbf9e
Update nvcc compile flags to c++20
kevyuu May 15, 2026
5de3c41
Add validation regarding external handle type when creating semaphore
kevyuu May 15, 2026
19152c1
Refine validation for external createSemaphore
kevyuu May 15, 2026
6f7ae49
Fix bug in createSemaphore regarding externalHandleType validation
kevyuu May 15, 2026
77c0824
Add validation for allocation of external memory. Revert some changes
kevyuu May 16, 2026
b08a019
Add validation for IGPUImage::SCreationParams
kevyuu May 16, 2026
9ff9a95
Add validation for IGPUImage::SCreationParams
kevyuu May 16, 2026
3ace53b
Remove unnecessary include
kevyuu May 16, 2026
4c17cdb
Remove unnecessary compile options for nvcc
kevyuu May 16, 2026
4401405
Merge branch 'vk_cuda_interop' of https://github.com/Devsh-Graphics-P…
kevyuu May 16, 2026
6c758c8
Resolve merge conflict
kevyuu May 16, 2026
f8dbc40
Assert false to all stub constructors
kevyuu May 18, 2026
428c09a
Add allocateFlags parameter to exportAsMemory
kevyuu May 18, 2026
462351f
Merge branch 'master' into vk_cuda_interop
kevyuu May 21, 2026
fe21b9a
Fix passing parameter when calling allocate
kevyuu May 22, 2026
63f1907
Add isValidExternalHandleType to each shareable resource class
kevyuu May 22, 2026
2634e75
Add size and offset parameter to CCUDAImportedMemory::getMappedBuffer
kevyuu May 22, 2026
8eb1495
Implement conversion of cpu buffer and image that can be imported by …
kevyuu May 22, 2026
5bba75d
Implement bitCount for core::bitflag
kevyuu May 22, 2026
687376f
Implement multiple external handles for shared semaphore
kevyuu May 22, 2026
c185122
Implement multiple external handle for each mem allocation
kevyuu May 23, 2026
5140daa
Improve validation in importExternalSemaphore
kevyuu May 23, 2026
2 changes: 1 addition & 1 deletion 3rdparty/jitify
Submodule jitify updated 5 files
+10 −5 Makefile
+137 −65 jitify.hpp
+72 −0 jitify_test.cu
+586 −0 nvrtc_cli.cpp
+58 −0 nvrtc_cli_test.sh
6 changes: 3 additions & 3 deletions CMakeLists.txt
@@ -74,10 +74,10 @@ option(NBL_COMPILE_WITH_CUDA "Compile with CUDA interop?" OFF)

 if(NBL_COMPILE_WITH_CUDA)
 find_package(CUDAToolkit REQUIRED)
-if(${CUDAToolkit_VERSION} VERSION_GREATER "9.0")
-message(STATUS "CUDA version 9.0+ found!")
+if(${CUDAToolkit_VERSION} VERSION_GREATER_EQUAL "13.0")
+message(STATUS "CUDA version ${CUDAToolkit_VERSION} found!")
 else()
-message(FATAL_ERROR "CUDA version 9.0+ needed for C++14 support!")
+message(FATAL_ERROR "CUDA version 13.0+ needed for C++14 support!")
 endif()
 endif()

2 changes: 1 addition & 1 deletion examples_tests
Submodule examples_tests updated 161 files
2 changes: 2 additions & 0 deletions include/nbl/asset/IBuffer.h
@@ -42,6 +42,8 @@ class IBuffer : public IDescriptor, public core::IBuffer
 //! synthetic Nabla inventions
 // whether `IGPUCommandBuffer::updateBuffer` can be used on this buffer
 EUF_INLINE_UPDATE_VIA_CMDBUF = 0x80000000u,
+
+EUF_SYNTHETIC_FLAGS_MASK = EUF_INLINE_UPDATE_VIA_CMDBUF | 0 /* fill out as needed if anymore synthethic flags are added*/
 };

 //!
155 changes: 39 additions & 116 deletions include/nbl/video/CCUDADevice.h
@@ -6,6 +6,9 @@


 #include "nbl/video/IPhysicalDevice.h"
+#include "nbl/video/CCUDAExportableMemory.h"
+#include "nbl/video/CCUDAImportedMemory.h"
+#include "nbl/video/CCUDAImportedSemaphore.h"


 #ifdef _NBL_COMPILE_WITH_CUDA_
@@ -24,9 +27,17 @@ namespace nbl::video
 {
 class CCUDAHandler;

-class CCUDADevice : public core::IReferenceCounted
+class NBL_API2 CCUDADevice : public core::IReferenceCounted
 {
-public:
+public:
+#ifdef _WIN32
+static constexpr IDeviceMemoryAllocation::E_EXTERNAL_HANDLE_TYPE EXTERNAL_MEMORY_HANDLE_TYPE = IDeviceMemoryAllocation::EHT_OPAQUE_WIN32;
+static constexpr CUmemAllocationHandleType ALLOCATION_HANDLE_TYPE = CU_MEM_HANDLE_TYPE_WIN32;
+#else
+static constexpr IDeviceMemoryAllocation::E_EXTERNAL_HANDLE_TYPE EXTERNAL_MEMORY_HANDLE_TYPE = IDeviceMemoryAllocation::EHT_OPAQUE_FD;
+static constexpr CUmemAllocationHandleType ALLOCATION_HANDLE_TYPE = CU_MEM_HANDLE_TYPE_POSIX_FILE_DESCRIPTOR;
+#endif
+
 enum E_VIRTUAL_ARCHITECTURE
 {
 EVA_30,
@@ -63,132 +74,44 @@ class CCUDADevice : public core::IReferenceCounted
};
inline E_VIRTUAL_ARCHITECTURE getVirtualArchitecture() {return m_virtualArchitecture;}

CCUDADevice(core::smart_refctd_ptr<CVulkanConnection>&& vulkanConnection, IPhysicalDevice* const vulkanDevice, const E_VIRTUAL_ARCHITECTURE virtualArchitecture, CUdevice device, core::smart_refctd_ptr<CCUDAHandler>&& handler);

~CCUDADevice();

Member

  1. If a class is reference-counted, its destructor should always be protected, never public.
  2. It makes no sense for the constructor to be public, since the only way to create a CCUDADevice object appears to be through a factory of some sort.

Member

The destructor is still public.

Contributor Author

Done
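
The request in this thread (non-public destructor, construction only through a factory) can be sketched roughly as follows. This is a minimal illustration, not Nabla code: `RefCounted` is a hypothetical stand-in for `core::IReferenceCounted`, and `Device`, `create`, and `ordinal` are invented names.

```cpp
#include <atomic>
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for core::IReferenceCounted (assumption, not the real class).
class RefCounted
{
public:
    void grab() const { m_refs.fetch_add(1, std::memory_order_relaxed); }

    // Releases one reference; returns true when this call destroyed the object.
    bool drop() const
    {
        if (m_refs.fetch_sub(1, std::memory_order_acq_rel) == 1)
        {
            delete this;
            return true;
        }
        return false;
    }

    int refCount() const { return m_refs.load(std::memory_order_relaxed); }

protected:
    RefCounted() = default;
    // Non-public: outside code cannot `delete` the object or place it on the stack,
    // so its lifetime is controlled by drop() alone.
    virtual ~RefCounted() = default;

private:
    mutable std::atomic<int> m_refs{1};
};

// Hypothetical device class that can only be obtained through its factory.
class Device final : public RefCounted
{
public:
    // Factory: validation happens here, so no half-constructed object escapes.
    static Device* create(int ordinal)
    {
        if (ordinal < 0)
            return nullptr;
        return new Device(ordinal);
    }

    int ordinal() const { return m_ordinal; }

private:
    explicit Device(int ordinal) : m_ordinal(ordinal) {}
    ~Device() override = default;

    int m_ordinal;
};

// The destructor is not reachable from outside the class hierarchy.
static_assert(!std::is_destructible<Device>::value, "Device must only die via drop()");
```

With this layout, a caller holding a `Device*` can only release it through `drop()`, which is the property the reviewer is asking for.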

inline core::SRange<const char* const> geDefaultCompileOptions() const
{
return {m_defaultCompileOptions.data(),m_defaultCompileOptions.data()+m_defaultCompileOptions.size()};
}

// TODO/REDO Vulkan: https://docs.nvidia.com/cuda/cuda-driver-api/group__CUDA__EXTRES__INTEROP.html
// https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#vulkan-interoperability
// Watch out, use Driver API (`cu` functions) NOT the Runtime API (`cuda` functions)
// Also maybe separate this out into its own `CCUDA` class instead of nesting it here?
#if 0
template<typename ObjType>
struct GraphicsAPIObjLink
{
GraphicsAPIObjLink() : obj(nullptr), cudaHandle(nullptr), acquired(false)
{
asImage = {nullptr};
}
GraphicsAPIObjLink(core::smart_refctd_ptr<ObjType>&& _obj) : GraphicsAPIObjLink()
{
obj = std::move(_obj);
}
GraphicsAPIObjLink(GraphicsAPIObjLink&& other) : GraphicsAPIObjLink()
{
operator=(std::move(other));
}

GraphicsAPIObjLink(const GraphicsAPIObjLink& other) = delete;
GraphicsAPIObjLink& operator=(const GraphicsAPIObjLink& other) = delete;
GraphicsAPIObjLink& operator=(GraphicsAPIObjLink&& other)
{
std::swap(obj,other.obj);
std::swap(cudaHandle,other.cudaHandle);
std::swap(acquired,other.acquired);
std::swap(asImage,other.asImage);
return *this;
}

~GraphicsAPIObjLink()
{
assert(!acquired); // you've fucked up, there's no way for us to fix it, you need to release the objects on a proper stream
if (obj)
CCUDAHandler::cuda.pcuGraphicsUnregisterResource(cudaHandle);
}

//
auto* getObject() const {return obj.get();}

private:
core::smart_refctd_ptr<ObjType> obj;
CUgraphicsResource cudaHandle;
bool acquired;

friend class CCUDAHandler;
public:
union
{
struct
{
CUdeviceptr pointer;
} asBuffer;
struct
{
CUmipmappedArray mipmappedArray;
CUarray array;
} asImage;
};
};
CUdevice getInternalObject() const { return m_handle; }

//
static CUresult registerBuffer(GraphicsAPIObjLink<video::IGPUBuffer>* link, uint32_t flags = CU_GRAPHICS_REGISTER_FLAGS_NONE);
static CUresult registerImage(GraphicsAPIObjLink<video::IGPUImage>* link, uint32_t flags = CU_GRAPHICS_REGISTER_FLAGS_NONE);

const CCUDAHandler* getHandler() const { return m_handler.get(); }

template<typename ObjType>
static CUresult acquireResourcesFromGraphics(void* tmpStorage, GraphicsAPIObjLink<ObjType>* linksBegin, GraphicsAPIObjLink<ObjType>* linksEnd, CUstream stream)
{
auto count = std::distance(linksBegin,linksEnd);

auto resources = reinterpret_cast<CUgraphicsResource*>(tmpStorage);
auto rit = resources;
for (auto iit=linksBegin; iit!=linksEnd; iit++,rit++)
{
if (iit->acquired)
return CUDA_ERROR_UNKNOWN;
*rit = iit->cudaHandle;
}

auto retval = cuda.pcuGraphicsMapResources(count,resources,stream);
for (auto iit=linksBegin; iit!=linksEnd; iit++)
iit->acquired = true;
return retval;
}
template<typename ObjType>
static CUresult releaseResourcesToGraphics(void* tmpStorage, GraphicsAPIObjLink<ObjType>* linksBegin, GraphicsAPIObjLink<ObjType>* linksEnd, CUstream stream)
{
auto count = std::distance(linksBegin,linksEnd);

auto resources = reinterpret_cast<CUgraphicsResource*>(tmpStorage);
auto rit = resources;
for (auto iit=linksBegin; iit!=linksEnd; iit++,rit++)
{
if (!iit->acquired)
return CUDA_ERROR_UNKNOWN;
*rit = iit->cudaHandle;
}

auto retval = cuda.pcuGraphicsUnmapResources(count,resources,stream);
for (auto iit=linksBegin; iit!=linksEnd; iit++)
iit->acquired = false;
return retval;
}
bool isMatchingDevice(const IPhysicalDevice* device) { return device && !memcmp(device->getProperties().deviceUUID, m_physicalDevice->getProperties().deviceUUID, 16); }

static CUresult acquireAndGetPointers(GraphicsAPIObjLink<video::IGPUBuffer>* linksBegin, GraphicsAPIObjLink<video::IGPUBuffer>* linksEnd, CUstream stream, size_t* outbufferSizes = nullptr);
static CUresult acquireAndGetMipmappedArray(GraphicsAPIObjLink<video::IGPUImage>* linksBegin, GraphicsAPIObjLink<video::IGPUImage>* linksEnd, CUstream stream);
static CUresult acquireAndGetArray(GraphicsAPIObjLink<video::IGPUImage>* linksBegin, GraphicsAPIObjLink<video::IGPUImage>* linksEnd, uint32_t* arrayIndices, uint32_t* mipLevels, CUstream stream);
#endif
size_t roundToGranularity(CUmemLocationType location, size_t size) const;

core::smart_refctd_ptr<CCUDAExportableMemory> createExportableMemory(CCUDAExportableMemory::SCreationParams&& inParams);

core::smart_refctd_ptr<CCUDAImportedMemory> importExternalMemory(core::smart_refctd_ptr<IDeviceMemoryAllocation>&& mem);

protected:
friend class CCUDAHandler;
CCUDADevice(core::smart_refctd_ptr<CVulkanConnection>&& _vulkanConnection, IPhysicalDevice* const _vulkanDevice, const E_VIRTUAL_ARCHITECTURE _virtualArchitecture);
~CCUDADevice() = default;

core::smart_refctd_ptr<CCUDAImportedSemaphore> importExternalSemaphore(core::smart_refctd_ptr<ISemaphore>&& sem);

private:
CUresult reserveAddressAndMapMemory(CUdeviceptr* outPtr, size_t size, size_t alignment, CUmemLocationType location, CUmemGenericAllocationHandle memory) const;

static constexpr auto CudaMemoryLocationCount = 5;

const system::logger_opt_ptr m_logger;
std::vector<const char*> m_defaultCompileOptions;
core::smart_refctd_ptr<CVulkanConnection> m_vulkanConnection;
IPhysicalDevice* const m_vulkanDevice;
IPhysicalDevice* const m_physicalDevice;
E_VIRTUAL_ARCHITECTURE m_virtualArchitecture;

core::smart_refctd_ptr<CCUDAHandler> m_handler;
CUdevice m_handle;
CUcontext m_context;
std::array<size_t, CudaMemoryLocationCount> m_allocationGranularity;
};

}
65 changes: 65 additions & 0 deletions include/nbl/video/CCUDAExportableMemory.h
@@ -0,0 +1,65 @@
// Copyright (C) 2018-2020 - DevSH Graphics Programming Sp. z O.O.
// This file is part of the "Nabla Engine".
// For conditions of distribution and use, see copyright notice in nabla.h
#ifndef _NBL_VIDEO_C_CUDA_EXPORTABLE_MEMORY_H_
#define _NBL_VIDEO_C_CUDA_EXPORTABLE_MEMORY_H_


#ifdef _NBL_COMPILE_WITH_CUDA_

#include "cuda.h"
#include "nvrtc.h"
#if CUDA_VERSION < 9000
#error "Need CUDA 9.0 SDK or higher."
#endif

// useful includes in the future
//#include "cudaEGL.h"
//#include "cudaVDPAU.h"

namespace nbl::video
{

class CCUDADevice;

class NBL_API2 CCUDAExportableMemory : public core::IReferenceCounted
{
public:

struct SCreationParams
{
size_t size;
uint32_t alignment;
CUmemLocationType location;
};

struct SCachedCreationParams : SCreationParams
{
size_t granularSize;
CUdeviceptr ptr;
external_handle_t externalHandle;
};
Member

I don't follow the split between the two structs. SCreationParams usually inherits from SCachedCreationParams: the creation params carry extra data, and the cached params are the subset you want to keep.

In your case everything is a cached creation param, so SCreationParams should either inherit and stay empty or simply be an alias.
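
The convention described in this comment can be sketched as below. It is a generic illustration under assumptions: the struct names mirror the ones in the diff, but `debugName`, `Resource`, and `getCachedCreationParams` are hypothetical and do not come from the PR.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>

// The subset of parameters the object keeps for its whole lifetime.
struct SCachedCreationParams
{
    std::size_t size = 0;
    std::uint32_t alignment = 0;
};

// Creation params extend the cached ones with creation-only inputs.
struct SCreationParams : SCachedCreationParams
{
    // Hypothetical creation-only field: used while constructing, then discarded.
    const char* debugName = nullptr;
};

// Hypothetical resource that stores only the cached subset.
class Resource
{
public:
    explicit Resource(SCreationParams&& params)
        // Deliberate slice: move out the base subobject, drop the extra fields.
        : m_cached(static_cast<SCachedCreationParams&&>(params))
    {}

    const SCachedCreationParams& getCachedCreationParams() const { return m_cached; }

private:
    SCachedCreationParams m_cached;
};

// When there are no creation-only inputs, an alias is sufficient:
// using SCreationParams = SCachedCreationParams;
```

The direction of inheritance matters: the stored type is the base, so slicing at construction time keeps exactly what should be cached.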


CCUDAExportableMemory(core::smart_refctd_ptr<CCUDADevice> device, SCachedCreationParams&& params)
: m_device(std::move(device))
, m_params(std::move(params))
{}
~CCUDAExportableMemory() override;
Member

Public constructor on an object that should only be creatable through a factory.

Public destructor on an IReferenceCounted.


CUdeviceptr getDeviceptr() const { return m_params.ptr; }

const SCreationParams& getCreationParams() const { return m_params; }

core::smart_refctd_ptr<IDeviceMemoryAllocation> exportAsMemory(ILogicalDevice* device, IDeviceMemoryBacked* dedication = nullptr) const;
Member (devshgraphicsprogramming, May 4, 2026)

This needs documentation. I find it confusing, especially the dedication parameter.


private:

core::smart_refctd_ptr<CCUDADevice> m_device;
SCachedCreationParams m_params;
};

}

#endif // _NBL_COMPILE_WITH_CUDA_

#endif