Add Python bindings for SYCL IPC memory via dpctl#2331

Open
zxue2 wants to merge 2 commits into IntelPython:master from zxue2:enable_ipc_mem

zxue2 commented Jun 30, 2026

Add inter-process communication (IPC) support for SYCL USM memory, enabling zero-copy GPU memory sharing across processes. It wraps sycl::ext::oneapi::experimental::ipc::memory (get/open/close/put).

C API (libsyclinterface):

  • dpctl_sycl_ipc_memory_interface.h: declares DPCTLIPCMem_GetHandle, DPCTLIPCMem_OpenHandle, DPCTLIPCMem_CloseHandle, DPCTLIPCMem_FreeHandleData
  • dpctl_sycl_ipc_memory_interface.cpp: implements the C API by calling the SYCL experimental ipc::memory functions; auto-discovered by CMake.

Cython declarations:

  • _backend.pxd: extern block for the 4 new C functions

Python subpackage (dpctl.ipc):

  • IPCMemoryHandle: Cython extension class wrapping IPC memory export/import
      • __init__(usm_memory): calls get() + put(), stores handle as bytes
      • to_bytes(): serializable payload for cross-process transport
      • open(handle_bytes, device, nbytes): returns MemoryUSMDevice
      • close_mapping(usm_memory): explicitly closes an IPC mapping

Test In Docker (oneAPI 2026.0 installed):

python -c "
import dpctl
from dpctl.memory import MemoryUSMDevice
from dpctl.ipc import IPCMemoryHandle
d = dpctl.SyclDevice('level_zero:gpu')
q = dpctl.SyclQueue(d)
mem = MemoryUSMDevice(4096, queue=q)
h = IPCMemoryHandle(mem)
raw = h.to_bytes()
print(f'IPC handle: {len(raw)} bytes')
mem2 = IPCMemoryHandle.open(raw, d, nbytes=4096)
print(f'Opened: {mem2.nbytes} bytes on {mem2.sycl_device.name}')
IPCMemoryHandle.close_mapping(mem2)
print('close OK')
"

IPC handle: 120 bytes
Opened: 4096 bytes on Intel(R) Arc(TM) Pro B60 Graphics
close OK
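
The test above runs in a single process. For real cross-process use, the importer needs both the handle bytes and the allocation size, since open() takes nbytes. A minimal sketch of one way to bundle them for a pipe or socket follows; the wire format and the names pack_ipc_message and unpack_ipc_message are assumptions made for illustration and are not part of this PR.

```python
import struct

# Hypothetical wire format (not part of the PR): an 8-byte little-endian
# allocation size, a 4-byte handle length, then the raw handle bytes.
_HEADER = struct.Struct("<QI")


def pack_ipc_message(handle_bytes: bytes, nbytes: int) -> bytes:
    """Bundle the handle payload with the allocation size for transport."""
    return _HEADER.pack(nbytes, len(handle_bytes)) + handle_bytes


def unpack_ipc_message(message: bytes) -> tuple[bytes, int]:
    """Recover (handle_bytes, nbytes) and reject truncated messages."""
    nbytes, handle_len = _HEADER.unpack_from(message)
    handle_bytes = message[_HEADER.size:_HEADER.size + handle_len]
    if len(handle_bytes) != handle_len:
        raise ValueError("truncated IPC message")
    return handle_bytes, nbytes
```

The exporting process would send pack_ipc_message(h.to_bytes(), mem.nbytes), and the importing process would pass the unpacked pair to IPCMemoryHandle.open(handle_bytes, device, nbytes=nbytes).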

Add inter-process communication (IPC) support for SYCL USM memory, enabling
zero-copy GPU memory sharing across processes. It wraps
sycl::ext::oneapi::experimental::ipc::memory (get/open/close/put).

C API (libsyclinterface):
  - dpctl_sycl_ipc_memory_interface.h: declares DPCTLIPCMem_GetHandle,
    DPCTLIPCMem_OpenHandle, DPCTLIPCMem_CloseHandle, DPCTLIPCMem_FreeHandleData
  - dpctl_sycl_ipc_memory_interface.cpp: implements the C API by calling the
    SYCL experimental ipc::memory functions; auto-discovered by CMake.

Cython declarations:
  - _backend.pxd: extern block for the 4 new C functions

Python subpackage (dpctl.ipc):
  - IPCMemoryHandle: Cython extension class wrapping IPC memory export/import
    __init__(usm_memory): calls get() + put(), stores handle as bytes
    to_bytes(): serializable payload for cross-process transport
    open(handle_bytes, device, nbytes): returns MemoryUSMDevice
    close_mapping(usm_memory): explicitly closes an IPC mapping

Signed-off-by: Zhan Xue <zhan.xue@intel.com>
Comment thread dpctl/ipc/_ipc_memory.pyx
cdef class IPCMemoryHandle:
"""Wrapper around a SYCL IPC memory handle.

Instances are created by passing a :class:`dpctl.memory.MemoryUSMDevice`

Collaborator

If this docstring is accurate and the extension only works with MemoryUSMDevice, shouldn't we add an isinstance check below to guarantee it's device memory?

Is it a requirement that it be a USM device allocation, or can host and/or shared work?

Author

urIPCGetMemHandleExp has no USM type restriction, but urIPCOpenMemHandleExp is hardcoded to UR_USM_TYPE_DEVICE, so urIPCOpenMemHandleExp may keep it safe. Please double-check this with UR.

By the way, zeMemGetIpcHandle works with device memory allocations, and zeMemGetIpcHandleWithProperties works with device or host memory allocations. I suppose UR may be extended in the future.

Collaborator

If this is the case, then let's add a check for isinstance(memory, MemoryUSMDevice).
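
A minimal sketch of the guard discussed in this thread. The two classes below are stand-ins so the snippet runs without dpctl; real code would use the classes from dpctl.memory. The helper name require_device_usm and the choice of TypeError are assumptions, not something the PR specifies.

```python
# Stand-ins for the dpctl.memory classes so the sketch runs without dpctl;
# real code would import MemoryUSMDevice from dpctl.memory instead.
class MemoryUSMDevice:
    pass


class MemoryUSMShared:
    pass


def require_device_usm(usm_memory):
    """Hypothetical guard: accept only USM device allocations."""
    if not isinstance(usm_memory, MemoryUSMDevice):
        raise TypeError(
            "expected a MemoryUSMDevice allocation, got "
            f"{type(usm_memory).__name__}"
        )
    return usm_memory
```

Calling such a guard at the top of __init__ would reject host and shared allocations before any handle is requested.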

Comment thread dpctl/ipc/_ipc_memory.pyx
Comment thread dpctl/ipc/_ipc_memory.pyx
Comment on lines +237 to +239
usm_memory._opaque_ptr = NULL
usm_memory._memory_ptr = NULL
usm_memory.nbytes = 0

Collaborator

This seems a bit dangerous for users.

I think it would make the most sense, since _opaque_ptr is a std::shared_ptr, to instead share that shared_ptr with the memory object's shared pointer if possible.

Author

Could you please handle this as a follow-up improvement in a separate PR? There should be no obvious risk for the current use case: deterministic IPC open/get/close with an explicit-lifetime IPC resource.

Collaborator

I will instead open a PR into this branch that fixes this issue; I would prefer it be fixed before we merge anything.
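
For the explicit-lifetime use case mentioned in this thread, one way a caller could pair open and close deterministically is a context manager. This is only a sketch: opened_mapping is a hypothetical helper, the open and close callables are passed in so the snippet runs without dpctl, and it does not address references to the mapping that outlive the block, which is the concern raised above.

```python
from contextlib import contextmanager


@contextmanager
def opened_mapping(open_fn, close_fn, handle_bytes, device, nbytes):
    """Hypothetical helper: open an IPC mapping and always close it on exit."""
    mapping = open_fn(handle_bytes, device, nbytes=nbytes)
    try:
        yield mapping
    finally:
        # Runs on normal exit and when the body raises.
        close_fn(mapping)
```

With the API from this PR, a call might look like `with opened_mapping(IPCMemoryHandle.open, IPCMemoryHandle.close_mapping, raw, d, 4096) as mem2:`.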

Comment thread dpctl/ipc/_ipc_memory.pyx
Comment thread dpctl/ipc/_ipc_memory.pxd
# distutils: language = c++
# cython: language_level=3

"""Declarations for the IPCMemoryHandle Cython extension type."""

Collaborator

I'm conflicted on whether this belongs in a separate namespace or not. So far, we've been free with incorporating extensions into the main dpctl namespace, as long as we have fall-backs so that they don't crash, etc. when the extension is unavailable. This will enable us to use AdaptiveCpp in the future.

Having an entire submodule not be built is a bit different. Perhaps instead we can move this into dpctl.memory and implement fallbacks in the C-API, plus an is_available free function to check for it, like we have for WorkGroupMemory, etc.

Author

I would expect the dpctl.ipc namespace to hold both memory IPC and event IPC. What about revisiting this once event IPC lands?

Collaborator

I understand it would hold both kinds of functionality, but I think it is still more sensible to incorporate it into existing modules than to add this new, very specific submodule.

Collaborator

That is to say, if we move the current IPC implementation into dpctl.memory and add fallbacks for when it isn't present, it would be fine as is.
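
A rough sketch of the fallback pattern suggested in this thread. Every name here (_IPC_MEMORY_BUILT, ipc_memory_is_available, get_ipc_handle) is hypothetical, and the flag is hard-coded to False so the snippet shows the fallback path without dpctl; it is not how the PR or dpctl implements availability checks.

```python
# Hypothetical build-time flag; a real build would derive this from whether
# the SYCL IPC memory extension was found when compiling the C API.
_IPC_MEMORY_BUILT = False


def ipc_memory_is_available() -> bool:
    """Report whether IPC memory support was compiled in (hypothetical name)."""
    return _IPC_MEMORY_BUILT


def get_ipc_handle(usm_memory) -> bytes:
    """Fallback entry point: fail with a clear error instead of crashing."""
    if not ipc_memory_is_available():
        raise NotImplementedError(
            "IPC memory support is not available in this build"
        )
    # The real export path is intentionally left out of this sketch.
    raise RuntimeError("real export path omitted from this sketch")
```

Callers could then test ipc_memory_is_available() up front instead of relying on an import error from a missing submodule.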

zxue2 force-pushed the enable_ipc_mem branch from 352d2f8 to 3dea379 on July 1, 2026 13:12
add has_aspect_ext_oneapi_ipc_memory checks in __init__ and open()

Signed-off-by: Zhan Xue <zhan.xue@intel.com>
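
A minimal sketch of what a device-level aspect check like the one named in this commit might look like. The property name has_aspect_ext_oneapi_ipc_memory comes from the commit message; the helper name, the getattr default, and the ValueError are assumptions, and SimpleNamespace objects stand in for dpctl.SyclDevice so the snippet runs on its own.

```python
from types import SimpleNamespace


def check_ipc_memory_aspect(device):
    """Raise if the device does not report the IPC memory aspect."""
    # getattr with a default keeps the check safe on builds where the
    # property does not exist (an assumption made for this sketch).
    if not getattr(device, "has_aspect_ext_oneapi_ipc_memory", False):
        raise ValueError("device does not support SYCL IPC memory")
    return device


# Stand-in devices; real code would pass a dpctl.SyclDevice.
supported = SimpleNamespace(has_aspect_ext_oneapi_ipc_memory=True)
unsupported = SimpleNamespace(has_aspect_ext_oneapi_ipc_memory=False)
```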