On 8/24/21 2:32 AM, Christian König wrote:
On 24.08.21 at 11:06, Gal Pressman wrote:
On 23/08/2021 13:43, Christian König wrote:
On 21.08.21 at 11:16, Gal Pressman wrote:
On 20/08/2021 17:32, Jason Gunthorpe wrote:
On Fri, Aug 20, 2021 at 03:58:33PM +0300, Gal Pressman wrote:
...
IIUC, we're talking about three different exporter "types":
- Dynamic with move_notify (requires ODP)
- Dynamic with revoke_notify
- Static
Which changes do we need to make the third one work?
Basically none at all in the framework.
You just need to properly use the dma_buf_pin() function when you start using a
buffer (e.g. before you create an attachment) and the dma_buf_unpin() function
after you are done with the DMA-buf.
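To illustrate the pin/unpin contract described above, here is a toy user-space
model in C. It is not kernel code: struct toy_buf, toy_pin(), toy_unpin() and
toy_move() are made-up names, and the real dma_buf_pin()/dma_buf_unpin() have
different signatures and locking rules. The only thing it tries to show is that
the exporter must not move the backing storage while a pin is outstanding.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for a shared buffer; not a kernel type. */
struct toy_buf {
    unsigned int pin_count; /* number of outstanding pins */
    int placement;          /* opaque id of where the storage currently lives */
};

/* Take a pin: the storage has to stay where it is until the matching unpin. */
int toy_pin(struct toy_buf *buf)
{
    buf->pin_count++;
    return 0;
}

/* Drop a pin previously taken with toy_pin(). */
void toy_unpin(struct toy_buf *buf)
{
    assert(buf->pin_count > 0); /* catch unbalanced unpin calls */
    buf->pin_count--;
}

/* Exporter-side move: refused with -EBUSY while anyone holds a pin. */
int toy_move(struct toy_buf *buf, int new_placement)
{
    if (buf->pin_count > 0)
        return -EBUSY;
    buf->placement = new_placement;
    return 0;
}
```

In this model an importer would call toy_pin() before it starts DMA and
toy_unpin() once it is done; in between, toy_move() fails.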
I replied to your previous mail, but I'll ask again.
Doesn't the pin operation migrate the memory to host memory?
Sorry missed your previous reply.
And yes, at least for the amdgpu driver, we migrate the memory to host memory as soon as it is pinned,
and I would expect that other GPU drivers do something similar.
Well...for many topologies, migrating to host memory will result in a
dramatically slower p2p setup. For that reason, some GPU drivers may
want to allow pinning of video memory in some situations.
Ideally, you've got modern ODP devices and you don't even need to pin.
But if not, and you still hope to do high performance p2p between a GPU
and a non-ODP InfiniBand device, then you would need to leave the pinned
memory in vidmem.
So I think we don't want to rule out that behavior, right? Or is the
thinking more like, "you're lucky that this old non-ODP setup works at
all, and we'll make it work by routing through host/cpu memory, but it
will be slow"?
thanks,
--
John Hubbard
NVIDIA
This is intentional since we don't want any P2P to video memory with pinned objects and want to
avoid running into a situation where one device is doing P2P to video memory while another device
needs the DMA-buf in host memory.
You can still do P2P with pinned objects; it's just up to the exporting driver whether it is allowed or not.
The other option is what Daniel suggested: that we have some kind of revoke. This is essentially what
our KFD is doing as well when doing interop with 3D GFX, but from Jason's responses I have a bit of
doubt that this will actually work on the hardware level for RDMA.
Regards,
Christian.
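
The "up to the exporting driver" point above can be pictured with another toy
C model. Again, none of this is kernel or amdgpu code: struct toy_exporter,
struct toy_bo, the allow_pinned_vram flag and both functions are hypothetical
names used only for illustration. With the flag clear, the first pin migrates
the buffer to host memory (the behaviour described for amdgpu in this thread);
with it set, the buffer stays in video memory so P2P remains possible.

```c
#include <assert.h>
#include <stdbool.h>

enum toy_placement { TOY_VRAM, TOY_HOST };

/* Hypothetical exporter policy; not a real driver option. */
struct toy_exporter {
    bool allow_pinned_vram; /* may pinned buffers stay in video memory? */
};

/* Hypothetical buffer object owned by the exporter. */
struct toy_bo {
    enum toy_placement placement;
    unsigned int pin_count;
};

/* Pin the buffer and return the placement it ends up in. */
enum toy_placement toy_exporter_pin(const struct toy_exporter *exp,
                                    struct toy_bo *bo)
{
    /* Only the first pin may migrate; later pins keep the placement. */
    if (bo->pin_count == 0 && !exp->allow_pinned_vram)
        bo->placement = TOY_HOST;
    bo->pin_count++;
    return bo->placement;
}

/* Drop one pin; the buffer stays wherever it currently is. */
void toy_exporter_unpin(struct toy_bo *bo)
{
    assert(bo->pin_count > 0); /* catch unbalanced unpin calls */
    bo->pin_count--;
}
```

The model says nothing about a revoke path; it only captures the choice the
exporter makes at pin time.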