Re: Enabling peer to peer device transactions for PCIe devices

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Dan,

I personally like "device-DAX" idea but my concerns are:

-  How well it will co-exists with the  DRM infrastructure / implementations
   in part dealing with CPU pointers?

-  How well we will be able to handle case when we need to "move"/"evict"
   memory/data to the new location so CPU pointer should point to the new physical location/address
    (and may be not in PCI device memory at all)?


Sincerely yours,

Serguei Sagalovitch


On 2016-11-22 01:11 PM, Dan Williams wrote:
On Mon, Nov 21, 2016 at 12:36 PM, Deucher, Alexander
<Alexander.Deucher@xxxxxxx> wrote:
This is certainly not the first time this has been brought up, but I'd like to try and get some consensus on the best way to move this forward.  Allowing devices to talk directly improves performance and reduces latency by avoiding the use of staging buffers in system memory.  Also in cases where both devices are behind a switch, it avoids the CPU entirely.  Most current APIs (DirectGMA, PeerDirect, CUDA, HSA) that deal with this are pointer based.  Ideally we'd be able to take a CPU virtual address and be able to get to a physical address taking into account IOMMUs, etc.  Having struct pages for the memory would allow it to work more generally and wouldn't require as much explicit support in drivers that wanted to use it.

Some use cases:
1. Storage devices streaming directly to GPU device memory
2. GPU device memory to GPU device memory streaming
3. DVB/V4L/SDI devices streaming directly to GPU device memory
4. DVB/V4L/SDI devices streaming directly to storage devices

Here is a relatively simple example of how this could work for testing.  This is obviously not a complete solution.
- Device memory will be registered with Linux memory sub-system by created corresponding struct page structures for device memory
- get_user_pages_fast() will  return corresponding struct pages when CPU address points to the device memory
- put_page() will deal with struct pages for device memory

[..]
4. iopmem
iopmem : A block device for PCIe memory (https://lwn.net/Articles/703895/)
The change I suggest for this particular approach is to switch to
"device-DAX" [1]. I.e. a character device for establishing DAX
mappings rather than a block device plus a DAX filesystem. The pro of
this approach is standard user pointers and struct pages rather than a
new construct. The con is that this is done via an interface separate
from the existing gpu and storage device. For example it would require
a /dev/dax instance alongside a /dev/nvme interface, but I don't see
that as a significant blocking concern.

[1]: https://lists.01.org/pipermail/linux-nvdimm/2016-October/007496.html

_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/dri-devel

[Index of Archives]     [Linux DRI Users]     [Linux Intel Graphics]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [XFree86]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux