Re: Use of pci_map_page in nouveau, radeon TTM.

On 10/01/2013 12:34 PM, Lucas Stach wrote:
On Tuesday, 01.10.2013 at 12:16 +0200, Thomas Hellstrom wrote:
Jerome, Konrad

Forgive an ignorant question, but it appears that both Nouveau and
Radeon may use pci_map_page() when populating TTMs on
pages obtained from the ordinary (non-DMA) pool. These pages will, if I
understand things correctly, not be pages allocated with
DMA_ALLOC_COHERENT.

From what I understand, at least for the corresponding dma_map_page()
it's illegal for the CPU to access these pages without calling
dma_sync_xx_for_cpu(). And before the device is allowed to access them
again, you need to call dma_sync_xx_for_device().
So mapping for PCI really invalidates the TTM interleaved CPU / device
access model.

That's right. The API says you need to sync for device or cpu, but on
x86 you can get away with not doing so, as on x86 the calls end up just
being WB buffer flushes.

OK, but what about the cases where the DMA subsystem allocates a bounce buffer?
(Although I think the TTM page selection works around this situation.)
Perhaps at the very least this deserves a comment in the code...
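
To illustrate the bounce buffer concern, here is a minimal userspace
sketch, not kernel code. It loosely models a bidirectional streaming
mapping where the device only ever touches a bounce copy of the page.
All names (toy_mapping, toy_map, toy_sync_for_cpu, toy_sync_for_device,
toy_device_write, toy_device_read, toy_bounce_demo) are hypothetical and
are not part of the kernel DMA API; the real behaviour depends on the
DMA backend and the mapping direction.

```c
/* Toy userspace model of a bounce-buffered streaming DMA mapping.
 * Every name here is hypothetical; this is NOT the kernel DMA API. */
#include <assert.h>
#include <string.h>

#define TOY_BUF_SIZE 16

struct toy_mapping {
    unsigned char *cpu_page;             /* page the CPU reads and writes */
    unsigned char bounce[TOY_BUF_SIZE];  /* copy the device reads and writes */
};

/* Map for DMA: the device gets a snapshot of the CPU page. */
void toy_map(struct toy_mapping *m, unsigned char *page)
{
    assert(m && page);
    m->cpu_page = page;
    memcpy(m->bounce, page, TOY_BUF_SIZE);
}

/* Hand the buffer to the CPU: copy device writes back to the page. */
void toy_sync_for_cpu(struct toy_mapping *m)
{
    memcpy(m->cpu_page, m->bounce, TOY_BUF_SIZE);
}

/* Hand the buffer to the device: copy CPU writes into the bounce copy. */
void toy_sync_for_device(struct toy_mapping *m)
{
    memcpy(m->bounce, m->cpu_page, TOY_BUF_SIZE);
}

/* The modelled device can only ever touch the bounce copy. */
void toy_device_write(struct toy_mapping *m, int idx, unsigned char v)
{
    m->bounce[idx] = v;
}

unsigned char toy_device_read(const struct toy_mapping *m, int idx)
{
    return m->bounce[idx];
}

/* Returns 0 if each side sees the other's writes only after a sync. */
int toy_bounce_demo(void)
{
    unsigned char page[TOY_BUF_SIZE] = {0};
    struct toy_mapping m;

    toy_map(&m, page);

    toy_device_write(&m, 0, 0xAB);
    if (page[0] != 0x00)                 /* CPU still sees stale data */
        return 1;
    toy_sync_for_cpu(&m);
    if (page[0] != 0xAB)                 /* visible only after the sync */
        return 2;

    page[1] = 0xCD;                      /* CPU write after taking ownership */
    if (toy_device_read(&m, 1) != 0x00)  /* device still sees stale data */
        return 3;
    toy_sync_for_device(&m);
    if (toy_device_read(&m, 1) != 0xCD)
        return 4;

    return 0;
}
```

In this model, skipping either sync leaves one side looking at stale
data regardless of cache coherency, which is why the "x86 gets away
with it" argument would not cover a bounce-buffered mapping.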

For ARM, or similar non-coherent arches you absolutely have to do the
syncs, or you'll end up with different contents in cache vs sysram. For
my nouveau on ARM work I introduced some simple helpers to do the right
thing. And it really isn't hard doing the syncs at the right points in
time, just sync for CPU when getting a cpu_prep ioctl and then sync for
device when validating a buffer for GPU use.
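
The scheme described above (sync for CPU on cpu_prep, sync for device
on validation) can be sketched roughly as follows. This is a userspace
toy under stated assumptions, not the actual nouveau helpers: the names
toy_bo, toy_bo_cpu_prep, toy_bo_validate and the sync counters are
hypothetical, the two toy_bo_sync_* functions only stand in for the
real dma_sync_*_for_cpu() / dma_sync_*_for_device() calls, and skipping
a redundant sync is my assumption. Fence waits are left out entirely.

```c
/* Toy ownership tracking for a buffer object. All names are
 * hypothetical; the counters stand in for real DMA sync calls. */
#include <assert.h>
#include <stdbool.h>

enum toy_owner { TOY_OWNER_DEVICE, TOY_OWNER_CPU };

struct toy_bo {
    enum toy_owner owner;   /* who may currently touch the pages */
    int syncs_for_cpu;      /* number of modelled sync-for-CPU calls */
    int syncs_for_device;   /* number of modelled sync-for-device calls */
};

/* Stand-in for a dma_sync_*_for_cpu() call over the buffer's pages. */
void toy_bo_sync_for_cpu(struct toy_bo *bo)
{
    bo->syncs_for_cpu++;
}

/* Stand-in for a dma_sync_*_for_device() call over the buffer's pages. */
void toy_bo_sync_for_device(struct toy_bo *bo)
{
    bo->syncs_for_device++;
}

/* cpu_prep path: user space announces that it is about to access the buffer. */
void toy_bo_cpu_prep(struct toy_bo *bo)
{
    assert(bo);
    if (bo->owner != TOY_OWNER_CPU) {   /* assumption: skip redundant syncs */
        toy_bo_sync_for_cpu(bo);
        bo->owner = TOY_OWNER_CPU;
    }
}

/* Validation path: the buffer is about to be used by the GPU. */
void toy_bo_validate(struct toy_bo *bo)
{
    assert(bo);
    if (bo->owner != TOY_OWNER_DEVICE) {
        toy_bo_sync_for_device(bo);
        bo->owner = TOY_OWNER_DEVICE;
    }
}

/* True only between a cpu_prep and the next validation. */
bool toy_bo_cpu_may_access(const struct toy_bo *bo)
{
    return bo->owner == TOY_OWNER_CPU;
}
```

Note that ownership here is tracked per buffer, so the sketch only
covers the case where a whole buffer belongs to either the CPU or the
GPU at any one time.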

Yes, this will probably work for drivers where a buffer is either bound
for CPU or for GPU. However, for drivers using user-space sub-allocation
of buffers, or for partial updates of vertex buffers etc., that isn't
sufficient. In that case one either has to use coherent memory or
implement an elaborate scheme where we sync for device and kill
user-space mappings on validation, and sync for CPU in the CPU fault
handler. Unfortunately the latter triggers a fence wait for the whole
buffer, not just the part of the buffer we want to write to.

Regards,
Lucas

Regards,
Thomas