On Thu, Jul 4, 2019 at 11:26 AM Steven Price <steven.price@xxxxxxx> wrote: > > On 03/07/2019 17:18, Daniel Vetter wrote: > > On Wed, Jul 3, 2019 at 6:11 PM Steven Price <steven.price@xxxxxxx> wrote: > [...] > >> In theory the exporter should do whatever is required to ensure that the > >> CPU is synchronised when a user space mapping exists. There are some > >> issues here though: > >> > >> * In theory the kernel driver should map the dma_buf purely for the > >> duration that a job is using the buffer (and unmap immediately after). > >> This gives the exporter the knowledge of when the GPU is using the > >> memory and allows the exporter to page out of the memory if necessary. > >> In practise this map/unmap operation is expensive (updating the GPU's > >> page tables) so most drivers don't actually bother and keep the memory > >> mapped. This means the exporter cannot tell when the buffer is used or > >> move the pages. > > > > Yeah refaulting is too slow if you care the least about performance. > > > >> * The CPU mappings can be faulted on demand (performing the necessary > >> CPU cache invalidate if needed) and shot-down to allow moving the > >> memory. In theory when the GPU needs the memory it should map the buffer > >> and the exporter can then shoot down the mappings, perform the CPU cache > >> clean and then allow the GPU to use the memory. A subsequent CPU access > >> would then refault the page, ensuring a CPU cache invalidate so the > >> latest data is visible. > > > > We thought that was the answer, until it was clear its not. dma-buf > > mmap isn't coherent, you need to call the begin/end ioctls. > > > >> * The majority of exporters are simple and deal with uncached memory > >> (e.g. frame buffers) or are actually exporting back to the same driver > >> (e.g. window surfaces). In these situations either the driver already > >> has the necessary "magic" to deal with caches (e.g. kbase provides > >> explicit cache maintenance operations), or it's "uncached" anyway so it > >> doesn't matter. This means that hardly anyone tests with the complex > >> cases... > >> > >> From a user space ABI, my understanding is that a dma_buf mmap() mapping > >> should be coherent, and user space isn't expected to do anything to make > >> it work. Obviously any importing device might have it's own coherency > >> details which will be up to the ABI of that device (e.g. Mali has caches > >> which may need to be flushed - this is usually done at the start/end of > >> a job chain, so coherency is not guaranteed while the job chain is running). > > > > See my other reply, but this isn't correct. dma-buf has explicit cache > > maintenance ops. It's just that generally everyone (ok, display only > > drivers using the cma helpers) ends up exporting coherent memory and > > that's why this works. Doesn't make it a bright idea imo ... > > Sorry, I'd completely forgotten about the DMA_BUF_IOCTL_SYNC ioctl when > I wrote this. But to a large extent this ship has already sailed, and > indeed the current users of dma_buf_mmap() implicitly assume that no > sync is necessary (since there's no mechanism to forward the syncs onto > the exporter). Indeed the comment in dma-buf.c says: > > * The assumption in the current dma-buf interfaces is that redirecting the > * initial mmap is all that's needed. A survey of some of the existing > * subsystems shows that no driver seems to do any nefarious thing like > * syncing up with outstanding asynchronous processing on the device or > * allocating special resources at fault time. So hopefully this is good > * enough, since adding interfaces to intercept pagefaults and allow pte > * shootdowns would increase the complexity quite a bit. > > I'd be happy to kill off dma_buf_mmap() and require user space to mmap > via the dma_buf handle (and also therefore to handle the manual sync > ioctls), but I'm not sure how we achieve that while maintaining > backwards compatibility. Uh I forgot to fully update this in my doc update in 2016. That text originally predates the sync ioctl. Care to type a patch? -Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch _______________________________________________ dri-devel mailing list dri-devel@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/dri-devel