Adding Jérôme to Cc. I think he looked at the userptr code before, so
maybe he has some idea what is going wrong here.

I also had a look at the code, but my knowledge of the DMA API is
almost nonexistent. However, I can see that before doing any DMA via the
G2D on the buffer, the code calls dma_map() on it, and also unmaps it
when the commandlist is finished.

With best wishes,
Tobias


Tobias Jakobi wrote:
> Thanks Lucas for the explanation!
>
>
> Lucas Stach wrote:
>> Hi Tobias,
>>
>> On Sunday, 16.08.2015, at 14:48 +0200, Tobias Jakobi wrote:
>>> Hello,
>>>
>>> some time ago I checked whether I could use the userptr functionality to
>>> do zero-copy from userspace allocated buffers via the G2D. This didn't
>>> work out so well, so I kind of put this at the bottom of my TODO list.
>>>
>>> Now that IOMMU support has landed and Jan Kara has rewritten page pinning
>>> using frame vectors (see [1]) I gave userptr another try.
>>>
>>> The results are much better. I'm not experiencing any kernel lockups or
>>> sysmmu pagefaults anymore. However, the image now suffers from visual
>>> artifacts. These images show the nature of the artifacts:
>>> http://i.imgur.com/nzT6g3Y.jpg
>>> http://i.imgur.com/wkuYI6X.jpg
>>>
>>> The corruption always manifests itself in these pixel lines of fixed
>>> size and wrong color.
>>>
>>> I have written a testcase as part of libdrm for this issue:
>>> https://github.com/tobiasjakobi/libdrm/commit/db8bf6844436598251f67a71fc334b929bfb2b71
>>>
>>> It allocates N (N an even number) buffers which are aligned to the
>>> system pagesize. Then it does this each iteration:
>>> 1) Fill the first N/2 buffers with random data
>>> 2) Copy the first half to the second half of the buffers
>>> 3) memcmp() first and second half (verification pass)
>>>
>>> Usually this verification already fails on the first iteration. An
>>> interesting observation is that increasing (!) the buffer size (so the
>>> amount of pixels that have to be copied per buffer grows) makes this
>>> issue less likely to happen.
>>>
>>> With the default 512x512 buffers however it happens, like I said above,
>>> almost immediately.
>>>
>> This is obviously a missing cache flush. The memory you get from
>> userspace is normal cached memory, so to make it visible to the GPU you
>> need to flush parts of the cache out to main memory.
>>
>> The corruption you are seeing is just unflushed cachelines. This also
>> explains why increasing the buffer size helps: the more memory the CPU
>> touches, the more cachelines will be flushed out to be replaced with new
>> data.
> I should point out that the snapshots I uploaded were done with a
> different setup. There, only the source memory of the G2D operation is a
> userspace allocated buffer. The destination is a GEM buffer allocated
> through libdrm, which is then used as framebuffer. So the issue already
> appears when just the source is userspace allocated.
>
> What works, however, is a GEM-to-GEM operation. However, this
> might be related to the default allocation flags libdrm uses.
>
>
>
>> So you need to go and have a look at dma_map() and dma_sync_*_for_*()
>> and friends.
>>
>> Regards,
>> Lucas
>>
>
>
> With best wishes,
> Tobias
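
For readers following the thread, the three-step verification loop from the
original report (fill the first N/2 page-aligned buffers, copy them to the
second half, memcmp() both halves) can be sketched in plain userspace C as
below. This is only an illustration and not the linked libdrm testcase: the
names run_iteration(), copy_fn and cpu_copy() are hypothetical, and the copy
step is a memcpy() stand-in for the place where the real test would submit a
G2D copy on userptr buffers.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L  /* for posix_memalign() under strict C modes */
#endif

#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical copy hook. The real testcase would issue a G2D copy here;
 * this sketch only provides a CPU stand-in. */
typedef void (*copy_fn)(void *dst, const void *src, size_t len);

static void cpu_copy(void *dst, const void *src, size_t len)
{
    memcpy(dst, src, len);
}

/*
 * Runs one iteration of the fill / copy / verify loop on n page-aligned
 * buffers of buf_size bytes each.
 * Returns the number of mismatching buffer pairs, or -1 if the arguments
 * are invalid or an allocation fails.
 */
static int run_iteration(size_t n, size_t buf_size, copy_fn do_copy)
{
    long page = sysconf(_SC_PAGESIZE);
    void **bufs;
    size_t i, j;
    int mismatches = 0;

    /* N has to be even so the buffers split into two equal halves. */
    if (n == 0 || n % 2 != 0 || buf_size == 0 || page <= 0)
        return -1;

    bufs = calloc(n, sizeof(*bufs));
    if (!bufs)
        return -1;

    /* Allocate buffers aligned to the system page size. */
    for (i = 0; i < n; i++) {
        if (posix_memalign(&bufs[i], (size_t)page, buf_size) != 0) {
            bufs[i] = NULL;
            mismatches = -1;
            goto out;
        }
    }

    /* 1) Fill the first N/2 buffers with random data. */
    for (i = 0; i < n / 2; i++) {
        unsigned char *p = bufs[i];

        for (j = 0; j < buf_size; j++)
            p[j] = (unsigned char)rand();
    }

    /* 2) Copy the first half to the second half of the buffers. */
    for (i = 0; i < n / 2; i++)
        do_copy(bufs[n / 2 + i], bufs[i], buf_size);

    /* 3) Verification pass: memcmp() first and second half. */
    for (i = 0; i < n / 2; i++) {
        if (memcmp(bufs[i], bufs[n / 2 + i], buf_size) != 0)
            mismatches++;
    }

out:
    for (i = 0; i < n; i++)
        free(bufs[i]);
    free(bufs);
    return mismatches;
}
```

With the CPU stand-in, a call such as run_iteration(4, 512 * 512 * 4, cpu_copy)
should report no mismatches (the 4 bytes per pixel is an assumption, the
thread does not state the pixel format). The corruption discussed above would
only show up once do_copy hands the buffers to a device that reads main
memory behind the CPU cache.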