On Wed, Aug 26, 2015 at 02:28:30PM +0200, Thomas Hellstrom wrote:
> On 08/26/2015 02:10 PM, Daniel Vetter wrote:
> > On Wed, Aug 26, 2015 at 08:49:00AM +0200, Thomas Hellstrom wrote:
> >> Hi, Tiago.
> >>
> >> On 08/26/2015 02:02 AM, Tiago Vignatti wrote:
> >>> From: Daniel Vetter <daniel.vetter@xxxxxxxx>
> >>>
> >>> The userspace might need some sort of cache coherency management e.g. when CPU
> >>> and GPU domains are being accessed through dma-buf at the same time. To
> >>> circumvent this problem there are begin/end coherency markers, that forward
> >>> directly to existing dma-buf device drivers vfunc hooks. Userspace can make use
> >>> of those markers through the DMA_BUF_IOCTL_SYNC ioctl. The sequence would be
> >>> used like following:
> >>>
> >>>   - mmap dma-buf fd
> >>>   - for each drawing/upload cycle in CPU
> >>>     1. SYNC_START ioctl
> >>>     2. read/write to mmap area or a 2d sub-region of it
> >>>     3. SYNC_END ioctl.
> >>>   - munmap once you don't need the buffer any more
> >>>
> >>> v2 (Tiago): Fix header file type names (u64 -> __u64)
> >>> v3 (Tiago): Add documentation. Use enum dma_buf_sync_flags to the begin/end
> >>> dma-buf functions. Check for overflows in start/length.
> >>> v4 (Tiago): use 2d regions for sync.
> >> Daniel V had issues with the sync argument proposed by Daniel S. I'm
> >> fine with that argument or an argument containing only a single sync
> >> rect. I'm not sure whether Daniel V will find it easier to accept only a
> >> single sync rect...
> > I'm kinda against all the 2d rect sync proposals ;-) At least for the
> > current stuff it's all about linear subranges afaik, and even there we
> > don't bother with flushing them precisely right now.
> >
> > My expectation would be that if you _really_ want to etch out that last
> > bit of performance with a list of 2d sync ranges then probably you want to
> > do the cpu cache flushing in userspace anyway, with 100% machine-specific
> > trickery.
> >
> Daniel,
>
> I might be misunderstanding things, but isn't this about finally
> accepting a dma-buf mmap() generic interface for people who want to use
> it for zero-copy applications (like people have been trying to do for
> years but never bothered to specify an interface that actually performed
> on incoherent hardware)?
>
> If it's only about exposing the kernel 1D sync interface to user-space
> for correctness, then why isn't that done transparently to the user?

Mostly pragmatic reasons - we could do the page-fault trickery, but that
means i915 needs another mmap implementation. At least I didn't figure out
how to do faulting in a completely generic way. And we already have 3
other mmap implementations so I prefer not to do that.

The other is that right now there's no user nor implementation in sight
which actually does range-based flush optimizations, so I'm pretty much
expecting we'll get it wrong. Maybe instead we should go one step further
and remove the range from the internal dma-buf interface and also drop it
from the ioctl? With the flags we can always add something later on once
we have a real user with a clear need for it. But afaik cros only wants to
shuffle around entire tiles and has a buffer-per-tile approach.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
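
For readers following the thread, the SYNC_START / write / SYNC_END cycle
from the quoted patch description can be sketched in C roughly as below.
This is only a sketch under assumptions: it uses the flags-only form of
the ioctl (no range argument, as suggested above) with the names from the
mainline <linux/dma-buf.h> uapi header, which is not necessarily what the
v4 patch in this thread defines. The helpers dmabuf_sync() and cpu_fill()
are hypothetical names made up for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/dma-buf.h>  /* struct dma_buf_sync, DMA_BUF_IOCTL_SYNC */

/*
 * Hypothetical helper: issue one sync marker on a dma-buf fd.
 * Retries if the ioctl was interrupted by a signal.
 * Returns 0 on success, -1 with errno set on failure.
 */
int dmabuf_sync(int fd, __u64 flags)
{
    struct dma_buf_sync sync = { .flags = flags };
    int ret;

    do {
        ret = ioctl(fd, DMA_BUF_IOCTL_SYNC, &sync);
    } while (ret == -1 && errno == EINTR);

    return ret;
}

/*
 * Hypothetical helper: one CPU upload cycle on an already mmap'ed dma-buf.
 * Assumption: `map` points at a mapping of `fd` that is at least `len`
 * bytes long and writable.
 */
int cpu_fill(int fd, void *map, size_t len, int value)
{
    assert(map != NULL);

    /* 1. SYNC_START: tell the exporter the CPU is about to write. */
    if (dmabuf_sync(fd, DMA_BUF_SYNC_START | DMA_BUF_SYNC_WRITE) == -1)
        return -1;

    /* 2. CPU access to the mmap area. */
    memset(map, value, len);

    /* 3. SYNC_END: CPU access is done, the device may use the buffer. */
    return dmabuf_sync(fd, DMA_BUF_SYNC_END | DMA_BUF_SYNC_WRITE);
}
```

A caller would mmap() the dma-buf fd once (MAP_SHARED), run cpu_fill() or a
similar bracketed access for each drawing/upload cycle, and munmap() when
the buffer is no longer needed. If the START marker fails, the sketch does
not touch the mapping at all.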