On Mon, Aug 24, 2015 at 04:52:13PM +0100, Daniel Stone wrote:
> Hi Thomas,
>
> On 24 August 2015 at 10:50, Thomas Hellstrom <thomas@xxxxxxxxxxxx> wrote:
> > In any case, ideally I'd want the struct dma_buf_sync to look something like
> >
> > enum dma_buf_sync_flags {
> >         DMA_BUF_SYNC_READ = (1 << 0),
> >         DMA_BUF_SYNC_WRITE = (2 << 0),
> >         DMA_BUF_SYNC_RW = (3 << 0),
> >         DMA_BUF_SYNC_START = (0 << 2),
> >         DMA_BUF_SYNC_END = (1 << 2),
> >
> >         DMA_BUF_SYNC_VALID_FLAGS_MASK = DMA_BUF_SYNC_RW |
> >                                         DMA_BUF_SYNC_END
> > };
> >
> >
> > /* begin/end dma-buf functions used for userspace mmap. */
> > struct dma_buf_sync {
> >         enum dma_buf_sync_flags flags;
> >         __u64 x;
> >         __u64 width;
> >
> >         /* 2D sync parameters */
> >         __u64 stride;
> >         __u64 y;
> >         __u64 height;
> > };
> >
> >
> > So again, why use a 2D sync in the first place when it's probably
> > possible to batch 1D syncs in the driver?
> >
> > It all boils down to the number of syscalls required, and also the
> > fact that it's probably not trivial to batch sync calls in the
> > read-write case. One can, again, argue that there might be 3D cases
> > or other obscure sync cases, but in that case, let's implement them
> > in the future if the overhead becomes unbearable. If the sync
> > mechanism gets established as mandatory, the problem will solve
> > itself. And at least now we cover the most common case, and we don't
> > need to add a new multiplexing mechanism in the sync IOCTL since we
> > already have one in the IOCTL mechanism. We can simply add a new sync
> > IOCTL for new obscure sync cases if desired.
>
> I still don't think this ameliorates the need for batching: consider
> the case where you update two disjoint screen regions and want them
> both flushed. Either you issue two separate sync calls (which can be
> disadvantageous performance-wise on some hardware / setups), or you
> accumulate the regions and only flush later. So either two ioctls (one
> in the style of dirtyfb and one to perform the sync/flush; you can
> shortcut to assume the full buffer was damaged if the former is
> missing), or one like this:
>
> struct dma_buf_sync_2d {
>         enum dma_buf_sync_flags flags;
>
>         __u64 stride_bytes;
>         __u32 bytes_per_pixel;
>         __u32 num_regions;
>
>         struct {
>                 __u64 x;
>                 __u64 y;
>                 __u64 width;
>                 __u64 height;
>         } regions[];
> };

I really feel like any kind of multi-range flush interface is feature
bloat, and if we do it then we should only do it later on, when there's a
clear need for it.

Afaiui dma-buf mmap will mostly be used for up/downloads, which means
there will be some explicit copy involved somewhere anyway. So it is
similar to userptr usage, and for those you most often just slurp in a
linear block and then let the blitter rearrange it, at least on i915.

Also, thus far the consensus was that dma-buf doesn't care/know about
content layout; adding strides/bytes/whatever bends this quite a bit.

And finally, if we concentrate on the simple case of one byte range, we
can focus the discussion on correctness, which imo is the important bit
here really.
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
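
To make the single-byte-range case argued for above concrete, a minimal
userspace sketch might look like the following. The offset/length fields
and the DMA_BUF_IOCTL_SYNC ioctl number are illustrative assumptions built
on the flags proposed earlier in the thread, not an established uapi.

        /*
         * Illustrative sketch only: a hypothetical single-byte-range sync
         * ioctl built from the flags proposed above.  The offset/length
         * fields and the ioctl number are assumptions for the sake of the
         * example, not an established uapi.
         */
        #include <linux/types.h>
        #include <string.h>
        #include <sys/ioctl.h>

        #define DMA_BUF_SYNC_WRITE      (2 << 0)
        #define DMA_BUF_SYNC_START      (0 << 2)
        #define DMA_BUF_SYNC_END        (1 << 2)

        struct dma_buf_sync {
                __u64 flags;    /* SYNC_{READ,WRITE,RW} | SYNC_{START,END} */
                __u64 offset;   /* start of the CPU-touched byte range */
                __u64 length;   /* length of that range in bytes */
        };

        /* Hypothetical ioctl number, for illustration only. */
        #define DMA_BUF_IOCTL_SYNC      _IOW('b', 0, struct dma_buf_sync)

        /* Upload len bytes at offset into an mmap'ed dma-buf. */
        static int dma_buf_upload(int dmabuf_fd, void *map, const void *src,
                                  __u64 offset, __u64 len)
        {
                struct dma_buf_sync sync = {
                        .flags  = DMA_BUF_SYNC_START | DMA_BUF_SYNC_WRITE,
                        .offset = offset,
                        .length = len,
                };

                /* Bracket the CPU access: begin ... */
                if (ioctl(dmabuf_fd, DMA_BUF_IOCTL_SYNC, &sync))
                        return -1;

                memcpy((char *)map + offset, src, len);

                /* ... and end, so the exporter can flush/invalidate caches. */
                sync.flags = DMA_BUF_SYNC_END | DMA_BUF_SYNC_WRITE;
                return ioctl(dmabuf_fd, DMA_BUF_IOCTL_SYNC, &sync);
        }

Keeping the range a plain byte range, as in this sketch, also keeps dma-buf
ignorant of content layout; any blitter rearrangement into a tiled or
strided layout would happen after the end-sync, outside this sketch.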