On Wed, May 04, 2022 at 03:34:04PM -0500, Jason Ekstrand wrote: > This patch is analogous to the previous sync file export patch in that > it allows you to import a sync_file into a dma-buf. Unlike the previous > patch, however, this does add genuinely new functionality to dma-buf. > Without this, the only way to attach a sync_file to a dma-buf is to > submit a batch to your driver of choice which waits on the sync_file and > claims to write to the dma-buf. Even if said batch is a no-op, a submit > is typically way more overhead than just attaching a fence. A submit > may also imply extra synchronization with other work because it happens > on a hardware queue. > > In the Vulkan world, this is useful for dealing with the out-fence from > vkQueuePresent. Current Linux window-systems (X11, Wayland, etc.) all > rely on dma-buf implicit sync. Since Vulkan is an explicit sync API, we > get a set of fences (VkSemaphores) in vkQueuePresent and have to stash > those as an exclusive (write) fence on the dma-buf. We handle it in > Mesa today with the above mentioned dummy submit trick. This ioctl > would allow us to set it directly without the dummy submit. > > This may also open up possibilities for GPU drivers to move away from > implicit sync for their kernel driver uAPI and instead provide sync > files and rely on dma-buf import/export for communicating with other > implicit sync clients. > > We make the explicit choice here to only allow setting RW fences which > translates to an exclusive fence on the dma_resv. There's no use for > read-only fences for communicating with other implicit sync userspace > and any such attempts are likely to be racy at best. When we got to > insert the RW fence, the actual fence we set as the new exclusive fence > is a combination of the sync_file provided by the user and all the other > fences on the dma_resv. This ensures that the newly added exclusive > fence will never signal before the old one would have and ensures that > we don't break any dma_resv contracts. We require userspace to specify > RW in the flags for symmetry with the export ioctl and in case we ever > want to support read fences in the future. > > There is one downside here that's worth documenting: If two clients > writing to the same dma-buf using this API race with each other, their > actions on the dma-buf may happen in parallel or in an undefined order. > Both with and without this API, the pattern is the same: Collect all > the fences on dma-buf, submit work which depends on said fences, and > then set a new exclusive (write) fence on the dma-buf which depends on > said work. The difference is that, when it's all handled by the GPU > driver's submit ioctl, the three operations happen atomically under the > dma_resv lock. If two userspace submits race, one will happen before > the other. You aren't guaranteed which but you are guaranteed that > they're strictly ordered. If userspace manages the fences itself, then > these three operations happen separately and the two render operations > may happen genuinely in parallel or get interleaved. However, this is a > case of userspace racing with itself. As long as we ensure userspace > can't back the kernel into a corner, it should be fine. > > v2 (Jason Ekstrand): > - Use a wrapper dma_fence_array of all fences including the new one > when importing an exclusive fence. > > v3 (Jason Ekstrand): > - Lock around setting shared fences as well as exclusive > - Mark SIGNAL_SYNC_FILE as a read-write ioctl. > - Initialize ret to 0 in dma_buf_wait_sync_file > > v4 (Jason Ekstrand): > - Use the new dma_resv_get_singleton helper > > v5 (Jason Ekstrand): > - Rename the IOCTLs to import/export rather than wait/signal > - Drop the WRITE flag and always get/set the exclusive fence > > v6 (Jason Ekstrand): > - Split import and export into separate patches > - New commit message > > v7 (Daniel Vetter): > - Fix the uapi header to use the right struct in the ioctl > - Use a separate dma_buf_import_sync_file struct > - Add kerneldoc for dma_buf_import_sync_file > > v8 (Jason Ekstrand): > - Rebase on Christian König's fence rework > > Signed-off-by: Jason Ekstrand <jason@xxxxxxxxxxxxxx> > Cc: Christian König <christian.koenig@xxxxxxx> > Cc: Daniel Vetter <daniel.vetter@xxxxxxxx> > Cc: Sumit Semwal <sumit.semwal@xxxxxxxxxx> > Cc: Maarten Lankhorst <maarten.lankhorst@xxxxxxxxxxxxxxx> > --- > drivers/dma-buf/dma-buf.c | 36 ++++++++++++++++++++++++++++++++++++ > include/uapi/linux/dma-buf.h | 22 ++++++++++++++++++++++ > 2 files changed, 58 insertions(+) > > diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c > index 529e0611e53b..68aac6f694f9 100644 > --- a/drivers/dma-buf/dma-buf.c > +++ b/drivers/dma-buf/dma-buf.c > @@ -383,6 +383,40 @@ static long dma_buf_export_sync_file(struct dma_buf *dmabuf, > put_unused_fd(fd); > return ret; > } > + > +static long dma_buf_import_sync_file(struct dma_buf *dmabuf, > + const void __user *user_data) > +{ > + struct dma_buf_import_sync_file arg; > + struct dma_fence *fence; > + enum dma_resv_usage usage; > + int ret = 0; > + > + if (copy_from_user(&arg, user_data, sizeof(arg))) > + return -EFAULT; > + > + if (arg.flags != DMA_BUF_SYNC_RW) I think the flag validation here looks wrong? I think needs needs the exact same 3 checks as the export ioctl. > + return -EINVAL; > + > + fence = sync_file_get_fence(arg.fd); > + if (!fence) > + return -EINVAL; > + > + usage = (arg.flags & DMA_BUF_SYNC_WRITE) ? DMA_RESV_USAGE_WRITE : > + DMA_RESV_USAGE_READ; > + > + dma_resv_lock(dmabuf->resv, NULL); > + > + ret = dma_resv_reserve_fences(dmabuf->resv, 1); > + if (!ret) > + dma_resv_add_fence(dmabuf->resv, fence, usage); > + > + dma_resv_unlock(dmabuf->resv); > + > + dma_fence_put(fence); > + > + return ret; > +} > #endif > > static long dma_buf_ioctl(struct file *file, > @@ -431,6 +465,8 @@ static long dma_buf_ioctl(struct file *file, > #if IS_ENABLED(CONFIG_SYNC_FILE) > case DMA_BUF_IOCTL_EXPORT_SYNC_FILE: > return dma_buf_export_sync_file(dmabuf, (void __user *)arg); > + case DMA_BUF_IOCTL_IMPORT_SYNC_FILE: > + return dma_buf_import_sync_file(dmabuf, (const void __user *)arg); > #endif > > default: > diff --git a/include/uapi/linux/dma-buf.h b/include/uapi/linux/dma-buf.h > index 46f1e3e98b02..913119bf2201 100644 > --- a/include/uapi/linux/dma-buf.h > +++ b/include/uapi/linux/dma-buf.h > @@ -119,6 +119,27 @@ struct dma_buf_export_sync_file { > __s32 fd; > }; > > +/** > + * struct dma_buf_import_sync_file - Insert a sync_file into a dma-buf > + * > + * Userspace can perform a DMA_BUF_IOCTL_IMPORT_SYNC_FILE to insert a > + * sync_file into a dma-buf for the purposes of implicit synchronization > + * with other dma-buf consumers. This allows clients using explicitly > + * synchronized APIs such as Vulkan to inter-op with dma-buf consumers > + * which expect implicit synchronization such as OpenGL or most media > + * drivers/video. > + */ > +struct dma_buf_import_sync_file { > + /** > + * @flags: Read/write flags > + * > + * Must be DMA_BUF_SYNC_RW. The checks are wrong, but the intent of your implementation looks a lot more like you allow both SYNC_WRITE and SYNC_READ, and I think that makes a lot of sense. Especially since we can now true sync-less access for vk with DMA_RESV_USAGE_BOOKKEEPING, so allowing userspace to explicit set read will be needed. Or does vk only allow you to set write fences anyway? That would suck for the vk app + gl compositor case a bit, so I hope not. > + */ > + __u32 flags; > + /** @fd: Sync file descriptor */ > + __s32 fd; > +}; > + > #define DMA_BUF_BASE 'b' > #define DMA_BUF_IOCTL_SYNC _IOW(DMA_BUF_BASE, 0, struct dma_buf_sync) > > @@ -129,5 +150,6 @@ struct dma_buf_export_sync_file { > #define DMA_BUF_SET_NAME_A _IOW(DMA_BUF_BASE, 1, u32) > #define DMA_BUF_SET_NAME_B _IOW(DMA_BUF_BASE, 1, u64) > #define DMA_BUF_IOCTL_EXPORT_SYNC_FILE _IOWR(DMA_BUF_BASE, 2, struct dma_buf_export_sync_file) > +#define DMA_BUF_IOCTL_IMPORT_SYNC_FILE _IOW(DMA_BUF_BASE, 3, struct dma_buf_import_sync_file) With the flag nits sorted out: Reviewed-by: Daniel Vetter <daniel.vetter@xxxxxxxx> > > #endif > -- > 2.36.0 > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch