Hi On Tue, Dec 22, 2015 at 10:36 PM, Tiago Vignatti <tiago.vignatti@xxxxxxxxx> wrote: > From: Daniel Vetter <daniel.vetter@xxxxxxxx> > > The userspace might need some sort of cache coherency management e.g. when CPU > and GPU domains are being accessed through dma-buf at the same time. To > circumvent this problem there are begin/end coherency markers, that forward > directly to existing dma-buf device drivers vfunc hooks. Userspace can make use > of those markers through the DMA_BUF_IOCTL_SYNC ioctl. The sequence would be > used like following: > - mmap dma-buf fd > - for each drawing/upload cycle in CPU 1. SYNC_START ioctl, 2. read/write > to mmap area 3. SYNC_END ioctl. This can be repeated as often as you > want (with the new data being consumed by the GPU or say scanout device) > - munmap once you don't need the buffer any more > > v2 (Tiago): Fix header file type names (u64 -> __u64) > v3 (Tiago): Add documentation. Use enum dma_buf_sync_flags to the begin/end > dma-buf functions. Check for overflows in start/length. > v4 (Tiago): use 2d regions for sync. > v5 (Tiago): forget about 2d regions (v4); use _IOW in DMA_BUF_IOCTL_SYNC and > remove range information from struct dma_buf_sync. > v6 (Tiago): use __u64 structured padded flags instead enum. Adjust > documentation about the recommendation on using sync ioctls. > v7 (Tiago): Alex' nit on flags definition and being even more wording in the > doc about sync usage. > > Cc: Sumit Semwal <sumit.semwal@xxxxxxxxxx> > Signed-off-by: Daniel Vetter <daniel.vetter@xxxxxxxxx> > Signed-off-by: Tiago Vignatti <tiago.vignatti@xxxxxxxxx> > --- > Documentation/dma-buf-sharing.txt | 21 ++++++++++++++++++- > drivers/dma-buf/dma-buf.c | 43 +++++++++++++++++++++++++++++++++++++++ > include/uapi/linux/dma-buf.h | 38 ++++++++++++++++++++++++++++++++++ > 3 files changed, 101 insertions(+), 1 deletion(-) > create mode 100644 include/uapi/linux/dma-buf.h > > diff --git a/Documentation/dma-buf-sharing.txt b/Documentation/dma-buf-sharing.txt > index 4f4a84b..32ac32e 100644 > --- a/Documentation/dma-buf-sharing.txt > +++ b/Documentation/dma-buf-sharing.txt > @@ -350,7 +350,26 @@ Being able to mmap an export dma-buf buffer object has 2 main use-cases: > handles, too). So it's beneficial to support this in a similar fashion on > dma-buf to have a good transition path for existing Android userspace. > > - No special interfaces, userspace simply calls mmap on the dma-buf fd. > + No special interfaces, userspace simply calls mmap on the dma-buf fd, making > + sure that the cache synchronization ioctl (DMA_BUF_IOCTL_SYNC) is *always* > + used when the access happens. This is discussed next paragraphs. > + > + Some systems might need some sort of cache coherency management e.g. when > + CPU and GPU domains are being accessed through dma-buf at the same time. To > + circumvent this problem there are begin/end coherency markers, that forward > + directly to existing dma-buf device drivers vfunc hooks. Userspace can make > + use of those markers through the DMA_BUF_IOCTL_SYNC ioctl. The sequence > + would be used like following: > + - mmap dma-buf fd > + - for each drawing/upload cycle in CPU 1. SYNC_START ioctl, 2. read/write > + to mmap area 3. SYNC_END ioctl. This can be repeated as often as you > + want (with the new data being consumed by the GPU or say scanout device) > + - munmap once you don't need the buffer any more > + > + Therefore, for correctness and optimal performance, systems with the memory > + cache shared by the GPU and CPU i.e. the "coherent" and also the > + "incoherent" are always required to use SYNC_START and SYNC_END before and > + after, respectively, when accessing the mapped address. > > 2. Supporting existing mmap interfaces in importers > > diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c > index b2ac13b..9a298bd 100644 > --- a/drivers/dma-buf/dma-buf.c > +++ b/drivers/dma-buf/dma-buf.c > @@ -34,6 +34,8 @@ > #include <linux/poll.h> > #include <linux/reservation.h> > > +#include <uapi/linux/dma-buf.h> > + > static inline int is_dma_buf_file(struct file *); > > struct dma_buf_list { > @@ -251,11 +253,52 @@ out: > return events; > } > > +static long dma_buf_ioctl(struct file *file, > + unsigned int cmd, unsigned long arg) > +{ > + struct dma_buf *dmabuf; > + struct dma_buf_sync sync; > + enum dma_data_direction direction; > + > + dmabuf = file->private_data; > + > + if (!is_dma_buf_file(file)) > + return -EINVAL; Why? This can never happen, and you better not use dma_buf_ioctl() outside of dma_buf_fops.. I guess it's simply copied from the other fop callbacks, but I don't see why. dma_buf_poll() doesn't do it, neither should this, or one of the other 3 callbacks. > + > + switch (cmd) { > + case DMA_BUF_IOCTL_SYNC: > + if (copy_from_user(&sync, (void __user *) arg, sizeof(sync))) > + return -EFAULT; > + > + if (sync.flags & DMA_BUF_SYNC_RW) > + direction = DMA_BIDIRECTIONAL; > + else if (sync.flags & DMA_BUF_SYNC_READ) > + direction = DMA_FROM_DEVICE; > + else if (sync.flags & DMA_BUF_SYNC_WRITE) > + direction = DMA_TO_DEVICE; > + else > + return -EINVAL; This looks bogus. It always ends up being "DMA_BIDIRECTIONAL" or EINVAL. I recommend changing it to: switch (sync.flags & DMA_BUF_SYNC_RW) { case DMA_BUF_SYNC_READ: direction = DMA_FROM_DEVICE; break; case DMA_BUF_SYNC_WRITE: direction = DMA_TO_DEVICE; break; case DMA_BUF_SYNC_READ: direction = DMA_BIDIRECTIONAL; break; default: return -EINVAL; } > + > + if (sync.flags & ~DMA_BUF_SYNC_VALID_FLAGS_MASK) > + return -EINVAL; Why isn't this done immediately after copy_from_user()? > + > + if (sync.flags & DMA_BUF_SYNC_END) > + dma_buf_end_cpu_access(dmabuf, direction); > + else > + dma_buf_begin_cpu_access(dmabuf, direction); Why are SYNC_START and SYNC_END exclusive? It feels very natural to me to invoke both at the same time (especially if two objects are stored in the same dma-buf). > + > + return 0; > + default: > + return -ENOTTY; > + } > +} > + > static const struct file_operations dma_buf_fops = { > .release = dma_buf_release, > .mmap = dma_buf_mmap_internal, > .llseek = dma_buf_llseek, > .poll = dma_buf_poll, > + .unlocked_ioctl = dma_buf_ioctl, > }; > > /* > diff --git a/include/uapi/linux/dma-buf.h b/include/uapi/linux/dma-buf.h > new file mode 100644 > index 0000000..4a9b36b > --- /dev/null > +++ b/include/uapi/linux/dma-buf.h > @@ -0,0 +1,38 @@ > +/* > + * Framework for buffer objects that can be shared across devices/subsystems. > + * > + * Copyright(C) 2015 Intel Ltd > + * > + * This program is free software; you can redistribute it and/or modify it > + * under the terms of the GNU General Public License version 2 as published by > + * the Free Software Foundation. > + * > + * This program is distributed in the hope that it will be useful, but WITHOUT > + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or > + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for > + * more details. > + * > + * You should have received a copy of the GNU General Public License along with > + * this program. If not, see <http://www.gnu.org/licenses/>. > + */ > + > +#ifndef _DMA_BUF_UAPI_H_ > +#define _DMA_BUF_UAPI_H_ > + > +/* begin/end dma-buf functions used for userspace mmap. */ > +struct dma_buf_sync { > + __u64 flags; > +}; Please add '#include <linux/types.h>', otherwise this header cannot be compiled on its own (missing __u64). > + > +#define DMA_BUF_SYNC_READ (1 << 0) > +#define DMA_BUF_SYNC_WRITE (2 << 0) > +#define DMA_BUF_SYNC_RW (DMA_BUF_SYNC_READ | DMA_BUF_SYNC_WRITE) > +#define DMA_BUF_SYNC_START (0 << 2) > +#define DMA_BUF_SYNC_END (1 << 2) > +#define DMA_BUF_SYNC_VALID_FLAGS_MASK \ > + (DMA_BUF_SYNC_RW | DMA_BUF_SYNC_END) > + > +#define DMA_BUF_BASE 'b' > +#define DMA_BUF_IOCTL_SYNC _IOW(DMA_BUF_BASE, 0, struct dma_buf_sync) Why _IOW? A read-only ioctl should be able to call DMA_BUF_SYNC_READ, right? Thanks David > + > +#endif _______________________________________________ dri-devel mailing list dri-devel@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/dri-devel