On Wed, Aug 19, 2020 at 1:51 PM Robin Murphy <robin.murphy@xxxxxxx> wrote: > > Hi Tomasz, > > On 2020-08-19 12:16, Tomasz Figa wrote: > > Hi Christoph, > > > > On Wed, Aug 19, 2020 at 8:56 AM Christoph Hellwig <hch@xxxxxx> wrote: > >> > >> The V4L2-FLAG-MEMORY-NON-CONSISTENT flag is entirely unused, > > > > Could you explain what makes you think it's unused? It's a feature of > > the UAPI generally supported by the videobuf2 framework and relied on > > by Chromium OS to get any kind of reasonable performance when > > accessing V4L2 buffers in the userspace. > > > >> and causes > >> weird gymanstics with the DMA_ATTR_NON_CONSISTENT flag, which is > >> unimplemented except on PARISC and some MIPS configs, and about to be > >> removed. > > > > It is implemented by the generic DMA mapping layer [1], which is used > > by a number of architectures including ARM64 and supposed to be used > > by new architectures going forward. > > AFAICS all that V4L2_FLAG_MEMORY_NON_CONSISTENT does is end up > controling whether DMA_ATTR_NON_CONSISTENT is added to vb2_queue::dma_attrs. > > Please can you point to where DMA_ATTR_NON_CONSISTENT does anything at > all on arm64? > With the default config it doesn't, but with CONFIG_DMA_NONCOHERENT_CACHE_SYNC enabled it makes dma_pgprot() keep the pgprot value as is, without enforcing coherence attributes. > Also, I posit that videobuf2 is not actually relying on > DMA_ATTR_NON_CONSISTENT anyway, since it's clearly not using it properly: > > "By using this API, you are guaranteeing to the platform > that you have all the correct and necessary sync points for this memory > in the driver should it choose to return non-consistent memory." > > $ git grep dma_cache_sync drivers/media > $ AFAIK dma_cache_sync() isn't the only way to perform the cache synchronization. The earlier patch series that I reviewed relied on dma_get_sgtable() and then dma_sync_sg_*() (which existed in the vb2-dc since forever [1]). However, it looks like with the final code the sgtable isn't acquired and the synchronization isn't happening, so you have a point. FWIW, I asked back in time what the plan is for non-coherent allocations and it seemed like DMA_ATTR_NON_CONSISTENT and dma_sync_*() was supposed to be the right thing to go with. [2] The same thread also explains why dma_alloc_pages() isn't suitable for the users of dma_alloc_attrs() and DMA_ATTR_NON_CONSISTENT. I think we could make a deal here. We could revert back the parts using DMA_ATTR_NON_CONSISTENT, keeping the UAPI intact, but just rendering it no-op, since it's just a hint after all. Then, you would propose a proper, functionally equivalent and working for ARM64, replacement for dma_alloc_attrs(..., DMA_ATTR_NON_CONSISTENT), which we could then use to enable the functionality expected by this UAPI. Does it sound like something that could work as a way forward here? By the way, as a videobuf2 reviewer, I'd appreciate being CC'd on any series related to the subsystem-facing DMA API changes, since videobuf2 is one of the biggest users of it. [1] https://elixir.bootlin.com/linux/v5.9-rc1/source/drivers/media/common/videobuf2/videobuf2-dma-contig.c#L98 [2] https://patchwork.kernel.org/comment/23312203/ Best regards, Tomasz > > Robin. > > > [1] https://elixir.bootlin.com/linux/v5.9-rc1/source/kernel/dma/mapping.c#L341 > > > > When removing features from generic kernel code, I'd suggest first > > providing viable alternatives for its users, rather than killing the > > users altogether. > > > > Given the above, I'm afraid I have to NAK this. > > > > Best regards, > > Tomasz > > > >> > >> Signed-off-by: Christoph Hellwig <hch@xxxxxx> > >> --- > >> .../userspace-api/media/v4l/buffer.rst | 17 --------- > >> .../media/v4l/vidioc-reqbufs.rst | 1 - > >> .../media/common/videobuf2/videobuf2-core.c | 36 +------------------ > >> .../common/videobuf2/videobuf2-dma-contig.c | 19 ---------- > >> .../media/common/videobuf2/videobuf2-dma-sg.c | 3 +- > >> .../media/common/videobuf2/videobuf2-v4l2.c | 12 ------- > >> include/media/videobuf2-core.h | 3 +- > >> include/uapi/linux/videodev2.h | 2 -- > >> 8 files changed, 3 insertions(+), 90 deletions(-) > >> > >> diff --git a/Documentation/userspace-api/media/v4l/buffer.rst b/Documentation/userspace-api/media/v4l/buffer.rst > >> index 57e752aaf414a7..2044ed13cd9d7d 100644 > >> --- a/Documentation/userspace-api/media/v4l/buffer.rst > >> +++ b/Documentation/userspace-api/media/v4l/buffer.rst > >> @@ -701,23 +701,6 @@ Memory Consistency Flags > >> :stub-columns: 0 > >> :widths: 3 1 4 > >> > >> - * .. _`V4L2-FLAG-MEMORY-NON-CONSISTENT`: > >> - > >> - - ``V4L2_FLAG_MEMORY_NON_CONSISTENT`` > >> - - 0x00000001 > >> - - A buffer is allocated either in consistent (it will be automatically > >> - coherent between the CPU and the bus) or non-consistent memory. The > >> - latter can provide performance gains, for instance the CPU cache > >> - sync/flush operations can be avoided if the buffer is accessed by the > >> - corresponding device only and the CPU does not read/write to/from that > >> - buffer. However, this requires extra care from the driver -- it must > >> - guarantee memory consistency by issuing a cache flush/sync when > >> - consistency is needed. If this flag is set V4L2 will attempt to > >> - allocate the buffer in non-consistent memory. The flag takes effect > >> - only if the buffer is used for :ref:`memory mapping <mmap>` I/O and the > >> - queue reports the :ref:`V4L2_BUF_CAP_SUPPORTS_MMAP_CACHE_HINTS > >> - <V4L2-BUF-CAP-SUPPORTS-MMAP-CACHE-HINTS>` capability. > >> - > >> .. c:type:: v4l2_memory > >> > >> enum v4l2_memory > >> diff --git a/Documentation/userspace-api/media/v4l/vidioc-reqbufs.rst b/Documentation/userspace-api/media/v4l/vidioc-reqbufs.rst > >> index 75d894d9c36c42..3180c111d368ee 100644 > >> --- a/Documentation/userspace-api/media/v4l/vidioc-reqbufs.rst > >> +++ b/Documentation/userspace-api/media/v4l/vidioc-reqbufs.rst > >> @@ -169,7 +169,6 @@ aborting or finishing any DMA in progress, an implicit > >> - This capability is set by the driver to indicate that the queue supports > >> cache and memory management hints. However, it's only valid when the > >> queue is used for :ref:`memory mapping <mmap>` streaming I/O. See > >> - :ref:`V4L2_FLAG_MEMORY_NON_CONSISTENT <V4L2-FLAG-MEMORY-NON-CONSISTENT>`, > >> :ref:`V4L2_BUF_FLAG_NO_CACHE_INVALIDATE <V4L2-BUF-FLAG-NO-CACHE-INVALIDATE>` and > >> :ref:`V4L2_BUF_FLAG_NO_CACHE_CLEAN <V4L2-BUF-FLAG-NO-CACHE-CLEAN>`. > >> > >> diff --git a/drivers/media/common/videobuf2/videobuf2-core.c b/drivers/media/common/videobuf2/videobuf2-core.c > >> index f544d3393e9d6b..66a41cef33c1b1 100644 > >> --- a/drivers/media/common/videobuf2/videobuf2-core.c > >> +++ b/drivers/media/common/videobuf2/videobuf2-core.c > >> @@ -721,39 +721,14 @@ int vb2_verify_memory_type(struct vb2_queue *q, > >> } > >> EXPORT_SYMBOL(vb2_verify_memory_type); > >> > >> -static void set_queue_consistency(struct vb2_queue *q, bool consistent_mem) > >> -{ > >> - q->dma_attrs &= ~DMA_ATTR_NON_CONSISTENT; > >> - > >> - if (!vb2_queue_allows_cache_hints(q)) > >> - return; > >> - if (!consistent_mem) > >> - q->dma_attrs |= DMA_ATTR_NON_CONSISTENT; > >> -} > >> - > >> -static bool verify_consistency_attr(struct vb2_queue *q, bool consistent_mem) > >> -{ > >> - bool queue_is_consistent = !(q->dma_attrs & DMA_ATTR_NON_CONSISTENT); > >> - > >> - if (consistent_mem != queue_is_consistent) { > >> - dprintk(q, 1, "memory consistency model mismatch\n"); > >> - return false; > >> - } > >> - return true; > >> -} > >> - > >> int vb2_core_reqbufs(struct vb2_queue *q, enum vb2_memory memory, > >> unsigned int flags, unsigned int *count) > >> { > >> unsigned int num_buffers, allocated_buffers, num_planes = 0; > >> unsigned plane_sizes[VB2_MAX_PLANES] = { }; > >> - bool consistent_mem = true; > >> unsigned int i; > >> int ret; > >> > >> - if (flags & V4L2_FLAG_MEMORY_NON_CONSISTENT) > >> - consistent_mem = false; > >> - > >> if (q->streaming) { > >> dprintk(q, 1, "streaming active\n"); > >> return -EBUSY; > >> @@ -765,8 +740,7 @@ int vb2_core_reqbufs(struct vb2_queue *q, enum vb2_memory memory, > >> } > >> > >> if (*count == 0 || q->num_buffers != 0 || > >> - (q->memory != VB2_MEMORY_UNKNOWN && q->memory != memory) || > >> - !verify_consistency_attr(q, consistent_mem)) { > >> + (q->memory != VB2_MEMORY_UNKNOWN && q->memory != memory)) { > >> /* > >> * We already have buffers allocated, so first check if they > >> * are not in use and can be freed. > >> @@ -803,7 +777,6 @@ int vb2_core_reqbufs(struct vb2_queue *q, enum vb2_memory memory, > >> num_buffers = min_t(unsigned int, num_buffers, VB2_MAX_FRAME); > >> memset(q->alloc_devs, 0, sizeof(q->alloc_devs)); > >> q->memory = memory; > >> - set_queue_consistency(q, consistent_mem); > >> > >> /* > >> * Ask the driver how many buffers and planes per buffer it requires. > >> @@ -894,12 +867,8 @@ int vb2_core_create_bufs(struct vb2_queue *q, enum vb2_memory memory, > >> { > >> unsigned int num_planes = 0, num_buffers, allocated_buffers; > >> unsigned plane_sizes[VB2_MAX_PLANES] = { }; > >> - bool consistent_mem = true; > >> int ret; > >> > >> - if (flags & V4L2_FLAG_MEMORY_NON_CONSISTENT) > >> - consistent_mem = false; > >> - > >> if (q->num_buffers == VB2_MAX_FRAME) { > >> dprintk(q, 1, "maximum number of buffers already allocated\n"); > >> return -ENOBUFS; > >> @@ -912,15 +881,12 @@ int vb2_core_create_bufs(struct vb2_queue *q, enum vb2_memory memory, > >> } > >> memset(q->alloc_devs, 0, sizeof(q->alloc_devs)); > >> q->memory = memory; > >> - set_queue_consistency(q, consistent_mem); > >> q->waiting_for_buffers = !q->is_output; > >> } else { > >> if (q->memory != memory) { > >> dprintk(q, 1, "memory model mismatch\n"); > >> return -EINVAL; > >> } > >> - if (!verify_consistency_attr(q, consistent_mem)) > >> - return -EINVAL; > >> } > >> > >> num_buffers = min(*count, VB2_MAX_FRAME - q->num_buffers); > >> diff --git a/drivers/media/common/videobuf2/videobuf2-dma-contig.c b/drivers/media/common/videobuf2/videobuf2-dma-contig.c > >> index ec3446cc45b8da..7b1b86ec942d7d 100644 > >> --- a/drivers/media/common/videobuf2/videobuf2-dma-contig.c > >> +++ b/drivers/media/common/videobuf2/videobuf2-dma-contig.c > >> @@ -42,11 +42,6 @@ struct vb2_dc_buf { > >> struct dma_buf_attachment *db_attach; > >> }; > >> > >> -static inline bool vb2_dc_buffer_consistent(unsigned long attr) > >> -{ > >> - return !(attr & DMA_ATTR_NON_CONSISTENT); > >> -} > >> - > >> /*********************************************/ > >> /* scatterlist table functions */ > >> /*********************************************/ > >> @@ -341,13 +336,6 @@ static int > >> vb2_dc_dmabuf_ops_begin_cpu_access(struct dma_buf *dbuf, > >> enum dma_data_direction direction) > >> { > >> - struct vb2_dc_buf *buf = dbuf->priv; > >> - struct sg_table *sgt = buf->dma_sgt; > >> - > >> - if (vb2_dc_buffer_consistent(buf->attrs)) > >> - return 0; > >> - > >> - dma_sync_sg_for_cpu(buf->dev, sgt->sgl, sgt->nents, buf->dma_dir); > >> return 0; > >> } > >> > >> @@ -355,13 +343,6 @@ static int > >> vb2_dc_dmabuf_ops_end_cpu_access(struct dma_buf *dbuf, > >> enum dma_data_direction direction) > >> { > >> - struct vb2_dc_buf *buf = dbuf->priv; > >> - struct sg_table *sgt = buf->dma_sgt; > >> - > >> - if (vb2_dc_buffer_consistent(buf->attrs)) > >> - return 0; > >> - > >> - dma_sync_sg_for_device(buf->dev, sgt->sgl, sgt->nents, buf->dma_dir); > >> return 0; > >> } > >> > >> diff --git a/drivers/media/common/videobuf2/videobuf2-dma-sg.c b/drivers/media/common/videobuf2/videobuf2-dma-sg.c > >> index 0a40e00f0d7e5c..a86fce5d8ea8bf 100644 > >> --- a/drivers/media/common/videobuf2/videobuf2-dma-sg.c > >> +++ b/drivers/media/common/videobuf2/videobuf2-dma-sg.c > >> @@ -123,8 +123,7 @@ static void *vb2_dma_sg_alloc(struct device *dev, unsigned long dma_attrs, > >> /* > >> * NOTE: dma-sg allocates memory using the page allocator directly, so > >> * there is no memory consistency guarantee, hence dma-sg ignores DMA > >> - * attributes passed from the upper layer. That means that > >> - * V4L2_FLAG_MEMORY_NON_CONSISTENT has no effect on dma-sg buffers. > >> + * attributes passed from the upper layer. > >> */ > >> buf->pages = kvmalloc_array(buf->num_pages, sizeof(struct page *), > >> GFP_KERNEL | __GFP_ZERO); > >> diff --git a/drivers/media/common/videobuf2/videobuf2-v4l2.c b/drivers/media/common/videobuf2/videobuf2-v4l2.c > >> index 30caad27281e1a..de83ad48783821 100644 > >> --- a/drivers/media/common/videobuf2/videobuf2-v4l2.c > >> +++ b/drivers/media/common/videobuf2/videobuf2-v4l2.c > >> @@ -722,20 +722,11 @@ static void fill_buf_caps(struct vb2_queue *q, u32 *caps) > >> #endif > >> } > >> > >> -static void clear_consistency_attr(struct vb2_queue *q, > >> - int memory, > >> - unsigned int *flags) > >> -{ > >> - if (!q->allow_cache_hints || memory != V4L2_MEMORY_MMAP) > >> - *flags &= ~V4L2_FLAG_MEMORY_NON_CONSISTENT; > >> -} > >> - > >> int vb2_reqbufs(struct vb2_queue *q, struct v4l2_requestbuffers *req) > >> { > >> int ret = vb2_verify_memory_type(q, req->memory, req->type); > >> > >> fill_buf_caps(q, &req->capabilities); > >> - clear_consistency_attr(q, req->memory, &req->flags); > >> return ret ? ret : vb2_core_reqbufs(q, req->memory, > >> req->flags, &req->count); > >> } > >> @@ -769,7 +760,6 @@ int vb2_create_bufs(struct vb2_queue *q, struct v4l2_create_buffers *create) > >> unsigned i; > >> > >> fill_buf_caps(q, &create->capabilities); > >> - clear_consistency_attr(q, create->memory, &create->flags); > >> create->index = q->num_buffers; > >> if (create->count == 0) > >> return ret != -EBUSY ? ret : 0; > >> @@ -998,7 +988,6 @@ int vb2_ioctl_reqbufs(struct file *file, void *priv, > >> int res = vb2_verify_memory_type(vdev->queue, p->memory, p->type); > >> > >> fill_buf_caps(vdev->queue, &p->capabilities); > >> - clear_consistency_attr(vdev->queue, p->memory, &p->flags); > >> if (res) > >> return res; > >> if (vb2_queue_is_busy(vdev, file)) > >> @@ -1021,7 +1010,6 @@ int vb2_ioctl_create_bufs(struct file *file, void *priv, > >> > >> p->index = vdev->queue->num_buffers; > >> fill_buf_caps(vdev->queue, &p->capabilities); > >> - clear_consistency_attr(vdev->queue, p->memory, &p->flags); > >> /* > >> * If count == 0, then just check if memory and type are valid. > >> * Any -EBUSY result from vb2_verify_memory_type can be mapped to 0. > >> diff --git a/include/media/videobuf2-core.h b/include/media/videobuf2-core.h > >> index 52ef92049073e3..4c7f25b07e9375 100644 > >> --- a/include/media/videobuf2-core.h > >> +++ b/include/media/videobuf2-core.h > >> @@ -744,8 +744,7 @@ void vb2_core_querybuf(struct vb2_queue *q, unsigned int index, void *pb); > >> * vb2_core_reqbufs() - Initiate streaming. > >> * @q: pointer to &struct vb2_queue with videobuf2 queue. > >> * @memory: memory type, as defined by &enum vb2_memory. > >> - * @flags: auxiliary queue/buffer management flags. Currently, the only > >> - * used flag is %V4L2_FLAG_MEMORY_NON_CONSISTENT. > >> + * @flags: auxiliary queue/buffer management flags. > >> * @count: requested buffer count. > >> * > >> * Videobuf2 core helper to implement VIDIOC_REQBUF() operation. It is called > >> diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h > >> index c7b70ff53bc1dd..5c00f63d9c1b58 100644 > >> --- a/include/uapi/linux/videodev2.h > >> +++ b/include/uapi/linux/videodev2.h > >> @@ -191,8 +191,6 @@ enum v4l2_memory { > >> V4L2_MEMORY_DMABUF = 4, > >> }; > >> > >> -#define V4L2_FLAG_MEMORY_NON_CONSISTENT (1 << 0) > >> - > >> /* see also http://vektor.theorem.ca/graphics/ycbcr/ */ > >> enum v4l2_colorspace { > >> /* > >> -- > >> 2.28.0 > >> > >> _______________________________________________ > >> iommu mailing list > >> iommu@xxxxxxxxxxxxxxxxxxxxxxxxxx > >> https://lists.linuxfoundation.org/mailman/listinfo/iommu > > > > _______________________________________________ > > linux-arm-kernel mailing list > > linux-arm-kernel@xxxxxxxxxxxxxxxxxxx > > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel > > _______________________________________________ Nouveau mailing list Nouveau@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/nouveau