On Tue, Mar 29, 2022 at 10:11:14AM +0100, Paul Cercueil wrote: > Hi Daniel, > > Le mar., mars 29 2022 at 10:33:32 +0200, Daniel Vetter <daniel@xxxxxxxx> a > écrit : > > On Tue, Feb 15, 2022 at 05:43:35PM +0000, Paul Cercueil wrote: > > > Hi Jonathan, > > > > > > Le dim., févr. 13 2022 at 18:46:16 +0000, Jonathan Cameron > > > <jic23@xxxxxxxxxx> a écrit : > > > > On Mon, 7 Feb 2022 12:59:21 +0000 > > > > Paul Cercueil <paul@xxxxxxxxxxxxxxx> wrote: > > > > > > > > > Hi Jonathan, > > > > > > > > > > This is the V2 of my patchset that introduces a new userspace > > > > > interface > > > > > based on DMABUF objects to complement the fileio API, and adds > > > > > write() > > > > > support to the existing fileio API. > > > > > > > > Hi Paul, > > > > > > > > It's been a little while. Perhaps you could summarize the various > > > view > > > > points around the appropriateness of using DMABUF for this? > > > > I appreciate it is a tricky topic to distil into a brief summary > > > but > > > > I know I would find it useful even if no one else does! > > > > > > So we want to have a high-speed interface where buffers of samples > > > are > > > passed around between IIO devices and other devices (e.g. USB or > > > network), > > > or made available to userspace without copying the data. > > > > > > DMABUF is, at least in theory, exactly what we need. Quoting the > > > documentation > > > (https://www.kernel.org/doc/html/v5.15/driver-api/dma-buf.html): > > > "The dma-buf subsystem provides the framework for sharing buffers > > > for > > > hardware (DMA) access across multiple device drivers and > > > subsystems, and for > > > synchronizing asynchronous hardware access. This is used, for > > > example, by > > > drm “prime” multi-GPU support, but is of course not limited to GPU > > > use > > > cases." > > > > > > The problem is that right now DMABUF is only really used by DRM, > > > and to > > > quote Daniel, "dma-buf looks like something super generic and > > > useful, until > > > you realize that there's a metric ton of gpu/accelerator bagage > > > piled in". > > > > > > Still, it seems to be the only viable option. We could add a custom > > > buffer-passing interface, but that would mean implementing the same > > > buffer-passing interface on the network and USB stacks, and before > > > we know > > > it we re-invented DMABUFs. > > > > dma-buf also doesn't support sharing with network and usb stacks, so I'm > > a > > bit confused why exactly this is useful? > > There is an attempt to get dma-buf support in the network stack, called > "zctap". Last patchset was sent last november. USB stack does not support > dma-buf, but we can add it later I guess. > > > So yeah unless there's some sharing going on with gpu stuff (for data > > processing maybe) I'm not sure this makes a lot of sense really. Or at > > least some zero-copy sharing between drivers, but even that would > > minimally require a dma-buf import ioctl of some sorts. Which I either > > missed or doesn't exist. > > We do want zero-copy between drivers, the network stack, and the USB stack. > It's not just about having a userspace interface. I think in that case we need these other pieces too. And we need acks from relevant subsystems that these other pieces are a) ready for upstream merging and also that the dma-buf side of things actually makes sense. > > If there's none of that then just hand-roll your buffer handling code > > (xarray is cheap to use in terms of code for this), you can always add > > dma-buf import/export later on when the need arises. > > > > Scrolling through patches you only have dma-buf export, but no > > importing, > > so the use-case that works is with one of the existing subsystems that > > supporting dma-buf importing. > > > > I think minimally we need the use-case (in form of code) that needs the > > buffer sharing here. > > I'll try with zctap and report back. Do you have a link for this? I just checked dri-devel on lore, and it's not there. Nor anywhere else. We really need all the pieces, and if block layer reaction is anything to judge by, dma-buf wont happen for networking either. There's some really nasty and fairly fundamental issues with locking and memory reclaim that make this utter pain or outright impossible. -Daniel > > Cheers, > -Paul > > > > > > > > > > > Changes since v1: > > > > > > > > > > - the patches that were merged in v1 have been (obviously) > > > dropped > > > > > from > > > > > this patchset; > > > > > - the patch that was setting the write-combine cache setting > > > has > > > > > been > > > > > dropped as well, as it was simply not useful. > > > > > - [01/12]: > > > > > * Only remove the outgoing queue, and keep the incoming > > > queue, > > > > > as we > > > > > want the buffer to start streaming data as soon as it is > > > > > enabled. > > > > > * Remove IIO_BLOCK_STATE_DEQUEUED, since it is now > > > functionally > > > > > the > > > > > same as IIO_BLOCK_STATE_DONE. > > > > > - [02/12]: > > > > > * Fix block->state not being reset in > > > > > iio_dma_buffer_request_update() for output buffers. > > > > > * Only update block->bytes_used once and add a comment > > > about > > > > > why we > > > > > update it. > > > > > * Add a comment about why we're setting a different state > > > for > > > > > output > > > > > buffers in iio_dma_buffer_request_update() > > > > > * Remove useless cast to bool (!!) in iio_dma_buffer_io() > > > > > - [05/12]: > > > > > Only allow the new IOCTLs on the buffer FD created with > > > > > IIO_BUFFER_GET_FD_IOCTL(). > > > > > - [12/12]: > > > > > * Explicitly state that the new interface is optional and > > > is > > > > > not implemented by all drivers. > > > > > * The IOCTLs can now only be called on the buffer FD > > > returned by > > > > > IIO_BUFFER_GET_FD_IOCTL. > > > > > * Move the page up a bit in the index since it is core > > > stuff > > > > > and not > > > > > driver-specific. > > > > > > > > > > The patches not listed here have not been modified since v1. > > > > > > > > > > Cheers, > > > > > -Paul > > > > > > > > > > Alexandru Ardelean (1): > > > > > iio: buffer-dma: split iio_dma_buffer_fileio_free() function > > > > > > > > > > Paul Cercueil (11): > > > > > iio: buffer-dma: Get rid of outgoing queue > > > > > iio: buffer-dma: Enable buffer write support > > > > > iio: buffer-dmaengine: Support specifying buffer direction > > > > > iio: buffer-dmaengine: Enable write support > > > > > iio: core: Add new DMABUF interface infrastructure > > > > > iio: buffer-dma: Use DMABUFs instead of custom solution > > > > > iio: buffer-dma: Implement new DMABUF based userspace API > > > > > iio: buffer-dmaengine: Support new DMABUF based userspace API > > > > > iio: core: Add support for cyclic buffers > > > > > iio: buffer-dmaengine: Add support for cyclic buffers > > > > > Documentation: iio: Document high-speed DMABUF based API > > > > > > > > > > Documentation/driver-api/dma-buf.rst | 2 + > > > > > Documentation/iio/dmabuf_api.rst | 94 +++ > > > > > Documentation/iio/index.rst | 2 + > > > > > drivers/iio/adc/adi-axi-adc.c | 3 +- > > > > > drivers/iio/buffer/industrialio-buffer-dma.c | 610 > > > > > ++++++++++++++---- > > > > > .../buffer/industrialio-buffer-dmaengine.c | 42 +- > > > > > drivers/iio/industrialio-buffer.c | 60 ++ > > > > > include/linux/iio/buffer-dma.h | 38 +- > > > > > include/linux/iio/buffer-dmaengine.h | 5 +- > > > > > include/linux/iio/buffer_impl.h | 8 + > > > > > include/uapi/linux/iio/buffer.h | 30 + > > > > > 11 files changed, 749 insertions(+), 145 deletions(-) > > > > > create mode 100644 Documentation/iio/dmabuf_api.rst > > > > > > > > > > > > > > > > > > > -- > > Daniel Vetter > > Software Engineer, Intel Corporation > > http://blog.ffwll.ch > > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch