On Sat, 2024-10-26 at 16:48 +0100, Jonathan Cameron wrote: > On Fri, 25 Oct 2024 20:40:42 +0200 (GMT+02:00) > Nuno Sá <noname.nuno@xxxxxxxxx> wrote: > > > Oct 25, 2024 18:42:02 David Lechner <dlechner@xxxxxxxxxxxx>: > > > > > On 10/25/24 8:24 AM, Nuno Sá wrote: > > > > I still need to look better at this but I do have one though already > > > > :) > > > > > > > > On Wed, 2024-10-23 at 15:59 -0500, David Lechner wrote: > > > > > Add a new devm_iio_dmaengine_buffer_setup_ext2() function to handle > > > > > cases where the DMA channel is managed by the caller rather than > > > > > being > > > > > requested and released by the iio_dmaengine module. > > > > > > > > > > Signed-off-by: David Lechner <dlechner@xxxxxxxxxxxx> > > > > > --- > > > > > > > > > > v4 changes: > > > > > * This replaces "iio: buffer-dmaengine: generalize requesting DMA > > > > > channel" > > > > > --- > > > > > > ... > > > > > > > > @@ -282,12 +281,38 @@ void iio_dmaengine_buffer_free(struct > > > > > iio_buffer *buffer) > > > > > iio_buffer_to_dmaengine_buffer(buffer); > > > > > > > > > > iio_dma_buffer_exit(&dmaengine_buffer->queue); > > > > > - dma_release_channel(dmaengine_buffer->chan); > > > > > - > > > > > iio_buffer_put(buffer); > > > > > + > > > > > + if (dmaengine_buffer->owns_chan) > > > > > + dma_release_channel(dmaengine_buffer->chan); > > > > > > > > Not sure if I agree much with this owns_chan flag. The way I see it, > > > > we should always > > > > handover the lifetime of the DMA channel to the IIO DMA framework. > > > > Note that even the > > > > device you pass in for both requesting the channel of the spi_offload > > > > and for > > > > setting up the DMA buffer is the same (and i suspect it will always > > > > be) so I would > > > > not go with the trouble. And with this assumption we could simplify a > > > > bit more the > > > > spi implementation. > > > > > > I tried something like this in v3 but Jonathan didn't seem to like it. > > > > > > https://lore.kernel.org/all/20240727144303.4a8604cb@jic23-huawei/ > > > > > > > > > > > And not even related but I even suspect the current implementation > > > > could be > > > > problematic. Basically I'm suspecting that the lifetime of the DMA > > > > channel should be > > > > attached to the lifetime of the iio_buffer. IOW, we should only > > > > release the channel > > > > in iio_dmaengine_buffer_release() - in which case the current > > > > implementation with the > > > > spi_offload would also be buggy. > > > > > > The buffer can outlive the iio device driver that created the buffer? > > > > Yes, it can as the IIO device itself. In case a userspace app has an open > > FD for the buffer chardev, we get a reference that is only released when > > the FD is closed (which can outlive the device behind bound to its > > driver). That is why we nullify indio_dev->info and check for it on the > > read() and write() fops. > > > > FWIW, I raised concerns about this in the past (as we don't have any lock > > in those paths) but Jonathan rightfully wanted to see a real race. And I > > was too lazy to try and reproduce one but I'm still fairly sure we have > > theoretical (at least) races in those paths. And one of them could be (I > > think) concurrently hitting a DMA submit block while the device is being > > unbound. In that case the DMA chan would be already released and we could > > still try to initiate a transfer. I did not check if that would crash or > > something but it should still not happen. > > > There are a few places where I've been meaning to have another look > at our protections during unregister. May well be problems hiding here > and in general the thinking on how to do this in the kernel has slowly > been changing so we might be able to clean things up in general. > Yeah, I'm fairly sure things like [1] are not enough in preventing potential nasty races (though they should be hard to trigger). OTOH, in [2], we do have proper locking. Simple solution would be to use the info lock in the buffer read() and write() paths. I do realize that's a fastpath but I don't think that would be such a contended lock. But we can surely do better and RCU could be a good candidate for this (we could do something similar to what gpiolib is doing) and I wouldn't expect it to be that complicated to implement. Biggest issue by making info a __rcu pointer would be to change all IIO drivers to set the pointer with rcu_assign_pointer(). Though during probe there's no potential race so what we have today should be fine (just not sure if things like sparse would not complain about the "raw" assignment). [1]: https://elixir.bootlin.com/linux/v6.12-rc4/source/drivers/iio/industrialio-buffer.c#L176 [2]: https://elixir.bootlin.com/linux/v6.12-rc4/source/drivers/iio/industrialio-core.c#L1825 - Nuno Sá