On Mon, 28 Oct 2024 12:08:49 +0100 Nuno Sá <noname.nuno@xxxxxxxxx> wrote: > On Sat, 2024-10-26 at 16:48 +0100, Jonathan Cameron wrote: > > On Fri, 25 Oct 2024 20:40:42 +0200 (GMT+02:00) > > Nuno Sá <noname.nuno@xxxxxxxxx> wrote: > > > > > Oct 25, 2024 18:42:02 David Lechner <dlechner@xxxxxxxxxxxx>: > > > > > > > On 10/25/24 8:24 AM, Nuno Sá wrote: > > > > > I still need to look better at this but I do have one though already > > > > > :) > > > > > > > > > > On Wed, 2024-10-23 at 15:59 -0500, David Lechner wrote: > > > > > > Add a new devm_iio_dmaengine_buffer_setup_ext2() function to handle > > > > > > cases where the DMA channel is managed by the caller rather than > > > > > > being > > > > > > requested and released by the iio_dmaengine module. > > > > > > > > > > > > Signed-off-by: David Lechner <dlechner@xxxxxxxxxxxx> > > > > > > --- > > > > > > > > > > > > v4 changes: > > > > > > * This replaces "iio: buffer-dmaengine: generalize requesting DMA > > > > > > channel" > > > > > > --- > > > > > > > > ... > > > > > > > > > > @@ -282,12 +281,38 @@ void iio_dmaengine_buffer_free(struct > > > > > > iio_buffer *buffer) > > > > > > iio_buffer_to_dmaengine_buffer(buffer); > > > > > > > > > > > > iio_dma_buffer_exit(&dmaengine_buffer->queue); > > > > > > - dma_release_channel(dmaengine_buffer->chan); > > > > > > - > > > > > > iio_buffer_put(buffer); > > > > > > + > > > > > > + if (dmaengine_buffer->owns_chan) > > > > > > + dma_release_channel(dmaengine_buffer->chan); > > > > > > > > > > Not sure if I agree much with this owns_chan flag. The way I see it, > > > > > we should always > > > > > handover the lifetime of the DMA channel to the IIO DMA framework. > > > > > Note that even the > > > > > device you pass in for both requesting the channel of the spi_offload > > > > > and for > > > > > setting up the DMA buffer is the same (and i suspect it will always > > > > > be) so I would > > > > > not go with the trouble. And with this assumption we could simplify a > > > > > bit more the > > > > > spi implementation. > > > > > > > > I tried something like this in v3 but Jonathan didn't seem to like it. > > > > > > > > https://lore.kernel.org/all/20240727144303.4a8604cb@jic23-huawei/ > > > > > > > > > > > > > > And not even related but I even suspect the current implementation > > > > > could be > > > > > problematic. Basically I'm suspecting that the lifetime of the DMA > > > > > channel should be > > > > > attached to the lifetime of the iio_buffer. IOW, we should only > > > > > release the channel > > > > > in iio_dmaengine_buffer_release() - in which case the current > > > > > implementation with the > > > > > spi_offload would also be buggy. > > > > > > > > The buffer can outlive the iio device driver that created the buffer? > > > > > > Yes, it can as the IIO device itself. In case a userspace app has an open > > > FD for the buffer chardev, we get a reference that is only released when > > > the FD is closed (which can outlive the device behind bound to its > > > driver). That is why we nullify indio_dev->info and check for it on the > > > read() and write() fops. > > > > > > FWIW, I raised concerns about this in the past (as we don't have any lock > > > in those paths) but Jonathan rightfully wanted to see a real race. And I > > > was too lazy to try and reproduce one but I'm still fairly sure we have > > > theoretical (at least) races in those paths. And one of them could be (I > > > think) concurrently hitting a DMA submit block while the device is being > > > unbound. In that case the DMA chan would be already released and we could > > > still try to initiate a transfer. I did not check if that would crash or > > > something but it should still not happen. > > > > > There are a few places where I've been meaning to have another look > > at our protections during unregister. May well be problems hiding here > > and in general the thinking on how to do this in the kernel has slowly > > been changing so we might be able to clean things up in general. > > > > Yeah, I'm fairly sure things like [1] are not enough in preventing potential nasty > races (though they should be hard to trigger). OTOH, in [2], we do have proper > locking. > > Simple solution would be to use the info lock in the buffer read() and write() paths. > I do realize that's a fastpath but I don't think that would be such a contended lock. > But we can surely do better and RCU could be a good candidate for this (we could do > something similar to what gpiolib is doing) and I wouldn't expect it to be that > complicated to implement. Biggest issue by making info a __rcu pointer would be to > change all IIO drivers to set the pointer with rcu_assign_pointer(). Though during > probe there's no potential race so what we have today should be fine (just not sure > if things like sparse would not complain about the "raw" assignment). Not tried it, but could probably do something cheeky in iio_device_register() using a second pointer (maybe in a union with the first ;) So I agree, smells bad but I've not chased far enough to know how real the problems are. In many cases the read only accesses are to data, not to hardware so 'might' be fine. I'm just not sure. I'll try and get time for a close look but won't be until towards the end of November. Jonathan > > [1]: https://elixir.bootlin.com/linux/v6.12-rc4/source/drivers/iio/industrialio-buffer.c#L176 > [2]: https://elixir.bootlin.com/linux/v6.12-rc4/source/drivers/iio/industrialio-core.c#L1825 > > > - Nuno Sá > > > >