On Wed, 30 Oct 2024 16:47:50 +0200
Andy Shevchenko <andriy.shevchenko@xxxxxxxxx> wrote:

> On Mon, Oct 21, 2024 at 02:54:15PM +0200, Matteo Martelli wrote:
> > Consumers need to call the producer's read_avail_release_resource()
> > callback after reading producer's available info. To avoid a race
> > condition with the producer unregistration, change inkern
> > iio_channel_read_avail() so that it copies the available info from the
> > producer and immediately calls its release callback with info_exists
> > locked.
> >
> > Also, modify the users of iio_read_avail_channel_raw() and
> > iio_read_avail_channel_attribute() to free the copied available buffers
> > after calling these functions. To let users free the copied buffer with
> > a cleanup pattern, also add a iio_read_avail_channel_attr_retvals()
> > consumer helper that is equivalent to iio_read_avail_channel_attribute()
> > but stores the available values in the returned variable.
>
> ...
>
> > +static void dpot_dac_read_avail_release_res(struct iio_dev *indio_dev,
> > +					    struct iio_chan_spec const *chan,
> > +					    const int *vals, long mask)
> > +{
> > +	kfree(vals);
> > +}
> > +
> >  static int dpot_dac_write_raw(struct iio_dev *indio_dev,
> >  			      struct iio_chan_spec const *chan,
> >  			      int val, int val2, long mask)
> > @@ -125,6 +132,7 @@ static int dpot_dac_write_raw(struct iio_dev *indio_dev,
> >  static const struct iio_info dpot_dac_info = {
> >  	.read_raw = dpot_dac_read_raw,
> >  	.read_avail = dpot_dac_read_avail,
> > +	.read_avail_release_resource = dpot_dac_read_avail_release_res,
> >  	.write_raw = dpot_dac_write_raw,
> >  };
>
> I have a problem with this approach. The issue is that we allocate
> memory in one place and must clear it in another. This is not well
> designed thingy in my opinion.

It is a tricky corner and we've not yet come up with a better solution :(
I think one of the earlier versions did just always copy, and the reviews
suggested that was painful given we are fixing a tiny percentage of
devices. Hence we ended up with what is effectively an optional copy when
the provider knows the data is volatile.

So there are two 'potential' copies here, and we need to be careful to
separate them for the purposes of discussion.

A) Copy in the provider if it has volatile available data. In that case
the copy is made in a call to it via read_avail(), and released via a
call to read_avail_release_resource(). So to my mind that is locally the
same as any acquire / release pair.

B) Copy in the core for the case where we need the lifetime to persist.
That is effectively a kmemdup() call, so we could call back into the core
to release it, but it would just be a kfree() wrapper.

(A) only occurs in a tiny subset of drivers; most use non-volatile data
for read_avail (constant, or constant after probe). (B) only occurs for
consumer drivers that directly use the avail data. There are very few of
those, and there is no other sane way of solving this because we can't
hold a lock into the provider for an unknown (long) time.

> I was thinking a bit of the solution and
> at least these two comes to my mind:
>
> 1) having a special callback for .read_avail_with_copy (choose better
> name) that will dump the data to the intermediate buffer and clean it
> after all;

So we have that allocate the data in the provider and hand it to the
consumer, which then frees it with kfree() in all cases? Note that's what
we do for the inkern interfaces (the ones consumer drivers have to use),
just in the core rather than in the providers, because that corner is
hard to close any other way.
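To make that inkern copy concrete, here is a rough sketch of the shape
being discussed. This is not the actual patch: the function and variable
names are illustrative, the info_exists locking is only noted in a
comment, and sizing the copy from the value type is glossed over.

#include <linux/iio/iio.h>
#include <linux/iio/consumer.h>
#include <linux/slab.h>

/* Illustrative only: a simplified core-side copy-then-release. */
static int example_channel_read_avail(struct iio_channel *chan,
				      const int **vals, int *type, int *length,
				      enum iio_chan_info_enum info)
{
	const struct iio_info *ops = chan->indio_dev->info;
	const int *prov_vals;
	int *copy;
	int ret;

	/* The real code runs this with the info_exists lock held. */
	ret = ops->read_avail(chan->indio_dev, chan->channel,
			      &prov_vals, type, length, info);
	if (ret < 0)
		return ret;

	/*
	 * Duplicate while the provider's buffer is still guaranteed valid.
	 * A real implementation must size the copy from *type as well as
	 * *length.
	 */
	copy = kmemdup(prov_vals, *length * sizeof(*copy), GFP_KERNEL);

	/* Immediately hand the (possibly volatile) buffer back. */
	if (ops->read_avail_release_resource)
		ops->read_avail_release_resource(chan->indio_dev, chan->channel,
						 prov_vals, info);
	if (!copy)
		return -ENOMEM;

	/* The consumer-facing helpers are then responsible for kfree()ing it. */
	*vals = copy;
	return ret;
}

The point of the copy is that the consumer owns the returned buffer, so
the lock into the provider can be dropped immediately rather than held
for however long the consumer keeps the data.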
In this rare case we end up potentially copying twice. For the special
cases where the buffer isn't passed on beyond functions that are part of
the IIO core, we avoid the need for that (potentially second, probably
only) copy, because we can always ensure the release call is made. Note
this is the common case by far: it's the pretty printing done by the
core to present this data to sysfs etc., or to find out the max value
for some consumer that doesn't need the whole set.

>
> 2) introduce a new type (or bit there), like IIO_AVAIL_LIST_ALLOC.

The special handling that would be needed seems likely to be no more
obvious than the handling we have here. I'm not really sure how it would
work.

>
> In any case it looks fragile and not scalable. I propose to drop this
> and think again.
>
> Yes, yes, I'm fully aware about the problem you are trying to solve and
> agree on the report, I think this solution is not good enough.

I'll back this out of my tree for now so the discussion can carry on.

Jonathan