Re: [PATCH v5 2/5] iio: consumers: copy/release available info from producer to fix race

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Quoting Jonathan Cameron (2024-10-30 21:30:50)
> On Wed, 30 Oct 2024 19:23:21 +0100
> Matteo Martelli <matteomartelli3@xxxxxxxxx> wrote:
> 
> > Quoting Andy Shevchenko (2024-10-30 15:47:50)
> > > On Mon, Oct 21, 2024 at 02:54:15PM +0200, Matteo Martelli wrote:  
> > > > Consumers need to call the producer's read_avail_release_resource()
> > > > callback after reading producer's available info. To avoid a race
> > > > condition with the producer unregistration, change inkern
> > > > iio_channel_read_avail() so that it copies the available info from the
> > > > producer and immediately calls its release callback with info_exists
> > > > locked.
> > > > 
> > > > Also, modify the users of iio_read_avail_channel_raw() and
> > > > iio_read_avail_channel_attribute() to free the copied available buffers
> > > > after calling these functions. To let users free the copied buffer with
> > > > a cleanup pattern, also add a iio_read_avail_channel_attr_retvals()
> > > > consumer helper that is equivalent to iio_read_avail_channel_attribute()
> > > > but stores the available values in the returned variable.  
> > > 
> > > ...
> > >   
> > > > +static void dpot_dac_read_avail_release_res(struct iio_dev *indio_dev,
> > > > +                                         struct iio_chan_spec const *chan,
> > > > +                                         const int *vals, long mask)
> > > > +{
> > > > +     kfree(vals);
> > > > +}
> > > > +
> > > >  static int dpot_dac_write_raw(struct iio_dev *indio_dev,
> > > >                             struct iio_chan_spec const *chan,
> > > >                             int val, int val2, long mask)
> > > > @@ -125,6 +132,7 @@ static int dpot_dac_write_raw(struct iio_dev *indio_dev,
> > > >  static const struct iio_info dpot_dac_info = {
> > > >       .read_raw = dpot_dac_read_raw,
> > > >       .read_avail = dpot_dac_read_avail,
> > > > +     .read_avail_release_resource = dpot_dac_read_avail_release_res,
> > > >       .write_raw = dpot_dac_write_raw,
> > > >  };  
> > > 
> > > I have a problem with this approach. The issue is that we allocate
> > > memory in one place and must clear it in another. This is not well
> > > designed thingy in my opinion. I was thinking a bit of the solution and
> > > at least these two comes to my mind:
> > > 
> > > 1) having a special callback for .read_avail_with_copy (choose better
> > > name) that will dump the data to the intermediate buffer and clean it
> > > after all;
> > > 
> > > 2) introduce a new type (or bit there), like IIO_AVAIL_LIST_ALLOC.  
> > 
> > Could you elaborate more about these potential solutions? Maybe with some
> > usage examples?
> > 
> > If I get it correctly, in both cases you are suggesting to pass ownership
> > of the vals buffer to the caller, iio_read_channel_info_avail() in this
> > case, so that it would take care of freeing the buffer after calling
> > iio_format_after_*(). We considered this approach during an initial
> > discussion with Jonathan (see read_avail_ext() in [1]), where he suggested
> > to let the driver keep the release control through a callback for two
> > reasons:
> > 
> > 1) Apparently it's a bad pattern to pass the buffer ownership to the core,
> >    maybe Jonathan can elaborate why? The risk I can think of is that the driver
> >    could still keep the buffer copy in its private data after giving it away,
> >    resulting in fact in a double ownership. However I think it would be clear
> >    enough in this case that the copy should be handled by the caller, or maybe
> >    not?
> Mostly the lack of desire to have to copy for the 95% of cases where it's
> not needed and that it prevents any optimization like you mention.

I think the suggestion here is to add an additional .read_avail_with_copy()
without replacing the original .read_avail(), so all the current drivers that
use a constant avail list would not be affected. And I think this was the same
idea for the additional read_avail_ext() or the additional argument for the
read_avail() we were considering in [1]. So I would think that
iio_read_channel_info_avail() would do something like the following:

    if (indio_dev->info->read_avail_with_copy)
        indio_dev->info->read_avail_with_copy(vals);
    else
        indio_dev->info->read_avail(vals);

    ...
    iio_format_avail_list(vals);
    ...

    if (indio_dev->info->read_avail_with_copy)
        kfree(vals);

And the drivers would choose whether to define the read_avail or the
read_avail_with_copy.

What I was referring to is that, back then, you mentioned you would have
preferred to avoid passing ownership of the buffer around:

> That's a corner case we should think about closing. Would require an indicator
> to read_avail that the buffer it has been passed is a snapshot that it should
> free on completion of the string building.  I don't like passing ownership
> of data around like that, but it is fiddly to do anything else given
> any simple double buffering is subject to race conditions.

I guess there is some other reason other than avoiding the copy when not
necessary, since by introducing an additional function or argument or return
type, most of the unnecessary copies would already be avoided right?

Anyway any of this solutions would still prevent the potential optimizations of
point 2). It's worth mentioning that those kind of optimizations are currently
not adopted by any driver.

> 
> Jonathan
> > 
> > 2) Some driver might want to avoid allocating a new copy of a big table if
> >    the race does not occur (e.g. with additional checks on buffer access
> >    code) and thus wouldn't call a free() in the release callback.
> > 
> > > 
> > > In any case it looks fragile and not scalable. I propose to drop this
> > > and think again.  
> > 
> > I see your concerns, I am open to reconsider this in case we come up with
> > better solution after addressing the points above.
> > 
> > > Yes, yes, I'm fully aware about the problem you are trying to solve and
> > > agree on the report, I think this solution is not good enough.
> > > 
> > > -- 
> > > With Best Regards,
> > > Andy Shevchenko
> > >   
> > 
> > [1]: https://lore.kernel.org/linux-iio/20240729211100.0d602d6e@jic23-huawei/
> > 
> > Best regards,
> > Matteo Martelli
> 

I hope I've brought a little more clarity to the discussion by providing some
history instead of making it more confusing.

Best regards,
Matteo Martelli





[Index of Archives]     [LKML Archive]     [Linux ARM Kernel]     [Linux ARM]     [Git]     [Yosemite News]     [Linux SCSI]     [Linux Hams]

  Powered by Linux