Re: [PATCH 1/3] iio: light: Add driver for ap3216c

Sven Van Asbroeck <thesven73@xxxxxxxxx> · Mon, 18 Feb 2019 14:35:51 -0500

Hi Jonathan,

Thanks again for your clear and extensive feedback!

On Mon, Feb 18, 2019 at 10:16 AM Jonathan Cameron
<jonathan.cameron@xxxxxxxxxx> wrote:
>
> I suspect that would break lots of devices if it happened, but
> fair enough that explicit might be good.  One option would be
> to document clearly in regmap the requirement that bulk read is ordered.
>

Yes, it would be interesting to hear the regmap people's opinion on ordering.
In the meantime, we can make this explicit.
Re-reading the thread, I can also see that Peter Meerwald-Stadler was first
to spot this race condition.

> What we need to guarantee is:
>
> 1) If the sensor reads on an occasion where the threshold is passed, we do not miss the event
>    The event is the threshold being passed, not the existence of the reading, or how many
>    readings etc.
>
> 2) A data read will result in a value.  There is no guarantee that it will match with the
>    event.  All manner of delays could result in new data having occurred before that read.
>

My feedback was based on two incorrect assumptions:
a. the interrupt fires whenever new PS/ALS values become available (wrong)
b. there are strict consistency guarantees between the THRESH event, and what
userspace will read out (also wrong)

Taking that into account, I am 100% in agreement with your other comments.
Thank you so much for the explanation!

There is one exception, though:

> > +static int ap3216c_write_event_config(struct iio_dev *indio_dev,
> > +                                    const struct iio_chan_spec *chan,
> > +                                    enum iio_event_type type,
> > +                                    enum iio_event_direction dir, int state)
> > +{
> > +       struct ap3216c_data *data = iio_priv(indio_dev);
> > +
> > +       switch (chan->type) {
> > +       case IIO_LIGHT:
> > +               data->als_thresh_en = state;
> > +               return 0;
> > +
> > +       case IIO_PROXIMITY:
> > +               data->prox_thresh_en = state;
> > +               return 0;
> > +
> > +       default:
> > +               return -EINVAL;
> > +       }
> > +static irqreturn_t ap3216c_event_handler(int irq, void *p)
> > +{
> > + if ((status & AP3216C_INT_STATUS_PS_MASK) && data->prox_thresh_en)
> > + iio_push_event(...);
> > +
> >
> > I think this may not work as intended. One thread (userspace) writes
> > a variable, another thread (threaded irq handler) checks it. but there
> > is no explicit or implicit memory barrier. So when userspace activates
> > thresholding, it may take a long time for the handler to 'see' it !
>
> Yes.  But if userspace took a while to get around to writing this value,
> it would also take longer...  It's not time critical exactly when you
> enable the event.  One can create cases where someone might
> care, but they are pretty obscure.
>

Are you sure? I suspect that it's perfectly possible for the threaded irq
handler not to 'see' the store to (als|prox)_thresh_en for a _very_ long time.

AFAIK only a memory barrier will guarantee that the handler 'sees' the store
right away. A lock will do: taking and releasing it implies a memory barrier.
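
To illustrate the lock variant, here is a rough userspace sketch that uses a
pthread mutex in place of the driver's lock. All demo_* names and the status
bit value are made up for illustration; this is not the ap3216c code, only
the pattern of taking the same lock on both the writer and the reader side.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

#define DEMO_INT_STATUS_PS 0x2u	/* made-up status bit, for illustration */

/* Hypothetical stand-in for the driver's private data. */
struct demo_data {
	pthread_mutex_t lock;	/* protects prox_thresh_en */
	bool prox_thresh_en;
};

static void demo_init(struct demo_data *data)
{
	int ret = pthread_mutex_init(&data->lock, NULL);

	assert(ret == 0);
	(void)ret;	/* keep NDEBUG builds free of unused warnings */
	data->prox_thresh_en = false;
}

/* Writer side: roughly what write_event_config() would do. */
static void demo_set_prox_thresh(struct demo_data *data, bool state)
{
	pthread_mutex_lock(&data->lock);
	data->prox_thresh_en = state;
	pthread_mutex_unlock(&data->lock);
}

/* Reader side: roughly what the threaded irq handler would do. */
static bool demo_should_push_prox(struct demo_data *data, unsigned int status)
{
	bool en;

	pthread_mutex_lock(&data->lock);
	en = data->prox_thresh_en;
	pthread_mutex_unlock(&data->lock);

	return (status & DEMO_INT_STATUS_PS) && en;
}
```

The handler only holds the lock long enough to copy the flag, so the event
push itself would happen outside the critical section.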

Most drivers use a lock to guarantee visibility. There are a few drivers that
resort to explicit barriers to make a flag visible from one thread to another.

E.g. search for mb() or wmb() in:
drivers/input/keyboard/matrix_keypad.c
drivers/input/misc/cm109.c
drivers/input/misc/yealink.c
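
For the lock-free variant, a rough userspace analogue with C11 atomics is
sketched below. The demo_* names and the status bit are hypothetical. In the
kernel I believe the closest equivalents would be smp_store_release() and
smp_load_acquire(), but that mapping is my assumption, not something taken
from the patch. Note that this gives ordering and stops the compiler from
caching the flag; it is not a hard bound on latency.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Sanity check for this sketch only: the flag should not need a hidden lock. */
static_assert(ATOMIC_BOOL_LOCK_FREE == 2, "atomic_bool is not lock-free here");

#define DEMO_INT_STATUS_ALS 0x1u	/* made-up status bit, for illustration */

/* Hypothetical flag standing in for als_thresh_en; starts out false. */
static atomic_bool demo_als_thresh_en;

/* Writer: the release store orders earlier writes before the flag update. */
static void demo_set_als_thresh(bool state)
{
	atomic_store_explicit(&demo_als_thresh_en, state,
			      memory_order_release);
}

/* Reader: the acquire load pairs with the release store above. */
static bool demo_should_push_als(unsigned int status)
{
	bool en = atomic_load_explicit(&demo_als_thresh_en,
				       memory_order_acquire);

	return (status & DEMO_INT_STATUS_ALS) && en;
}
```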