Re: [PATCH] thermal: sysfs: Perform bounds check when storing thermal states

On Wed, Jul 06, 2022 at 02:30:21PM +0200, Varad Gautam wrote:
> On Wed, Jul 6, 2022 at 12:21 PM Greg KH <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
> >
> > On Wed, Jul 06, 2022 at 12:01:19PM +0200, Varad Gautam wrote:
> > > On Wed, Jul 6, 2022 at 11:21 AM Greg KH <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
> > > >
> > > > On Wed, Jul 06, 2022 at 04:51:59PM +0800, Zhang Rui wrote:
> > > > > On Wed, 2022-07-06 at 09:16 +0200, Varad Gautam wrote:
> > > > > > On Wed, Jul 6, 2022 at 8:45 AM Greg KH <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
> > > > > > >
> > > > > > > On Tue, Jul 05, 2022 at 11:02:50PM +0200, Varad Gautam wrote:
> > > > > > > > On Tue, Jul 5, 2022 at 6:18 PM Greg KH <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
> > > > > > > > >
> > > > > > > > > On Tue, Jul 05, 2022 at 03:00:02PM +0000, Varad Gautam wrote:
> > > > > > > > > > Check that a user-provided thermal state is within the maximum
> > > > > > > > > > thermal states supported by a given driver before attempting to
> > > > > > > > > > apply it. This prevents a subsequent OOB access in
> > > > > > > > > > thermal_cooling_device_stats_update() while performing
> > > > > > > > > > state-transition accounting on drivers that do not have this check
> > > > > > > > > > in their set_cur_state() handle.
> > > > > > > > > >
> > > > > > > > > > Signed-off-by: Varad Gautam <varadgautam@xxxxxxxxxx>
> > > > > > > > > > Cc: stable@xxxxxxxxxxxxxxx
> > > > > > > > > > ---
> > > > > > > > > >  drivers/thermal/thermal_sysfs.c | 12 +++++++++++-
> > > > > > > > > >  1 file changed, 11 insertions(+), 1 deletion(-)
> > > > > > > > > >
> > > > > > > > > > diff --git a/drivers/thermal/thermal_sysfs.c b/drivers/thermal/thermal_sysfs.c
> > > > > > > > > > index 1c4aac8464a7..0c6b0223b133 100644
> > > > > > > > > > --- a/drivers/thermal/thermal_sysfs.c
> > > > > > > > > > +++ b/drivers/thermal/thermal_sysfs.c
> > > > > > > > > > @@ -607,7 +607,7 @@ cur_state_store(struct device *dev,
> > > > > > > > > > struct device_attribute *attr,
> > > > > > > > > >               const char *buf, size_t count)
> > > > > > > > > >  {
> > > > > > > > > >       struct thermal_cooling_device *cdev =
> > > > > > > > > > to_cooling_device(dev);
> > > > > > > > > > -     unsigned long state;
> > > > > > > > > > +     unsigned long state, max_state;
> > > > > > > > > >       int result;
> > > > > > > > > >
> > > > > > > > > >       if (sscanf(buf, "%ld\n", &state) != 1)
> > > > > > > > > > @@ -618,10 +618,20 @@ cur_state_store(struct device *dev,
> > > > > > > > > > struct device_attribute *attr,
> > > > > > > > > >
> > > > > > > > > >       mutex_lock(&cdev->lock);
> > > > > > > > > >
> > > > > > > > > > +     result = cdev->ops->get_max_state(cdev, &max_state);
> > > > > > > > > > +     if (result)
> > > > > > > > > > +             goto unlock;
> > > > > > > > > > +
> > > > > > > > > > +     if (state > max_state) {
> > > > > > > > > > +             result = -EINVAL;
> > > > > > > > > > +             goto unlock;
> > > > > > > > > > +     }
> > > > > > > > > > +
> > > > > > > > > >       result = cdev->ops->set_cur_state(cdev, state);
> > > > > > > > >
> > > > > > > > > Why doesn't set_cur_state() check the max state before setting it?
> > > > > > > > > Why are the callers forced to always check it before?  That feels
> > > > > > > > > wrong...
> > > > > > > > >
> > > > > > > >
> > > > > > > > The problem lies in thermal_cooling_device_stats_update(), not
> > > > > > > > set_cur_state().
> > > > > > > >
> > > > > > > > If ->set_cur_state() doesn't error out on invalid state,
> > > > > > > > thermal_cooling_device_stats_update() does a:
> > > > > > > >
> > > > > > > > stats->trans_table[stats->state * stats->max_states + new_state]++;
> > > > > > > >
> > > > > > > > stats->trans_table reserves space depending on max_states, but we'd
> > > > > > > > end up reading/writing outside it. cur_state_store() can prevent
> > > > > > > > this regardless of the driver's ->set_cur_state() implementation.
> > > > > > >
> > > > > > > Why wouldn't cur_state_store() check for an out-of-bounds condition
> > > > > > > by calling get_max_state() and then return an error if it is
> > > > > > > invalid, preventing thermal_cooling_device_stats_update() from ever
> > > > > > > being called?
> > > > > > >
> > > > > >
> > > > > > That's what this patch does, it adds the out-of-bounds check.
> > > > >
> > > > > No, I think Greg's question is:
> > > > > why does cdev->ops->set_cur_state() return 0 when setting a cooling
> > > > > state that exceeds the maximum cooling state?
> > > >
> > > > Yes, that is what I am asking, it should not allow a state to be
> > > > exceeded.
> > > >
> > >
> > > Indeed, it is up to the driver to return !0 from cdev->ops->set_cur_state()
> > > when setting state > max - and it is a driver bug for not doing so.
> > >
> > > But a buggy driver should not lead to cur_state_store() performing an OOB
> > > access.
> >
> > Agreed, which is why the code that does the access should check before
> > it does so.  Right now you are relying on the sysfs code to do so, which
> > seems very wrong.
> >
> 
> I see the point.
> 
> The OOB access happens in thermal_cooling_device_stats_update().
> 
> By placing the check in cur_state_store(), I'm trying to ensure
> two things for a buggy driver:

What in-kernel driver has this problem, and why not just fix it there?

> 1. The driver's cdev->ops->set_cur_state() doesn't get called if
> the new state is > max state. This is to prevent the driver
> from storing the new (invalid) state internally. If the driver
> didn't realise/reject an invalid state, chances are it will try
> to propagate it internally and take actions according to that,
> which can have side effects on system stability.

Again, set_cur_state() should check for max values, if not, it is broken
and that needs to be fixed in the driver.
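
As a rough sketch of the kind of check meant here (the struct and function names are hypothetical stand-ins, not taken from any real driver, and the maximum state is treated as inclusive, matching the "state > max_state" test in the quoted patch):

```c
#include <errno.h>

/* Hypothetical, simplified stand-in for a driver's cooling device data. */
struct fake_cdev {
	unsigned long max_state;	/* highest valid state (inclusive) */
	unsigned long cur_state;	/* last state accepted by the driver */
};

/* Reject an out-of-range state before it is stored or acted upon. */
int fake_set_cur_state(struct fake_cdev *cdev, unsigned long state)
{
	if (state > cdev->max_state)
		return -EINVAL;

	cdev->cur_state = state;
	return 0;
}
```

With a check like this the invalid value never reaches the driver's internal state, so no caller has to pre-validate it.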

> 2. The kernel doesn't do an OOB access in
> thermal_cooling_device_stats_update().

Then don't allow thermal_cooling_device_stats_update() to do an
out-of-bounds access by fixing it there too.  But again, your patch does
not solve that directly.
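
A minimal sketch of a check at the point of access, assuming a simplified stats structure (struct cooling_stats and stats_update() are hypothetical names, not the kernel's actual code; only the index expression comes from the line quoted earlier in this thread):

```c
#include <errno.h>

/* Hypothetical, simplified stand-in for the cooling device statistics. */
struct cooling_stats {
	unsigned long state;		/* last recorded state */
	unsigned long max_states;	/* number of valid states */
	unsigned int *trans_table;	/* max_states * max_states counters */
};

/* Record a state transition, refusing any index outside the table. */
int stats_update(struct cooling_stats *stats, unsigned long new_state)
{
	/* Valid states are 0 .. max_states - 1. */
	if (new_state >= stats->max_states)
		return -EINVAL;

	if (stats->state == new_state)
		return 0;

	stats->trans_table[stats->state * stats->max_states + new_state]++;
	stats->state = new_state;
	return 0;
}
```

Checking here protects the table no matter which caller passes in the new state.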

thanks,

greg k-h


