Re: [PATCH v2] thermal: core: Do not fail cdev registration because of invalid initial state

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thursday, June 6, 2024 6:39:02 PM CEST Daniel Lezcano wrote:
> On 06/06/2024 18:08, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
> > 
> > It is reported that commit 31a0fa0019b0 ("thermal/debugfs: Pass cooling
> > device state to thermal_debug_cdev_add()") causes the ACPI fan driver
> > to fail probing on some systems which turns out to be due to the _FST
> > control method returning an invalid value until _FSL is first evaluated
> > for the given fan.  If this happens, the .get_cur_state() cooling device
> > callback returns an error and __thermal_cooling_device_register() fails
> > as uses that callback after commit 31a0fa0019b0.
> > 
> > Arguably, _FST should not return an invalid value even if it is
> > evaluated before _FSL, so this may be regarded as a platform firmware
> > issue, but at the same time it is not a good enough reason for failing
> > the cooling device registration where the initial cooling device state
> > is only needed to initialize a thermal debug facility.
> > 
> > Accordingly, modify __thermal_cooling_device_register() to avoid calling
> > thermal_debug_cdev_add() instead of returning an error if the initial
> > .get_cur_state() callback invocation fails.
> > 
> > Fixes: 31a0fa0019b0 ("thermal/debugfs: Pass cooling device state to thermal_debug_cdev_add()")
> > Closes: https://lore.kernel.org/linux-acpi/20240530153727.843378-1-laura.nao@xxxxxxxxxxxxx
> > Reported-by: Laura Nao <laura.nao@xxxxxxxxxxxxx>
> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
> > ---
> > 
> > v1 -> v2:
> >     * Instead of making the thermal debug code effectively ignore the invalid
> >       initial cooling device state, simply don't register thermal debugfs for
> >       a cooling device if its initial state returned by the driver's
> >       .get_cur_state() is invalid (Daniel).
> > 
> > Laura, please test this one even though I don't see why it wouldn't work for
> > you if the v1 did.
> > 
> > ---
> >   drivers/thermal/thermal_core.c |    5 +++--
> >   1 file changed, 3 insertions(+), 2 deletions(-)
> > 
> > Index: linux-pm/drivers/thermal/thermal_core.c
> > ===================================================================
> > --- linux-pm.orig/drivers/thermal/thermal_core.c
> > +++ linux-pm/drivers/thermal/thermal_core.c
> > @@ -1001,7 +1001,7 @@ __thermal_cooling_device_register(struct
> >   
> >   	ret = cdev->ops->get_cur_state(cdev, &current_state);
> >   	if (ret)
> > -		goto out_cdev_type;
> > +		current_state = ULONG_MAX;
> 
> Why not move the section ? So we end up below.

Because it's fewer changes in this patch and it doesn't matter.

> >   
> >   	thermal_cooling_device_setup_sysfs(cdev);
> >   
> > @@ -1016,7 +1016,8 @@ __thermal_cooling_device_register(struct
> >   		return ERR_PTR(ret);
> >   	}
> >   
> > -	thermal_debug_cdev_add(cdev, current_state);
> > +	if (current_state <= cdev->max_state)
> > +		thermal_debug_cdev_add(cdev, current_state);
> 
> 	ret = cdev->ops->get_cur_state(cdev, &current_state);
> 	if (!ret)
> 		thermal_debug_cdev_add(cdev, current_state);

So the current_state <= cdev->max_state is actually useful regardless and
consistent with the analogous check in thermal_cooling_device_update().

> Additionally a comment here to explain why get_cur_state can fail and 
> telling it is up to the driver to fix its routine?

I've added a comment in v3.

> 
> >   	/* Add 'this' new cdev to the global cdev list */
> >   	mutex_lock(&thermal_list_lock);
> > 
> > 
> > 
> 
> 








[Index of Archives]     [Linux IBM ACPI]     [Linux Power Management]     [Linux Kernel]     [Linux Laptop]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Video 4 Linux]     [Device Mapper]     [Linux Resources]
  Powered by Linux