Re: [PATCH 0/4] thermal threshold and notification v2.0

On Mon, 2013-04-15 at 11:53 -0600, R, Durgadoss wrote:
> > -----Original Message-----
> > From: lm-sensors-bounces@xxxxxxxxxxxxxx [mailto:lm-sensors-
> > bounces@xxxxxxxxxxxxxx] On Behalf Of Srinivas Pandruvada
> > Sent: Monday, April 15, 2013 8:51 PM
> > To: Guenter Roeck
> > Cc: Yu, Fenghua; Luck, Tony; linux-pm@xxxxxxxxxxxxxxx; lm-sensors@lm-
> > sensors.org; bp@xxxxxxxxx; Zhang, Rui
> > Subject: Re:  [PATCH 0/4] thermal threshold and notification v2.0
> > 
> > On 04/12/2013 06:32 PM, Guenter Roeck wrote:
> > > On Tue, Apr 09, 2013 at 02:01:18PM -0700, Srinivas Pandruvada wrote:
> > >> v2.0
> > >> As suggested by Guenter Roeck, used the previous development in this area
> > >> as a starting point. The first patch is the same as what Guenter Roeck
> > >> submitted before, except for a checkpatch error fix for strtoul. With this
> > >> patch, the following additional coretemp sysfs entries will be added:
> > >> tempX_threshold1 - Reflects value of CPU thermal threshold T0.
> > >> tempX_threshold1_triggered
> > >> 	         - Reflects status of CPU thermal status register bit 6
> > >> 		   (THERM_STATUS_THRESHOLD0).
> > >> tempX_threshold2 - Reflects value of CPU thermal threshold T1.
> > >> tempX_threshold2_triggered
> > >> 	         - Reflects status of CPU thermal status register bit 8
> > >> 		   (THERM_STATUS_THRESHOLD1).
> > >>
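For illustration, a minimal user space sketch of reading the four attributes
listed above. It assumes the attributes hold plain integers, and the directory
used in the example call is a hypothetical location; the real path depends on
the kernel and platform. The helper name read_thresholds is made up for this
sketch.

```python
from pathlib import Path

# Attribute suffixes named in the cover letter above.
SUFFIXES = (
    "threshold1",
    "threshold1_triggered",
    "threshold2",
    "threshold2_triggered",
)


def read_thresholds(hwmon_dir, index):
    """Return {suffix: int or None} for the tempX_* attributes in hwmon_dir."""
    result = {}
    for suffix in SUFFIXES:
        path = Path(hwmon_dir) / f"temp{index}_{suffix}"
        try:
            result[suffix] = int(path.read_text().strip())
        except (OSError, ValueError):
            # Attribute is absent, unreadable, or not an integer.
            result[suffix] = None
    return result


if __name__ == "__main__":
    # Hypothetical directory, used only as an example.
    print(read_thresholds("/sys/devices/platform/coretemp.0", 1))
```

Missing attributes are reported as None rather than raising, so the same call
works on kernels without this series applied.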
> > >>
> > >> The notification mechanism is implemented at package level using uevent.
> > >> Also, a debugfs interface is added to check the count of interrupts and
> > >> worker function scheduling.
> > >>
> > >>
> > >> v1.0
> > >>
> > >> It is clear that there was reluctance to add thresholds to coretemp sysfs
> > >> during previous attempts, probably because of a lack of use cases.
> > >> But this time the use case may be more compelling.
> > >>
> > >> We have many small form factor devices, like ultrabooks and slate PCs, in
> > >> the market. Unfortunately these devices reach maximum temperature under
> > >> relatively light workloads, causing the BIOS to do thermal throttling.
> > >> There are real performance issues due to aggressive BIOS action to control
> > >> thermals, and also thermal breakdown in some cases.
> > >>
> > >> Even the most expensive laptops don't have a correct ACPI thermal
> > >> configuration that the kernel thermal driver can act on. In some cases
> > >> even the trip point is higher than the critical temperature setting.
> > >>
> > >> Intel has developed several drivers which can be used to cool the system
> > >> very efficiently. They include the RAPL based cooling driver, the
> > >> Powerclamp driver and the P state driver. To utilize these cooling devices,
> > >> a closed loop user mode program is required, which will use these methods
> > >> and dynamically compensate for high CPU temperatures without relying on
> > >> any configuration data.
> > >> One such solution is the "Linux thermal daemon". More details can be
> > >> obtained from
> > >> "https://github.com/01org/thermal_daemon/blob/master/ThermalDaemon_Introduction.pdf".
> > >> This daemon polls the CPU temperature and applies compensation once the
> > >> CPU reaches the target temperature.
> > >>
> > >> This polling can mostly be avoided by getting a notification at the
> > >> temperature where the daemon needs to wake up and get ready to apply
> > >> compensation. In most normal use cases there may not be any threshold
> > >> events, so there will be a very minimal number of user space notifications
> > >> for thermal thresholds.
> > >>
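A rough sketch of how a daemon could block on a notification instead of
polling, assuming the notification arrives as a kobject uevent over the kernel
netlink uevent socket, as the cover letter describes. The function names and
the DEVPATH substring match are assumptions made for this sketch, not part of
the patch set.

```python
import socket

NETLINK_KOBJECT_UEVENT = 15  # protocol number from <linux/netlink.h>
UEVENT_GROUP = 1             # multicast group for kernel-originated uevents


def parse_uevent(data):
    """Split a raw kernel uevent into (header, {KEY: value})."""
    fields = [f.decode("utf-8", "replace") for f in data.split(b"\0") if f]
    header = fields[0] if fields else ""
    env = dict(f.split("=", 1) for f in fields[1:] if "=" in f)
    return header, env


def wait_for_change(match):
    """Block until a 'change' uevent whose DEVPATH contains match arrives.

    Linux only. The caller decides what to match on; which device the
    event is raised against is an assumption here.
    """
    sock = socket.socket(socket.AF_NETLINK, socket.SOCK_DGRAM,
                         NETLINK_KOBJECT_UEVENT)
    try:
        sock.bind((0, UEVENT_GROUP))
        while True:
            _, env = parse_uevent(sock.recv(8192))
            if env.get("ACTION") == "change" and match in env.get("DEVPATH", ""):
                return env
    finally:
        sock.close()
```

A daemon would call something like wait_for_change("coretemp") in its main
loop and only then read the temperature and apply compensation.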
> > >> Notifications are added only for package level thresholds, to minimize
> > >> events. Also, interrupts are enabled only when a non tj_max (default) value
> > >> is written to the thresholds.
> > >>
> > >> Once thresholds are violated, it uses a rate control of 5 seconds, reducing
> > >> the number of interrupts when the temperature is hovering around the trip
> > >> point. Using the sticky log bit, it sends a kobject uevent change
> > >> notification for the corresponding package sysfs. Once the thermal daemon
> > >> receives the notification, it can change to a new threshold or act
> > >> immediately to reduce the CPU temperature.
> > >>
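The 5 second rate control described above can be modeled in a few lines. This
is only a user space illustration of the idea, not the code in therm_throt.c;
the class name and the injectable clock are assumptions made for the sketch.

```python
import time


class RateLimiter:
    """Allow at most one event per interval seconds."""

    def __init__(self, interval=5.0, clock=time.monotonic):
        self.interval = interval
        self.clock = clock
        self.last = None  # time of the last allowed event

    def allow(self):
        """Return True if an event may be delivered now."""
        now = self.clock()
        if self.last is not None and now - self.last < self.interval:
            return False  # still inside the quiet period
        self.last = now
        return True
```

With the temperature hovering around the trip point, repeated threshold
crossings inside the window are dropped and only the first one results in a
notification.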
> > >> Guenter Roeck (1):
> > >>    hwmon: (coretemp) Add support for thermal threshold attributes
> > >>
> > >> Srinivas Pandruvada (3):
> > >>    x86, mcheck, therm_throt: Process package thresholds
> > >>    hwmon: (coretemp) : Add notification support
> > >>    hwmon: (coretemp) : Add debugfs to support thresholds
> > >>
> > >>   Documentation/hwmon/coretemp             |   8 +
> > >>   arch/x86/include/asm/mce.h               |   7 +
> > >>   arch/x86/kernel/cpu/mcheck/therm_throt.c |  63 ++++++-
> > >>   drivers/hwmon/coretemp.c                 | 292 +++++++++++++++++++++++++++++--
> > >>   4 files changed, 356 insertions(+), 14 deletions(-)
> > >>
> > > Rui,
> > >
> > > can you have a look at this series ?
> > >
> > > I would like to get some feedback from thermal subsystem supporters if hwmon
> > > is really the right place for this. I may be wrong, but it seems to me it
> > > would fit better into thermal.
> > >
> > > Thanks,
> > > Guenter
> > 
> > I am fine with using thermal zones, but the coretemp data will then be
> > exposed in both coretemp and thermal sysfs, with a lot of code duplication.
> > Also, the trip point in this case is not for activating any cooling device,
> > but just to notify user space. So this will be a zone with no associated cdevs.
> 
> Yes, this was the idea which we discussed in lm-sensors a few months ago.
> [I could not locate the thread on the web]. Except that we will register as
> 'thermal sensor' and not as 'thermal zones' because of the changes happening
> to the thermal framework recently[1].
> 
> This way, we can expose trip points and configure them without needing to
> associate any cdevs.
> 
Agreed.
But note that this is true in the current code as well.

thanks,
rui
> [1] https://lkml.org/lkml/2013/2/5/228

> Thanks,
> Durga
> > 
> > > --
> > > To unsubscribe from this list: send the line "unsubscribe linux-pm" in
> > > the body of a message to majordomo@xxxxxxxxxxxxxxx
> > > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > >
> > 
> > 
> > _______________________________________________
> > lm-sensors mailing list
> > lm-sensors@xxxxxxxxxxxxxx
> > http://lists.lm-sensors.org/mailman/listinfo/lm-sensors


