Re: [RFC] the generic thermal layer enhancement

Hello,

On Thu, May 31, 2012 at 02:09:19PM +0800, Zhang Rui wrote:
> Hi, Amit,
> 
> On Thu, 2012-05-31 at 12:59 +0800, Amit Kachhap wrote:
> > Hi Rui,
> > 
> > Thanks for starting a discussion on further finetuning the thermal layer.
> > 
> > On 30 May 2012 16:49, Zhang Rui <rui.zhang@xxxxxxxxx> wrote:
> > > Hi, all,
> > >
> > > It is great to see more and more users of the generic thermal layer.
> > > But as we know, the original design of the generic thermal layer comes
> > > from ACPI thermal management, and some of its implementation seems to be
> > > too ACPI specific nowadays.
> > Totally agreed :)
> > >
> > > Recently I'm thinking of enhancing the generic thermal layer so that it
> > > works well for more platforms.
> > >
> > > Below are some thoughts of mine, after reading the patches from Amit
> > > Daniel Kachhap, and ACPI 3.0 thermal model. Actually, I have started
> > > coding some RFC patches. But I do really want to get feedback from you
> > > before going on.
> > >
> > > G1. supporting multiple cooling states for active cooling devices.
> > >
> > >    The current active cooling device supports two cooling states only,
> > >    please refer to the code below, in drivers/thermal/thermal_sys.c
> > >                case THERMAL_TRIP_ACTIVE:
> > >                        ...
> > >                        if (temp >= trip_temp)
> > >                                cdev->ops->set_cur_state(cdev, 1);
> > >                        else
> > >                                cdev->ops->set_cur_state(cdev, 0);
> > >                        break;
> > >
> > >    This is an ACPI specific thing, as our ACPI FAN used to support
> > >    ON/OFF only.
> > >    I think it is reasonable to support multiple active cooling states
> > >    as they are common on many platforms, and note that this is also
> > >    true for ACPI 3.0 FAN device (_FPS).
> > Yes, I agree that the ACTIVE trip type should support more than one (ON)
> > state, and which state set_cur_state should be called with depends on
> > how the state is bound to the trip point. But doing this required so
> > much handling logic that I dropped the patch for the new trip type
> > ACTIVE_INSTANCE.
> 
> yes, I know.
> 
> > >
> > > G2. introduce cooling states range for a certain trip point
> > >
> > >    This problem comes with the first one.
> > >    If the cooling devices support multiple cooling states, and surely
> > >    we may want only several cooling states for a certain trip point,
> > >    and other cooling states for other active trip points.
> > >    To do this, we should be able to describe the cooling device
> > >    behavior for a certain trip point, rather than for the entire
> > >    thermal zone.
> > Agreed
> > >
> > > G3. kernel thermal passive cooling algorithm
> > >
> > >    Currently, tc1 and tc2 are hard requirements for kernel passive
> > >    cooling. But non-ACPI platforms do not have this information
> > >    (please correct me if I'm wrong).
> > >    Say, for the patches here
> > >    http://marc.info/?l=linux-acpi&m=133681581305341&w=2
> > >    They just want to slow down the processor when current temperature
> > >    is higher than the trip point and speed up the processor when the
> > >    temperature is lower than the trip point.
> > >
> > >    According to Matthew, the platform drivers are responsible to
> > >    provide proper tc1 and tc2 values to use kernel passive cooling.
> > >    But I'm just wondering if we can use something instead.
> > >    Say, introduce .get_trend() in thermal_zone_device_ops.
> > >    And we set cur_state++ or cur_state-- based on the value returned
> > >    by .get_trend(), instead of using tc1 and tc2.
> > OK, this seems fine. But for the exynos platform (and some other
> > platforms) this may simply be cur_temp - last_temp, which is fine.
> 
> I'm wondering if you can try to test some platform tc1/tc2 numbers and
> see if it is better to use the current passive cooling implementation.

Well, we can definitely try it out with TC1 and TC2, but what is the
right procedure to define them properly other than trial and error?
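
For concreteness, here is a minimal sketch of how TC1 and TC2 are usually
described as entering the passive cooling decision: TC1 weights the change
since the last sample and TC2 weights the distance above the trip point.
The helper names passive_trend() and passive_step() are hypothetical, not
kernel APIs, and temperatures are assumed to be in millidegrees Celsius.

```c
/*
 * Hypothetical helpers, not kernel APIs.
 * trend = tc1 * (temp - last_temp) + tc2 * (temp - trip_temp)
 * All temperatures are assumed to be in millidegrees Celsius.
 */
static long passive_trend(long tc1, long tc2,
                          long temp, long last_temp, long trip_temp)
{
    return tc1 * (temp - last_temp) + tc2 * (temp - trip_temp);
}

/* Move one cooling state up or down according to the sign of the trend. */
static unsigned long passive_step(unsigned long cur_state,
                                  unsigned long max_state, long trend)
{
    if (trend > 0 && cur_state < max_state)
        return cur_state + 1;   /* heating: throttle one step more */
    if (trend < 0 && cur_state > 0)
        return cur_state - 1;   /* cooling: relax one step */
    return cur_state;           /* stable, or already at a limit */
}
```

With tc1 = 1 and tc2 = 0 this reduces to the sign of cur_temp - last_temp,
and with tc1 = 0 and tc2 = 1 to the sign of cur_temp - trip_temp, so the
two constants only set the balance between those two terms.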

> 
> > >
> > > G4. Multiple passive trip points
> > >
> > >    I get this idea also from the patches at
> > >    http://marc.info/?l=linux-acpi&m=133681581305341&w=2
> > >
> > >    IMO, they want to get an acceptable performance at a tolerable
> > >    temperature.
> > >    Say, a platform with four P-states. P3 is really low.
> > >    And I'm okay with the temperature at 60C, but 80C? No.
> > >    With G2 resolved, we can use processor P0~P2 for Passive trip point
> > >    0 (50C), and P3 for Passive trip point 1 (70C). And then the
> > >    temperature may be jumping at around 60C or even 65C, without
> > >    entering P3.
> > >
> > >    Furthermore, IMO, this also works for ACPI platforms.
> > >    Say, we can easily change p-state to cool the system, but using
> > >    t-state is definitely what we do not want to see. The current
> > >    implementation does not expose this difference to the generic
> > >    thermal layer, but if we can have two passive trip points, and use
> > >    p-state for the first one only... (this works if we start polling
> > >    after entering passive cooling mode, without hardware notification)
> > This seems cool. To answer Matthew's doubt, this is needed because in
> > a not-so-critical thermal envelope the platforms may not want to go
> > to the lowest OPP level, and some intermediate level is fine.
> > 
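
The G2/G4 example above (P0~P2 for the 50C trip point, P3 for the 70C one)
could be sketched roughly as below. This is only an illustration under
assumptions: struct trip_binding, active_trip() and clamp_state() are
made-up names, cooling state N is assumed to mean P-state N, and
temperatures are in millidegrees Celsius.

```c
#include <stddef.h>

/* Hypothetical per-trip-point binding: a range of allowed cooling states. */
struct trip_binding {
    long trip_temp;         /* millidegrees Celsius */
    unsigned long lower;    /* lowest cooling state used for this trip */
    unsigned long upper;    /* highest cooling state used for this trip */
};

/* Return the index of the hottest trip point that temp has crossed, or -1. */
static int active_trip(const struct trip_binding *trips, size_t n, long temp)
{
    int found = -1;

    for (size_t i = 0; i < n; i++) {
        if (temp < trips[i].trip_temp)
            continue;
        if (found < 0 || trips[i].trip_temp > trips[found].trip_temp)
            found = (int)i;
    }
    return found;
}

/* Keep a requested cooling state inside the range bound to one trip point. */
static unsigned long clamp_state(const struct trip_binding *b,
                                 unsigned long state)
{
    if (state < b->lower)
        return b->lower;
    if (state > b->upper)
        return b->upper;
    return state;
}
```

With bindings {50000, 0, 2} and {70000, 3, 3}, a reading of 60C selects the
first binding, so the state can never go past 2 (P2) until 70C is crossed.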
> > >
> > > G5. unify active cooling and passive cooling code
> > >
> > >    If G4 and G5 are resolved, a new problem to me is that there is no
> > >    difference between passive cooling and active cooling except the
> > >    cooling policy.
> > >    Then we can share the same code for both active and passive cooling.
> > >    maybe something like:
> > >
> > >    case THERMAL_TRIP_ACTIVE:
> > >    case THERMAL_TRIP_PASSIVE:
> > >         ...
> > >         trend = tz->ops->get_trend();
> > >         if (trend == HEATING)
> > >                 cdev->ops->set_cur_state(cdev, ++cur_state);
> > >         else if (trend == COOLING)
> > >                 cdev->ops->set_cur_state(cdev, --cur_state);
> > >         break;
> > Actually I still feel that ACTIVE and PASSIVE are needed because they
> > have different ways of calling set_cur_state. An ACTIVE trip type is
> > associated with one trip point, so on reaching a trip point a specific
> > state is passed to set_cur_state. But PASSIVE basically
> > increments/decrements the state depending on the thermal trend.
> 
> Sorry, I do not understand.
> for active trip points, we can either use trend or not.
> say, we can always set the trend to HEATING, in our .get_trend()
> callback if we always want to spin up the fan when the temperature is
> higher than the trip point.
> But if the temperature is high but dropping, and we want to spin down
> the fan a little to save more power, we can let .get_trend() return
> COOLING at this time.
> what do you think?
> 
> BTW, what I'd like to know is: with all these gaps solved, is it
> much easier and clearer for the generic cpu cooling APIs and the exynos
> platforms to follow the generic thermal layer? Say, two passive trip
> points for exynos platforms, and certain processor P-states bound to each
> trip point?
> 
> thanks,
> rui
> 
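
The two .get_trend() policies discussed above (always HEATING for a fan
that should just spin up, versus a trend derived from the last two
readings) can be sketched with one shared stepping routine. Everything
here is hypothetical: struct zone, the TREND_* names and step_state() are
illustrative only, and stepping down whenever the temperature is below
the trip point is an assumption, not something settled in this thread.

```c
enum thermal_trend { TREND_STABLE, TREND_HEATING, TREND_COOLING };

/* Hypothetical zone: two readings plus a platform-supplied trend callback. */
struct zone {
    long temp;              /* current reading, millidegrees Celsius */
    long last_temp;         /* previous reading */
    enum thermal_trend (*get_trend)(const struct zone *z);
};

/* Policy A: derive the trend from the last two readings. */
static enum thermal_trend trend_from_delta(const struct zone *z)
{
    if (z->temp > z->last_temp)
        return TREND_HEATING;
    if (z->temp < z->last_temp)
        return TREND_COOLING;
    return TREND_STABLE;
}

/* Policy B: always report HEATING, so the device only ramps up above trip. */
static enum thermal_trend trend_always_heating(const struct zone *z)
{
    (void)z;
    return TREND_HEATING;
}

/* Shared step for active and passive trip points; returns the new state. */
static unsigned long step_state(const struct zone *z, long trip_temp,
                                unsigned long cur_state,
                                unsigned long max_state)
{
    enum thermal_trend trend;

    /* Assumption: below the trip point, always relax by one step. */
    if (z->temp < trip_temp)
        return cur_state > 0 ? cur_state - 1 : 0;

    trend = z->get_trend(z);
    if (trend == TREND_HEATING && cur_state < max_state)
        return cur_state + 1;
    if (trend == TREND_COOLING && cur_state > 0)
        return cur_state - 1;
    return cur_state;
}
```

The new state is computed first and returned, so the caller can pass that
value to set_cur_state() without relying on post-increment ordering.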