15.06.2021 19:18, Daniel Lezcano пишет: > On 15/06/2021 15:01, Dmitry Osipenko wrote: >> 15.06.2021 13:26, Viresh Kumar пишет: >>> On 15-06-21, 12:03, Daniel Lezcano wrote: >>>> >>>> [Cc Viresh] >>>> >>>> On 29/05/2021 19:09, Dmitry Osipenko wrote: >>>>> All NVIDIA Tegra30 SoCs have a two-channel on-chip sensor unit which >>>>> monitors temperature and voltage of the SoC. Sensors control CPU frequency >>>>> throttling, which is activated by hardware once preprogrammed temperature >>>>> level is breached, they also send signal to Power Management controller to >>>>> perform emergency shutdown on a critical overheat of the SoC die. Add >>>>> driver for the Tegra30 TSENSOR module, exposing it as a thermal sensor >>>>> and a cooling device. >>>> >>>> IMO it does not make sense to expose the hardware throttling mechanism >>>> as a cooling device because it is not usable anywhere from the thermal >>>> framework. >>>> >>>> Moreover, that will collide with the thermal / cpufreq framework >>>> mitigation (hardware sets the frequency but the software thinks the freq >>>> is different), right ? >> >> H/w mitigation is additional and should be transparent to the software >> mitigation. The software mitigation is much more flexible, but it has >> latency. Software also could crash and hang. >> >> Hardware mitigation doesn't have latency and it will continue to work >> regardless of the software state. > > Yes, I agree. Both solutions have their pros and cons. However, I don't > think they can co-exist sanely. > >> The CCF driver is aware about the h/w cooling status [1], hence software >> sees the actual frequency. >> >> [1] >> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit?id=344d5df34f5abd468267daa98f041abf90b2f660 > > Ah interesting, thanks for the pointer. > > What I'm worried about is the consistency with cpufreq. > > Probably cpufreq_update_limits() should be called from the interrupt > handler. IIUC, the cpufreq already should be prepared for the case where firmware may override frequency. Viresh, could you please clarify what are the possible implications of the frequency overriding? >>> I am not even sure what the cooling device is doing here: >>> >>> tegra_tsensor_set_cur_state() is not implemented and it says hardware >>> changed it by itself. What is the benefit you are getting out of the >>> cooling device here ? >> >> It allows userspace to check whether hardware cooling is active via the >> cooling_device sysfs. Otherwise we don't have ability to check whether >> h/w cooling is active, I think it's a useful information. It's also >> interesting to see the cooling_device stats, showing how many times h/w >> mitigation was active. > > Actually the stats are for software mitigation. For the driver, create a > debugfs entry like what do the other drivers or a module parameter with > the stats. Okay >>>> The hardware limiter should let know the cpufreq framework about the >>>> frequency change. >>>> >>>> https://lkml.org/lkml/2021/6/8/1792 >>>> >>>> May be post the sensor without the hw limiter for now and address that >>>> in a separate series ? >>> >> >> I wasn't aware about existence of the thermal pressure, thank you for >> pointing at it. At a quick glance it should be possible to benefit from >> the information about the additional pressure. >> >> Seems the current thermal pressure API assumes that there is only one >> user of the API. So it's impossible to aggregate the pressure from >> different sources, like software cpufreq pressure + h/w freq pressure. >> Correct? If yes, then please let me know yours thoughts about the best >> approach of supporting the aggregation. > > That is a good question. IMO, first step would be to call > cpufreq_update_limits(). Right > [ Cc Thara who implemented the thermal pressure ] > > May be Thara has an idea about how to aggregate both? There is another > series floating around with hardware limiter [1] and the same problematic. > > [1] https://lkml.org/lkml/2021/6/8/1791 Thanks, it indeed looks similar. I guess the common thermal pressure update code could be moved out into a new special cpufreq thermal QoS handler (policy->thermal_constraints), where handler will select the frequency constraint and set up the pressure accordingly. So there won't be any races in the code.