16.06.2021 05:50, Thara Gopinath wrote:
...
>
> Hi,
>
> Thermal pressure is letting scheduler know that the max capacity
> available for a cpu to schedule tasks is reduced due to a thermal event.
> So you cannot have a h/w thermal pressure and s/w thermal pressure.
> There is eventually only one capping applied at h/w level and the
> frequency corresponding to this capping should be used for thermal
> pressure.
>
> Ideally you should not be having both s/w and h/w trying to throttle at
> the same time. Why is this a scenario and what prevents you from
> disabling s/w throttling when h/w throttling is enabled. Now if there
> has to be an aggregation for whatever reason this should be done at the
> thermal driver level and passed to scheduler.

Hello,

The h/w mitigation is much more reactive than software, but at the same
time it is much less flexible than software. It should provide additional
protection in cases where software isn't doing a good job. Ideally h/w
mitigation should stay inactive all the time; nevertheless, it should be
modeled properly by the driver.

>>>
>>> That is a good question. IMO, first step would be to call
>>> cpufreq_update_limits().
>>
>> Right
>>
>>> [ Cc Thara who implemented the thermal pressure ]
>>>
>>> Maybe Thara has an idea about how to aggregate both? There is another
>>> series floating around with hardware limiter [1] and the same
>>> problematic.
>>>
>>> [1] https://lkml.org/lkml/2021/6/8/1791
>>
>> Thanks, it indeed looks similar.
>>
>> I guess the common thermal pressure update code could be moved out into
>> a new special cpufreq thermal QoS handler (policy->thermal_constraints),
>> where handler will select the frequency constraint and set up the
>> pressure accordingly. So there won't be any races in the code.
>>
> It was a conscious decision to keep thermal pressure update out of qos
> max freq update because there are platforms that don't use the qos
> framework. For eg acpi uses cpufreq_update_policy.
> But you are right. We have two platforms now applying h/w throttling and
> cpufreq_cooling applying s/w throttling. So it does make sense to have
> one api doing all the computation to update thermal pressure. I am not
> sure how exactly/where exactly this will reside.

The generic cpufreq_cooling already uses QoS for limiting the CPU
frequency. It could be okay to use QoS for the OF drivers, but this needs
a closer look.

We have the case where the CPU frequency is changed by the thermal event
and the thermal pressure equation is the same for both the s/w
cpufreq_cooling and the h/w thermal driver. The pressure is calculated
based on the QoS cpufreq constraint, which is already aggregated.

Hence what we may need to do on the thermal event is:

1. Update the QoS request
2. Update the thermal pressure
3. Ensure that updates are not racing

> So for starters, I think you should replicate the update of thermal
> pressure in your h/w driver when you know that h/w is
> throttling/throttled the frequency. You can refer to cpufreq_cooling.c
> to see how it is done.
>
> Moving to a common api can be done as a separate patch series.
>

Thank you for the clarification and suggestion.
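
To make the three steps listed above more concrete, here is a rough
userspace model of them. It is only a sketch, not kernel code: struct
model_policy, model_policy_init() and model_thermal_event() are
hypothetical names made up for illustration, and a pthread mutex stands
in for whatever locking a real driver would use. The assumption is that
the aggregated limit is simply the lowest of the s/w and h/w limits, and
that the pressure is the capacity lost relative to the maximum frequency,
which is how I read the calculation in cpufreq_cooling.c.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical userspace model of one cpufreq policy (not a kernel struct). */
struct model_policy {
	pthread_mutex_t lock;           /* serializes thermal updates (step 3) */
	unsigned long max_freq_khz;     /* highest frequency the CPU supports */
	unsigned long max_capacity;     /* capacity at max_freq_khz, e.g. 1024 */
	unsigned long sw_limit_khz;     /* limit requested by s/w cooling */
	unsigned long hw_limit_khz;     /* limit reported by h/w throttling */
	unsigned long thermal_pressure; /* capacity lost to the thermal cap */
};

enum limit_source { LIMIT_SW, LIMIT_HW };

static void model_policy_init(struct model_policy *p,
			      unsigned long max_freq_khz,
			      unsigned long max_capacity)
{
	assert(max_freq_khz > 0);
	pthread_mutex_init(&p->lock, NULL);
	p->max_freq_khz = max_freq_khz;
	p->max_capacity = max_capacity;
	p->sw_limit_khz = max_freq_khz; /* no throttling initially */
	p->hw_limit_khz = max_freq_khz;
	p->thermal_pressure = 0;
}

/* Assumed aggregation rule: the lowest max-frequency limit wins. */
static unsigned long model_effective_limit(const struct model_policy *p)
{
	return p->sw_limit_khz < p->hw_limit_khz ?
	       p->sw_limit_khz : p->hw_limit_khz;
}

/* Handle one thermal event and return the resulting thermal pressure. */
static unsigned long model_thermal_event(struct model_policy *p,
					 enum limit_source src,
					 unsigned long limit_khz)
{
	unsigned long long capacity;
	unsigned long pressure;

	/* 3. One lock around both updates, so they cannot race. */
	pthread_mutex_lock(&p->lock);

	/* 1. Update the frequency request of the source that changed. */
	if (limit_khz > p->max_freq_khz)
		limit_khz = p->max_freq_khz;
	if (src == LIMIT_SW)
		p->sw_limit_khz = limit_khz;
	else
		p->hw_limit_khz = limit_khz;

	/* 2. Recompute the pressure from the aggregated limit. */
	capacity = (unsigned long long)model_effective_limit(p) *
		   p->max_capacity / p->max_freq_khz;
	p->thermal_pressure = p->max_capacity - (unsigned long)capacity;
	pressure = p->thermal_pressure;

	pthread_mutex_unlock(&p->lock);
	return pressure;
}
```

With a 2 GHz CPU of capacity 1024, a s/w limit of 1.5 GHz gives a
pressure of 256. If the h/w then throttles to 1 GHz, the pressure becomes
512, and it drops back to 256 once the h/w limit is lifted while the s/w
limit is still in place. In the kernel, steps 1 and 2 would presumably
map to freq_qos_update_request() and arch_set_thermal_pressure(); where
the common code should live is the open question from this thread.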