On Thu, Aug 29, 2024 at 08:01:48AM -0500, Mario Limonciello wrote:
> On 8/29/2024 07:52, Andrea Righi wrote:
> > On Wed, Aug 28, 2024 at 08:27:44PM +0530, Gautham R. Shenoy wrote:
> > > Hello Andrea,
> > >
> > > On Wed, Aug 28, 2024 at 08:20:50AM +0200, Andrea Righi wrote:
> > > > On Wed, Aug 28, 2024 at 10:38:45AM +0530, Gautham R. Shenoy wrote:
> > > > ...
> > > > > > I had thought this was a malfunction in the behavior that it reflected the
> > > > > > current status, not the hardware /capability/.
> > > > > >
> > > > > > Which one makes more sense for userspace? In my mind the most likely
> > > > > > consumer of this information would be something a sched_ext based userspace
> > > > > > scheduler. They would need to know whether the scheduler was using
> > > > > > preferred cores; not whether the hardware supported it.
> > > > >
> > > > > The commandline parameter currently impacts only the fair sched-class
> > > > > tasks since the preference information gets used only during
> > > > > load-balancing.
> > > > >
> > > > > IMO, the same should continue with sched-ext, i.e. if the user has
> > > > > explicitly disabled prefcore support via commandline, then no sched-ext
> > > > > scheduler should use the preference information to make task placement
> > > > > decisions. However, I would like to see what the sched-ext folks have
> > > > > to say. Adding some of them to the Cc list.
> > > >
> > > > IMHO it makes more sense to reflect the real state of prefcore support
> > > > from a "system" perspective, more than a "hardware" perspective, so if
> > > > it's disabled via boot command line it should show disabled.
> > > >
> > > > From a user-space scheduler perspective we should be fine either way, as
> > > > long as the ABI is clearly documented, since we also have access to
> > > > /proc/cmdline and we would be able to figure out if the user has
> > > > disabled it via cmdline (however, the preference is still to report the
> > > > actual system status).
> > >
> > > Thank you for confirming this.
> > >
> > > >
> > > > Question: does having prefcore enabled also affect the value of
> > > > scaling_max_freq? Like an `lscpu -e`, for example, would show a higher
> > > > max frequency for the specific preferred cores? (this is another useful
> > > > piece of information from a sched_ext scheduler perspective).
> > >
> > > Since the scaling_max_freq is computed based on the boost-numerator,
> > > at least from this patchset, the numerator would be the same across
> > > all kinds of cores, and thus the scaling_max_freq reported will be the
> > > same across all the cores.
> >
> > I see, so IIUC from user-space the most reliable way to detect the
> > fastest cores is to check amd_pstate_highest_perf / amd_pstate_max_freq,
> > right? I'm trying to figure out a way to abstract and generalize the
> > concept of "fast cores" in sched_ext.
>
> Right now the best way to do this is to look at the
> amd_pstate_prefcore_ranking file.

Ok.

>
> In this series there has been some discussion of dropping it though in favor
> of looking at the highest perf file. I don't believe we've concluded one
> way or another on it yet though.
>
> >
> > Also, is this something that has changed recently? I see this on an
> > AMD Ryzen Threadripper PRO 7975WX 32-Cores running a 6.8 kernel:
> >
> > $ uname -r
> > 6.8.0-40-generic
>
> You're missing the preferred core patches on this kernel. They landed in
> 6.9, it's better to upgrade to 6.10.y or 6.11-rc.

So, if I move to 6.9+ I should see the same max frequency across all the
CPUs and I can use amd_pstate_prefcore_ranking to determine the subset of
fast cores.

Thanks for the clarification.

-Andrea
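
P.S. For reference, here is a rough sketch of how a user-space tool might pick
the "fast cores" subset from the ranking file. It is only an illustration
under a few assumptions: that the attribute is exposed per CPU as
/sys/devices/system/cpu/cpuN/cpufreq/amd_pstate_prefcore_ranking, that a
larger value means a more preferred core, and that "fast" simply means
"shares the highest ranking". The helper names (read_rankings, fast_cpus) are
made up for this example and are not part of any existing tool. If the series
ends up dropping the ranking file, the attribute name passed in would have to
change accordingly.

```python
#!/usr/bin/env python3
"""Sketch: find the CPUs with the highest amd-pstate preferred-core ranking."""
import glob
import os
import re

# Assumed default sysfs location and attribute name (see the note above).
SYSFS_CPU_ROOT = "/sys/devices/system/cpu"
RANKING_ATTR = "amd_pstate_prefcore_ranking"


def read_rankings(root=SYSFS_CPU_ROOT, attr=RANKING_ATTR):
    """Return {cpu_id: ranking} for every CPU that exposes the attribute."""
    rankings = {}
    pattern = os.path.join(root, "cpu[0-9]*", "cpufreq", attr)
    for path in glob.glob(pattern):
        # The first path component below root is "cpuN".
        match = re.match(r"cpu(\d+)", os.path.relpath(path, root))
        if match is None:
            continue
        try:
            with open(path) as f:
                rankings[int(match.group(1))] = int(f.read().strip())
        except (OSError, ValueError):
            # Unreadable or non-numeric attribute: skip this CPU.
            continue
    return rankings


def fast_cpus(rankings):
    """Return the sorted CPU ids sharing the highest ranking value.

    Assumption: larger ranking values mean more preferred cores.
    """
    if not rankings:
        return []
    best = max(rankings.values())
    return sorted(cpu for cpu, rank in rankings.items() if rank == best)


if __name__ == "__main__":
    found = read_rankings()
    if not found:
        print("no prefcore ranking exposed (older kernel or different driver?)")
    else:
        print("fast CPUs:", fast_cpus(found))
```

Note that if every CPU reports the same ranking, fast_cpus() returns all of
them, so a caller would need to treat that case as "no distinct fast cores".
The ranking may also change at runtime, so a scheduler would likely want to
re-read it rather than cache the result forever.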