On Wed, May 06, 2020 at 03:36:54PM -0700, Alexander Duyck wrote:
> On Wed, May 6, 2020 at 3:21 PM Daniel Jordan <daniel.m.jordan@xxxxxxxxxx> wrote:
> >
> > On Tue, May 05, 2020 at 07:55:43AM -0700, Alexander Duyck wrote:
> > > One question about this data. What is the power management
> > > configuration on the systems when you are running these tests? I'm
> > > just curious if CPU frequency scaling, C states, and turbo are
> > > enabled?
> >
> > Yes, intel_pstate is loaded in active mode without hwp and with turbo enabled
> > (those power management docs are great by the way!) and intel_idle is in use
> > too.
> >
> > > I ask because that is what I have seen usually make the
> > > difference in these kind of workloads as the throughput starts
> > > dropping off as you start seeing the core frequency lower and more
> > > cores become active.
> >
> > If I follow, you're saying there's a chance performance would improve with the
> > above disabled, but how often would a system be configured that way? Even if
> > it were faster, the machine is configured how it's configured, or am I missing
> > your point?
>
> I think you might be missing my point. What I was getting at is that I
> know for performance testing sometimes C states and P states get
> disabled in order to get consistent results between runs, it sounds
> like you have them enabled though. I was just wondering if you had
> disabled them or not. If they were disabled then you wouldn't get the
> benefits of turbo and as such adding more cores wouldn't come at a
> penalty, while with it enabled the first few cores should start to
> slow down as they fell out of turbo mode. So it may be part of the
> reason why you are only hitting about 10x at full core count.

All right, that makes way more sense.

> As it stands I think your code may speed up a bit if you split the
> work up based on section instead of max order. That would get rid of
> any cache bouncing you may be doing on the pageblock flags and reduce
> the overhead for splitting the work up into individual pieces since
> each piece will be bigger.

See my other mail.
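
For a rough sense of scale on the section versus max order suggestion, here is a small sketch that counts how many work pieces each granularity produces for a given number of pages. The constants are assumptions for illustration only (4 KiB pages, a largest buddy block of 2^10 pages, 128 MiB sparsemem sections, as is typical on x86_64); they are not taken from the patch, and nr_chunks() is a hypothetical helper, not a kernel function.

```c
/* Assumed values for illustration, not taken from the patch under discussion. */
#define EX_PAGE_SHIFT        12  /* 4 KiB pages */
#define EX_MAX_ORDER_SHIFT   10  /* largest buddy block: 2^10 pages (4 MiB) */
#define EX_SECTION_SIZE_BITS 27  /* sparsemem section: 128 MiB */

/* Pages per section under the assumptions above: 2^(27 - 12) = 2^15. */
#define EX_SECTION_PAGE_SHIFT (EX_SECTION_SIZE_BITS - EX_PAGE_SHIFT)

/*
 * Hypothetical helper: number of work pieces needed to cover nr_pages
 * when each piece holds 2^chunk_shift pages, rounding up.
 */
static unsigned long long nr_chunks(unsigned long long nr_pages,
                                    unsigned int chunk_shift)
{
    unsigned long long chunk_pages = 1ULL << chunk_shift;

    return (nr_pages + chunk_pages - 1) / chunk_pages;
}
```

Under these assumptions, 64 GiB of memory (2^24 pages) splits into 16384 max-order pieces but only 512 section-sized pieces, so the per-piece dispatch overhead is paid 32 times less often. Larger pieces also make it less likely that two CPUs update pageblock flags that sit close together in the same bitmap.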