Re: [PATCH v3 1/1] x86,sched: On AMD EPYC set freq_max = max_boost in schedutil invariant formula

On Wed, 2021-02-03 at 19:25 +0100, Rafael J. Wysocki wrote:
> [cut]
> 
> So below is a prototype of an alternative fix for the issue at hand.
> 
> I can't really test it here, because there's no _CPC in the ACPI tables of my
> test machines, so testing it would be appreciated.  However, AFAICS these
> machines are affected by the performance issue related to the scale-invariance
> when they are running acpi-cpufreq, so what we are doing here is not entirely
> sufficient.
> 
> It looks like the scale-invariance code should ask the cpufreq driver about
> the maximum frequency and note that cpufreq drivers may be changed on the
> fly.
> 
> What the patch below does is to add an extra entry to the frequency table for
> each CPU to represent the maximum "boost" frequency, so as to cause that
> frequency to be used as cpuinfo.max_freq.
> 
> The reason why I think it is better to extend the frequency tables instead
> of simply increasing the frequency for the "P0" entry is because the latter
> may cause "turbo" frequency to be asked for less often.
> 
> ---
>  drivers/cpufreq/acpi-cpufreq.c |  107 ++++++++++++++++++++++++++++++++++++-----
>  1 file changed, 95 insertions(+), 12 deletions(-)

Hello Rafael,

thanks for looking at this. Your patch is indeed cleaner than the one I proposed.

Preliminary testing is favorable; more tests are running.

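Results from your patch are in the fourth column below (v5.11-rc6-rafael).
Performance is back at v5.10 levels: every thread count is within 2% of v5.10,
and the slowdown of up to 46% seen on v5.11-rc4 at 48 threads and above is gone.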

I'll follow up once the tests I queued are completed.

Giovanni


TEST        : Intel Open Image Denoise, www.openimagedenoise.org
INVOCATION  : ./denoise -hdr memorial.pfm -out out.pfm -bench 200 -threads $NTHREADS
CPU         : MODEL            : 2x AMD EPYC 7742
              FREQUENCY TABLE  : P2: 1.50 GHz
                                 P1: 2.00 GHz
                                 P0: 2.25 GHz
              MAX BOOST        :     3.40 GHz
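
For reference, a rough illustration (plain Python, not kernel code) of what
extending the frequency table means on this machine. It assumes that
cpuinfo.max_freq ends up as the largest entry in the table; the function names
are made up for the example, and the frequencies are the ones listed above.

```python
# Frequencies in kHz, taken from the FREQUENCY TABLE above (P2, P1, P0).
PSTATES_KHZ = [1_500_000, 2_000_000, 2_250_000]
MAX_BOOST_KHZ = 3_400_000


def table_max_freq(table):
    """Assumption: cpuinfo.max_freq is the largest frequency in the table."""
    return max(table)


def extend_with_boost(table, boost_khz):
    """Hypothetical helper: append a boost entry if it exceeds every P-state."""
    if boost_khz > max(table):
        return table + [boost_khz]
    return list(table)


before = table_max_freq(PSTATES_KHZ)
after = table_max_freq(extend_with_boost(PSTATES_KHZ, MAX_BOOST_KHZ))

print(f"max_freq without boost entry: {before} kHz")
print(f"max_freq with boost entry:    {after} kHz")
print(f"boost / P0 ratio:             {after / before:.2f}")
```

On this CPU the max boost is about 1.5 times the P0 frequency, which is the gap
between the two possible values of freq_max.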

Results: threads, msecs (ratio). Lower is better.

               v5.10          v5.11-rc4   v5.11-rc4-ggherdov v5.11-rc6-rafael
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      1   1069.85 (1.00)   1071.84 (1.00)   1070.42 (1.00)   1069.12 (1.00)
      2    542.24 (1.00)    544.40 (1.00)    544.48 (1.00)    540.81 (1.00)
      4    278.00 (1.00)    278.44 (1.00)    277.72 (1.00)    277.79 (1.00)
      8    149.81 (1.00)    149.61 (1.00)    149.87 (1.00)    149.51 (1.00)
     16     79.01 (1.00)     79.31 (1.00)     78.94 (1.00)     79.02 (1.00)
     24     58.01 (1.00)     58.51 (1.01)     58.15 (1.00)     57.84 (1.00)
     32     46.58 (1.00)     48.30 (1.04)     46.66 (1.00)     46.70 (1.00)
     48     37.29 (1.00)     51.29 (1.38)     37.27 (1.00)     38.10 (1.02)
     64     34.01 (1.00)     49.59 (1.46)     33.71 (0.99)     34.51 (1.01)
     80     31.09 (1.00)     44.27 (1.42)     31.33 (1.01)     31.11 (1.00)
     96     28.56 (1.00)     40.82 (1.43)     28.47 (1.00)     28.65 (1.00)
    112     28.09 (1.00)     40.06 (1.43)     28.63 (1.02)     28.38 (1.01)
    120     28.73 (1.00)     39.78 (1.38)     28.14 (0.98)     28.16 (0.98)
    128     28.93 (1.00)     39.60 (1.37)     29.38 (1.02)     28.55 (0.99)
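
The ratios in parentheses are each timing divided by the v5.10 timing for the
same thread count. A minimal sketch to recompute them, using three rows copied
from the table above (the variable and function names are only for this
example):

```python
# Timings in msecs, copied from the 48-, 64- and 128-thread rows above.
KERNELS = ["v5.10", "v5.11-rc4", "v5.11-rc4-ggherdov", "v5.11-rc6-rafael"]
TIMINGS = {
    48: [37.29, 51.29, 37.27, 38.10],
    64: [34.01, 49.59, 33.71, 34.51],
    128: [28.93, 39.60, 29.38, 28.55],
}


def ratios(row):
    """Divide every timing by the first column (the v5.10 baseline)."""
    baseline = row[0]
    return [round(t / baseline, 2) for t in row]


for threads, row in TIMINGS.items():
    cells = "   ".join(
        f"{t:7.2f} ({r:.2f})" for t, r in zip(row, ratios(row))
    )
    print(f"{threads:4d}   {cells}")
```

A ratio above 1.00 means the kernel in that column is slower than v5.10 at that
thread count.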


