Re: NUMA balancing degrading performance

Hi Martin,

I had some time to run NPB.

LU-HP is not available in NPB 3.3.1, so I used LU instead.

Here are my results:

LU Benchmark Completed.
 Class           =                        C
 Size            =            162x 162x 162
 Iterations      =                      250
 Time in seconds =                    XX.XX
 Total threads   =                       80
 Avail threads   =                       80
 Mop/s total     =                 51420.84
 Mop/s/thread    =                   642.76
 Operation type  =           floating point
 Verification    =               SUCCESSFUL
 Version         =                    3.3.1
 Compile date    =              28 Oct 2014

 Compile options:
    F77          = gfortran
    FLINK        = $(F77)
    F_LIB        = (none)
    F_INC        = (none)
    FFLAGS       = -O3 -fopenmp -mcmodel=medium
    FLINKFLAGS   = -O3 -fopenmp
    RAND         = (none)
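
For reference, this is roughly how the run can be reproduced (a sketch assuming the stock NPB3.3-OMP build system and the gfortran settings above; directory and target names may differ on other setups):

  # Build the OpenMP LU benchmark, class C, using the flags shown above
  # (config/make.def copied from config/make.def.template and edited to match).
  cd NPB3.3.1/NPB3.3-OMP
  make lu CLASS=C

  # Run on all 80 hardware threads.
  export OMP_NUM_THREADS=80
  ./bin/lu.C.x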

With NUMA balancing disabled
(sudo bash -c "echo 0 > /proc/sys/kernel/numa_balancing"):

1st run:  Time in seconds =                    39.65
2nd run:  Time in seconds =                    39.47
3rd run:  Time in seconds =                    41.31
4th run:  Time in seconds =                    40.42

The measurements without NUMA balancing are stable, at around 40 seconds.

With NUMA balancing enabled
(sudo bash -c "echo 1 > /proc/sys/kernel/numa_balancing"):

1st run: Time in seconds =                    53.89
2nd run: Time in seconds =                    51.95
3rd run: Time in seconds =                    56.22
4th run: Time in seconds =                    64.20

Enabling this option increases the runtime by more than 50% in the worst case.
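
In case anyone wants to repeat the comparison, here is a minimal sketch of the loop (assuming the lu.C.x binary built as above; writing the sysctl needs root):

  # Run the benchmark four times with automatic NUMA balancing off,
  # then four times with it on.
  for mode in 0 1; do
      sudo bash -c "echo $mode > /proc/sys/kernel/numa_balancing"
      for run in 1 2 3 4; do
          OMP_NUM_THREADS=80 ./bin/lu.C.x | grep "Time in seconds"
      done
  done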

Here is some information about the hardware:

Kernel: Linux inwest 3.16.4-1-ARCH #1 SMP PREEMPT Mon Oct 6 08:22:27
CEST 2014 x86_64 GNU/Linux
CPU: Intel(R) Xeon(R) CPU E7- 4850  @ 2.00GHz

numactl --hardware
available: 4 nodes (0-3)
node 0 cpus: 0 1 2 3 4 5 6 7 8 9 40 41 42 43 44 45 46 47 48 49
node 0 size: 64427 MB
node 0 free: 63912 MB
node 1 cpus: 10 11 12 13 14 15 16 17 18 19 50 51 52 53 54 55 56 57 58 59
node 1 size: 64509 MB
node 1 free: 64066 MB
node 2 cpus: 20 21 22 23 24 25 26 27 28 29 60 61 62 63 64 65 66 67 68 69
node 2 size: 64509 MB
node 2 free: 63987 MB
node 3 cpus: 30 31 32 33 34 35 36 37 38 39 70 71 72 73 74 75 76 77 78 79
node 3 size: 64509 MB
node 3 free: 64035 MB
node distances:
node   0   1   2   3
  0:  10  21  21  21
  1:  21  10  21  21
  2:  21  21  10  21
  3:  21  21  21  10
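
If it helps, the kernel's NUMA balancing activity can also be sampled from /proc/vmstat around a run, to see how much hint-fault and page-migration work happens while the benchmark is slow (a rough sketch; these counters should be exposed on 3.16 kernels built with CONFIG_NUMA_BALANCING):

  # Snapshot the automatic NUMA balancing counters before and after a run.
  grep -E '^numa_(pte_updates|hint_faults|hint_faults_local|pages_migrated)' /proc/vmstat
  OMP_NUM_THREADS=80 ./bin/lu.C.x > /dev/null
  grep -E '^numa_(pte_updates|hint_faults|hint_faults_local|pages_migrated)' /proc/vmstat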

Regards,
Andreas

2014-10-27 22:27 GMT+01:00 Martin Ichilevici de Oliveira
<iomartin@xxxxxxxxxxxx>:
> Hello Andreas,
>
> Thank you for your reply. Please check my comments inline.
>
>> it would be good to know which applications/benchmarks you were running.
>>
>> Have you tried out some well known and open source benchmarks?
>>
>> NAS Parallel Benchmarks -
>> http://www.nas.nasa.gov/publications/npb.html (Fortran Code)
>> NPB2.3-omp-C.tgz (C version NPB in OpenMP) -
>> http://www.hpcs.cs.tsukuba.ac.jp/omni-compiler/download/NPB2.3-omp-C.tgz
>> Stream - http://www.cs.virginia.edu/stream/FTP/Code/stream.c
>
> Sorry, I should have mentioned that. I tried some NAS benchmarks:
> bt, sp and lu-hp. bt and sp were around 60% slower with the balancing
> turned on, and lu-hp was 10 times slower.
>
> I also ran Lulesh, which was roughly 100% slower with the balancing
> turned on.
>
>> Do you have "numad" running on your machine? If it is running you
>> should stop it.
>
> I checked and it's not running.
>
> Cheers,
> Martin



