cache-shift vs x86_generic

folks,

in arch/i386/Kconfig, we have a situation where X86_GENERIC has undue influence
on X86_L1_CACHE_SHIFT:

config X86_L1_CACHE_SHIFT
      int
      default "7" if MPENTIUM4 || X86_GENERIC
      default "4" if X86_ELAN || M486 || M386
default "5" if MWINCHIP3D || MWINCHIP2 || MWINCHIPC6 || MCRUSOE || MEFFICEON || MCYRIXIII || MK6 || MPENTIUMIII || MPENTIUMII || M686 || M586MMX || M586TSC || M586 || MVIAC3_2 || MGEODEGX1
      default "6" if MK7 || MK8 || MPENTIUMM

that is, when X86_GENERIC == true --> default = 7,
ignoring the platform choice *made* by the user-builder.
(Kconfig takes the first 'default' whose condition is true, so the
X86_GENERIC term on the topmost line wins over every CPU-specific line
below it.  the obvious fix would be to move X86_GENERIC below the
CPU-specific defaults, or drop it from the "7" line entirely.)


I have a Geode-based board (Soekris 4801); I've built the kernel
both ways (generic=on/off) and run lmbench against each.  I saw some
potentially interesting differences in the results files, but I don't
know how to interpret them.  Here are a few snippets.

< [lmbench3.0 results for Linux soekris 2.6.13-ski2-cache-v1 #3 Fri Sep 23 13:14:30 MDT 2005 i586 GNU/Linux]
---
> [lmbench3.0 results for Linux soekris 2.6.13-ski2-v1 #1 Fri Sep 23 13:24:45 MDT 2005 i586 GNU/Linux]
36,39c36,39
< [RELEASE: 2.6.13-ski2-cache-v1]
< [VERSION: #3 Fri Sep 23 13:14:30 MDT 2005]
< [Mon Sep 26 11:16:58 MDT 2005]
< [ 11:16:58 up 2:35, 1 user, load average: 0.60, 0.17, 0.05]
---
> [RELEASE: 2.6.13-ski2-v1]
> [VERSION: #1 Fri Sep 23 13:24:45 MDT 2005]
> [Mon Sep 26 12:25:47 MDT 2005]
> [ 12:25:47 up 3 min, 1 user, load average: 0.60, 0.29, 0.11]
44c44
< [net: eth0 1500 0 100432 0 0 0 81127 0 0 0 BMRU]
---
> [net: eth0 1500 0 40160 0 0 0 25721 0 0 0 BMRU]
48,49c48,49
< [if: RX packets:100579 errors:0 dropped:0 overruns:0 frame:0]
< [if: TX packets:81243 errors:0 dropped:0 overruns:0 carrier:0]
---
> [if: RX packets:40255 errors:0 dropped:0 overruns:0 frame:0]
> [if: TX packets:25817 errors:0 dropped:0 overruns:0 carrier:0]
51c51
< [if: RX bytes:48422020 (46.1 MiB)  TX bytes:13023302 (12.4 MiB)]
---
> [if: RX bytes:31615232 (30.1 MiB)  TX bytes:4460702 (4.2 MiB)]
64c64
< [net: lo 16436 0 84 0 0 0 84 0 0 0 LRU]
---
> [net: lo 16436 0 36 0 0 0 36 0 0 0 LRU]
68,69c68,69
< [if: RX packets:84 errors:0 dropped:0 overruns:0 frame:0]
< [if: TX packets:84 errors:0 dropped:0 overruns:0 carrier:0]
---
> [if: RX packets:36 errors:0 dropped:0 overruns:0 frame:0]
> [if: TX packets:36 errors:0 dropped:0 overruns:0 carrier:0]
71c71
< [if: RX bytes:12056 (11.7 KiB)  TX bytes:12056 (11.7 KiB)]
---
> [if: RX bytes:2844 (2.7 KiB)  TX bytes:2844 (2.7 KiB)]


Those are some significant differences, if they're throughput numbers
(though note the uptimes above: 2:35 vs 3 minutes, so the raw
interface counters may mostly reflect how long each kernel had been
running).  Below, the timings seem to agree with the better
performance of the -cache kernel:

86,107c86,107
< Simple syscall: 1.6462 microseconds
< Simple read: 5.3041 microseconds
< Simple write: 4.6366 microseconds
< Simple stat: 223.7200 microseconds
< Simple fstat: 8.6939 microseconds
< Simple open/close: 2535.0000 microseconds
< Select on 10 fd's: 13.8254 microseconds
< Select on 100 fd's: 110.5490 microseconds
< Select on 250 fd's: 231.7619 microseconds
< Select on 500 fd's: 550.9000 microseconds
< Select on 10 tcp fd's: 15.3956 microseconds
< Select on 100 tcp fd's: 145.9211 microseconds
< Select on 250 tcp fd's: 371.5714 microseconds
< Select on 500 tcp fd's: 746.0000 microseconds
< Signal handler installation: 9.3942 microseconds
< Signal handler overhead: 35.6667 microseconds
< Protection fault: 1.9708 microseconds
< Pipe latency: 129.5962 microseconds
< AF_UNIX sock stream latency: 267.0952 microseconds
< Process fork+exit: 3620.0000 microseconds
< Process fork+execve: 16960.0000 microseconds
< Process fork+/bin/sh -c: 61487.0000 microseconds
---
> Simple syscall: 1.8362 microseconds
> Simple read: 8.4718 microseconds
> Simple write: 7.2812 microseconds
> Simple stat: 210.5769 microseconds
> Simple fstat: 10.1660 microseconds
> Simple open/close: 2549.3333 microseconds
> Select on 10 fd's: 13.8471 microseconds
> Select on 100 fd's: 111.6400 microseconds
> Select on 250 fd's: 232.0000 microseconds
> Select on 500 fd's: 551.7000 microseconds
> Select on 10 tcp fd's: 14.3761 microseconds
> Select on 100 tcp fd's: 149.2162 microseconds
> Select on 250 tcp fd's: 370.3571 microseconds
> Select on 500 tcp fd's: 722.3750 microseconds
> Signal handler installation: 9.8043 microseconds
> Signal handler overhead: 34.1729 microseconds
> Protection fault: 6.8015 microseconds
> Pipe latency: 132.9220 microseconds
> AF_UNIX sock stream latency: 272.5789 microseconds
> Process fork+exit: 3501.0000 microseconds
> Process fork+execve: 16546.0000 microseconds
> Process fork+/bin/sh -c: 54099.0000 microseconds


lmbench's various math throughput benchmarks didn't show any real
diffs, but these 'parallelism' numbers don't make any sense to me
(a sketch of what I think they measure follows the numbers):


127,129c127,129
< integer add parallelism: 1.04
< integer mul parallelism: 1.11
< integer div parallelism: 1.01
---
> integer add parallelism: 1.15
> integer mul parallelism: 1.04
> integer div parallelism: 1.14
133,135c133,135
< int64 mul parallelism: 1.09
< int64 div parallelism: 1.01
< int64 mod parallelism: 1.00
---
> int64 mul parallelism: 1.00
> int64 div parallelism: 1.00
> int64 mod parallelism: 1.01
142,143c142,143
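
About those parallelism numbers: my reading (an assumption on my part,
I haven't checked the lmbench sources) is that they compare the per-op
latency of one dependent chain of operations against the per-op time
when several independent chains run together; a value near 1.0 means
the CPU isn't overlapping the ops.  A minimal C sketch of that idea
(not lmbench's actual code; compile with -O0 and link with -lrt if
your glibc wants it, and take it as rough, since an optimizer can
collapse these loops):

#include <stdio.h>
#include <time.h>

#define N 50000000L

static double now(void)
{
        struct timespec ts;
        clock_gettime(CLOCK_MONOTONIC, &ts);
        return ts.tv_sec + ts.tv_nsec / 1e9;
}

int main(void)
{
        unsigned long a = 1, b = 1, c = 1, d = 1;
        volatile unsigned long sink;
        double t0, dep, par;
        long i;

        t0 = now();
        for (i = 0; i < N; i++)         /* one dependent chain: every add */
                a += a ^ i;             /* waits on the previous result   */
        dep = (now() - t0) / N;
        sink = a;

        a = b = c = d = 1;
        t0 = now();
        for (i = 0; i < N; i++) {       /* four independent chains the    */
                a += a ^ i;             /* CPU is free to overlap         */
                b += b ^ i;
                c += c ^ i;
                d += d ^ i;
        }
        par = (now() - t0) / (4 * N);
        sink = a + b + c + d;
        (void)sink;

        printf("integer add parallelism (approx): %.2f\n", dep / par);
        return 0;
}

Either way, L1_CACHE_SHIFT shouldn't touch register-only arithmetic,
so I'd be inclined to read those 1.0x-1.1x differences as run-to-run
noise.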


Anyway,

Q: should X86_GENERIC trump CPU choice in determining L1_CACHE_SHIFT?
N:
  the user chose the CPU, right or wrong.  follow those instructions.
  GENERIC has other purposes (enabling auto-tweaks?)
  don't mess with CACHE_SHIFT on a whim, it's an important factor!?
  (it feeds structure padding and alignment; see the sketch below)
Y:
  GENERIC is a backup, giving 'better' performance over a range of hardware.
  it must have some effect, after all (what, besides affecting this choice, does it do?)
  CACHE_SHIFT is just a guess anyway; the hardware does what it does.
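
One concrete effect for the N case: L1_CACHE_SHIFT defines
L1_CACHE_BYTES (1 << shift), which the kernel uses to pad and align
hot structures (____cacheline_aligned and friends).  A userspace toy
of mine (hypothetical struct name, not kernel code) showing what
overstating the shift costs:

#include <stdio.h>

#define L1_CACHE_SHIFT 7        /* try 5 (Geode) vs 7 (generic) */
#define L1_CACHE_BYTES (1 << L1_CACHE_SHIFT)

/* stand-in for a kernel structure padded out to a full line so that
 * two CPUs never bounce it between them */
struct hot_counter {
        long count;
} __attribute__((aligned(L1_CACHE_BYTES)));

int main(void)
{
        printf("shift %d -> %d byte lines, sizeof(struct hot_counter) = %lu\n",
               L1_CACHE_SHIFT, L1_CACHE_BYTES,
               (unsigned long)sizeof(struct hot_counter));
        return 0;
}

With shift 7 that struct occupies 128 bytes; with shift 5, 32 bytes.
So generic=on makes a Geode build spend 4x the memory on every such
padded object, which I assume is part of what's being traded off here.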

the argument could continue.. can anybody illuminate the issues?

Q: is it worth submitting as a patch?
I have 'attachment' issues, so I'd prefer to test the idea before
prepping the patch.

Q: what do these lmbench results tell an informed user?
Q: what benchmarks are most appropriate for observing CACHE_LINE_SIZE
effects?  (one home-grown idea is sketched below)
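
For that second question, the simplest home-grown probe I know of (my
own sketch under my own assumptions, not a vetted benchmark) is a
stride walk: touch one byte every 'stride' bytes across a buffer much
larger than any cache.  Per-touch cost should climb with the stride
and level off once the stride reaches the line size, since from that
point on every touch misses on a line of its own:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

#define BUFSIZE (16 * 1024 * 1024)      /* far larger than any cache here */

static double now(void)
{
        struct timespec ts;
        clock_gettime(CLOCK_MONOTONIC, &ts);
        return ts.tv_sec + ts.tv_nsec / 1e9;
}

int main(void)
{
        char *buf = malloc(BUFSIZE);
        int stride;

        memset(buf, 1, BUFSIZE);        /* fault all the pages in first */
        for (stride = 8; stride <= 512; stride *= 2) {
                volatile char sum = 0;
                long i, touches = 0;
                double t0 = now();

                for (i = 0; i < BUFSIZE; i += stride) {
                        sum += buf[i];
                        touches++;
                }
                printf("stride %4d: %7.2f ns/touch\n",
                       stride, (now() - t0) / touches * 1e9);
        }
        free(buf);
        return 0;
}

The knee in that curve should land at the hardware's real line size
(32 bytes for the Geode, going by the "5" line in the Kconfig above),
regardless of what the kernel was built with; lmbench's own lat_mem_rd
does a fancier version of the same walk.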



--
Kernelnewbies: Help each other learn about the Linux kernel.
Archive:       http://mail.nl.linux.org/kernelnewbies/
FAQ:           http://kernelnewbies.org/faq/

