On Tue, Jul 6, 2021 at 7:15 PM Yassine Oudjana <y.oudjana@xxxxxxxxxxxxxx> wrote:
> > (the numactl command helps run this both on the 'big' and 'little'
> > cores without running into migration)
> >
> >        Arnd
>
> Here are the results:

Thanks, that was quick.

> $ numactl -C 0 line -M 1M
> 128
> $ numactl -C 3 line -M 1M
> 128
> $ numactl -C 0 cache
> L1 cache: 512 bytes 1.37 nanoseconds 64 linesize -1.00 parallelism
> L2 cache: 24576 bytes 2.75 nanoseconds 64 linesize 5.06 parallelism
> L3 cache: 131072 bytes 7.89 nanoseconds 64 linesize 3.85 parallelism
> L4 cache: 524288 bytes 15.86 nanoseconds 128 linesize 3.48 parallelism
> Memory latency: 145.93 nanoseconds 4.88 parallelism
> $ numactl -C 3 cache
> L1 cache: 24576 bytes 1.29 nanoseconds 64 linesize 5.00 parallelism
> L2 cache: 1048576 bytes 8.60 nanoseconds 128 linesize 3.07 parallelism
> Memory latency: 143.29 nanoseconds 5.37 parallelism

This is still somewhat inconclusive, but it does give some hope. The
data that I found on random web sites was

- 32KB L1, 2MB/1MB L2 [1][2]
- 16KB L1, 1.5MB L2 [3]
- 32KB L1, 1MB/512KB L2 [4]

so none of the sizes really line up. My best guess is that the actual
hierarchy is a 1MB per-core L2 cache on the two big CPUs and a 512KB
per-core L2 cache on the two little ones, with no shared L2 or L3.
The older Krait had a 4KB L0 cache, which could explain the 512-byte
L1 output.

Can you rerun the 'line' test with '-M 128K' to see if that confirms
the 64-byte L1 line size that the 'cache' test reported?

       Arnd

[1] https://en.wikipedia.org/wiki/List_of_Qualcomm_Snapdragon_processors#Snapdragon_820_and_821_(2016)
[2] https://en.wikipedia.org/wiki/Kryo
[3] https://www.geektopia.es/es/product/qualcomm/snapdragon-820/
[4] https://www.anandtech.com/show/9837/snapdragon-820-preview/2