Re: mempool and cacheline ping pong

Hi Loïc,
Per our email discussion, I'm happy to help.  If you or anyone else can run perf c2c, I'll analyze the results and reply back with the findings.

The perf c2c output is a bit non-intuitive, but it conveys a lot, and I'm happy to walk through what it shows.

Here's what I recommend:
 1) Get on an Intel system where you're pushing Ceph really hard. (AMD uses different low-level perf events that haven't been ported over yet.)
 2) Make sure the Ceph code you're running has debug info in it and isn't stripped.
 3) This needs to be run on bare metal.  The PEBS perf events used by c2c are not supported in a virtualized guest (Intel says support is coming in newer CPUs).  (A quick sanity check for steps 1-3 is sketched right after the lftp example below.)
 4) As an fyi, the less CPU pinning you do, the more cacheline contention c2c will expose.
 5) Once you run the commands that I've appended below (as root), tar up everything, data files and all, and lftp them to the location below:

    $ lftp dropbox.redhat.com
    > cd /incoming
    > put unique-filename
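
As a rough sanity check for steps 1-3, something like this should flag the common problems up front.  It's only a sketch -- the ceph-osd path is just an example, so point it at whichever Ceph binary you're actually profiling:

    # "hypervisor" in the /proc/cpuinfo flags means we're in a guest, where PEBS won't work
    grep -q hypervisor /proc/cpuinfo && echo "WARNING: virtualized guest, PEBS events likely unavailable"
    grep -q GenuineIntel /proc/cpuinfo || echo "WARNING: not an Intel CPU"
    # Check that the binary still has its symbols (ceph-osd is just an example)
    file $(command -v ceph-osd) | grep -q "not stripped" || echo "WARNING: binary appears stripped"
    # Quick smoke test that perf c2c can record on this box
    perf c2c record -a -o /tmp/c2c_smoke.data sleep 1 && echo "perf c2c record works"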

Please let me know the names of the files after you put them there, and I'll grab them.
I just joined this list and I don't know if I'll get notified of replies, so send me an email when the files are there for me to grab.

Does that sound OK?
Holler if you have any questions.
Joe

    # First get some background system info 
    uname -a > uname.out
    lscpu > lscpu.out
    cat /proc/cmdline > cmdline.out
    timeout -s INT 10 vmstat -w 1 > vmstat.out

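    # Per-NUMA-node memory usage, plus per-process/thread status and NUMA placement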
    nodecnt=`lscpu|grep "NUMA node(" |awk '{print $3}'`
    for ((i=0; i<$nodecnt; i++))
    do
       cat /sys/devices/system/node/node${i}/meminfo > meminfo.$i.out
    done
    more `find /proc -name status` > proc_parent_child_status.out
    more /proc/*/numa_maps > numa_maps.out
    
    #
    # Get separate kernel and user perf-c2c stats
    #
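    # --ldlat=70 only samples loads that took at least 70 cycles (the default is 30),
    # which filters the samples down to the loads that are actually hurting.
    # "sleep N" just sets the length of the system-wide (-a) sampling window, so make
    # sure the Ceph workload is running while these record commands are collecting.
    # The -g runs also capture callchains; --full-symbols keeps symbol names untruncated.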
    perf c2c record -a --ldlat=70 --all-user -o perf_c2c_a_all_user.data sleep 5 
    perf c2c report --stdio -i perf_c2c_a_all_user.data > perf_c2c_a_all_user.out 2>&1
    perf c2c report --full-symbols --stdio -i perf_c2c_a_all_user.data > perf_c2c_full-sym_a_all_user.out 2>&1

    perf c2c record -g -a --ldlat=70 --all-user -o perf_c2c_g_a_all_user.data sleep 5 
    perf c2c report -g --stdio -i perf_c2c_g_a_all_user.data > perf_c2c_g_a_all_user.out 2>&1

    perf c2c record -a --ldlat=70 --all-kernel -o perf_c2c_a_all_kernel.data sleep 4 
    perf c2c report --stdio -i perf_c2c_a_all_kernel.data > perf_c2c_a_all_kernel.out 2>&1

    perf c2c record -g --ldlat=70 -a --all-kernel -o perf_c2c_g_a_all_kernel.data sleep 4 
    perf c2c report -g --stdio -i perf_c2c_g_a_all_kernel.data > perf_c2c_g_a_all_kernel.out 2>&1

    #
    # Get combined kernel and user perf-c2c stats
    #
    perf c2c record -a --ldlat=70 -o perf_c2c_a_both.data sleep 4 
    perf c2c report --stdio -i perf_c2c_a_both.data > perf_c2c_a_both.out 2>&1

    perf c2c record -g --ldlat=70 -a -o perf_c2c_g_a_both.data sleep 4
    perf c2c report -g --stdio -i perf_c2c_g_a_both.data > perf_c2c_g_a_both.out 2>&1

    #
    # Get all-user physical addr stats, in case multiple threads or processes are 
    # accessing shared memory with different vaddrs.
    #
    perf c2c record --phys-data -a --ldlat=70 --all-user -o perf_c2c_a_all_user_phys_data.data sleep 5 
    perf c2c report --stdio -i perf_c2c_a_all_user_phys_data.data > perf_c2c_a_all_user_phys_data.out 2>&1
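
    #
    # Package everything up for the lftp upload described above.
    # The tarball name is just an example -- pick something unique.
    #
    tar czf c2c_$(hostname)_$(date +%Y%m%d).tar.gz *.out *.data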