On Tue, Feb 14, 2017 at 4:12 AM, Jesper Dangaard Brouer <brouer@xxxxxxxxxx> wrote:
> It is important to understand that there are two cases for the cost of
> an atomic op, which depend on the cache-coherency state of the
> cacheline.
>
> Measured on Skylake i7-6700K CPU @ 4.00GHz
>
> (1) Local CPU atomic op : 27 cycles(tsc) 6.776 ns
> (2) Remote CPU atomic op: 260 cycles(tsc) 64.964 ns
>

Okay, it seems you guys really want a patch that I said was not giving
good results.

Let me publish the numbers I get, with and without the last (and not
official) patch.

If I _force_ the user-space process to run on the other node, then the
results are not the ones Alex or you are expecting.

With this patch I get about 2.7 Mpps on this silly single TCP flow, and
3.5 Mpps without it.

lpaa24:~# sar -n DEV 1 10 | grep eth0 | grep Ave
Average:         eth0 2699243.20  16663.70 1354783.36   1079.95      0.00      0.00      4.50

Profile of the CPU on NUMA node 1 (netserver consuming data):

    54.73%  [kernel]  [k] copy_user_enhanced_fast_string
    31.07%  [kernel]  [k] skb_release_data
     4.24%  [kernel]  [k] skb_copy_datagram_iter
     1.35%  [kernel]  [k] copy_page_to_iter
     0.98%  [kernel]  [k] _raw_spin_lock
     0.90%  [kernel]  [k] skb_release_head_state
     0.60%  [kernel]  [k] tcp_transmit_skb
     0.51%  [kernel]  [k] mlx4_en_xmit
     0.33%  [kernel]  [k] ___cache_free
     0.28%  [kernel]  [k] tcp_rcv_established

Profile of the CPU handling mlx4 softirqs (NUMA node 0):

    48.00%  [kernel]  [k] mlx4_en_process_rx_cq
    12.92%  [kernel]  [k] napi_gro_frags
     7.28%  [kernel]  [k] inet_gro_receive
     7.17%  [kernel]  [k] tcp_gro_receive
     5.10%  [kernel]  [k] dev_gro_receive
     4.87%  [kernel]  [k] skb_gro_receive
     2.45%  [kernel]  [k] mlx4_en_prepare_rx_desc
     2.04%  [kernel]  [k] __build_skb
     1.02%  [kernel]  [k] napi_reuse_skb.isra.95
     1.01%  [kernel]  [k] tcp4_gro_receive
     0.65%  [kernel]  [k] kmem_cache_alloc
     0.45%  [kernel]  [k] _raw_spin_lock

Without the latest patch (i.e. with the exact patch series v3 I
submitted), thus with the atomic_inc() in mlx4_en_process_rx_cq instead
of only reads:

lpaa24:~# sar -n DEV 1 10|grep eth0|grep Ave
Average:         eth0 3566768.50  25638.60 1790345.69   1663.51      0.00      0.00      4.50

Profiles of the two CPUs:

    74.85%  [kernel]  [k] copy_user_enhanced_fast_string
     6.42%  [kernel]  [k] skb_release_data
     5.65%  [kernel]  [k] skb_copy_datagram_iter
     1.83%  [kernel]  [k] copy_page_to_iter
     1.59%  [kernel]  [k] _raw_spin_lock
     1.48%  [kernel]  [k] skb_release_head_state
     0.72%  [kernel]  [k] tcp_transmit_skb
     0.68%  [kernel]  [k] mlx4_en_xmit
     0.43%  [kernel]  [k] page_frag_free
     0.38%  [kernel]  [k] ___cache_free
     0.37%  [kernel]  [k] tcp_established_options
     0.37%  [kernel]  [k] __ip_local_out

    37.98%  [kernel]  [k] mlx4_en_process_rx_cq
    26.47%  [kernel]  [k] napi_gro_frags
     7.02%  [kernel]  [k] inet_gro_receive
     5.89%  [kernel]  [k] tcp_gro_receive
     5.17%  [kernel]  [k] dev_gro_receive
     4.80%  [kernel]  [k] skb_gro_receive
     2.61%  [kernel]  [k] __build_skb
     2.45%  [kernel]  [k] mlx4_en_prepare_rx_desc
     1.59%  [kernel]  [k] napi_reuse_skb.isra.95
     0.95%  [kernel]  [k] tcp4_gro_receive
     0.51%  [kernel]  [k] kmem_cache_alloc
     0.42%  [kernel]  [k] __inet_lookup_established
     0.34%  [kernel]  [k] swiotlb_sync_single_for_cpu

So this will probably need further analysis, outside the scope of this
patch series.

Could we now please ack this v3 and merge it?

Thanks.
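(For anyone who wants to reproduce the local vs. remote atomic-op costs
quoted at the top on their own box, a minimal user-space sketch along
these lines should do. LOCAL_CPU/REMOTE_CPU are placeholders for two
CPUs on different NUMA nodes, and the loop/rdtsc timing is only a rough
per-op estimate, not the exact methodology behind Jesper's numbers.)

/* atomic_cost.c: rough local vs. shared-cacheline atomic-op cost.
 * Build: gcc -O2 -pthread atomic_cost.c -o atomic_cost
 */
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdio.h>
#include <unistd.h>
#include <x86intrin.h>          /* __rdtsc() */

#define LOCAL_CPU  0
#define REMOTE_CPU 8            /* assumption: a CPU on the other node */
#define LOOPS      1000000

static _Atomic long counter __attribute__((aligned(64)));
static volatile int stop;

static void pin_to_cpu(int cpu)
{
        cpu_set_t set;

        CPU_ZERO(&set);
        CPU_SET(cpu, &set);
        pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
}

/* Remote reader: keeps pulling the counter's cacheline into Shared state. */
static void *reader(void *arg)
{
        pin_to_cpu((long)arg);
        while (!stop)
                (void)atomic_load(&counter);
        return NULL;
}

static unsigned long time_atomic_ops(void)
{
        unsigned long long start = __rdtsc();

        for (int i = 0; i < LOOPS; i++)
                atomic_fetch_add(&counter, 1);
        return (unsigned long)((__rdtsc() - start) / LOOPS);
}

int main(void)
{
        pthread_t t;

        pin_to_cpu(LOCAL_CPU);

        /* Case 1: no sharer, the cacheline stays Modified/Exclusive locally. */
        printf("local  : %lu cycles/op\n", time_atomic_ops());

        /* Case 2: a reader on another CPU forces MESI transitions
         * on every atomic op. */
        pthread_create(&t, NULL, reader, (void *)(long)REMOTE_CPU);
        sleep(1);               /* let the reader start hammering */
        printf("shared : %lu cycles/op\n", time_atomic_ops());

        stop = 1;
        pthread_join(t, NULL);
        return 0;
}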
> Notice the huge difference. And in case 2, it is enough that the remote
> CPU reads the cacheline and brings it into "Shared" (MESI) state, and
> the local CPU then does the atomic op.
>
> One key idea behind the page_pool is that remote CPUs read/detect
> refcnt==1 (Shared state), and store the page in a small per-CPU array.
> When the array is full, it gets bulk-returned to the shared ptr_ring
> pool. When the "local" CPU needs new pages, it prefetchw's them from
> the shared ptr_ring during its bulk refill, to latency-hide the MESI
> transitions needed.
>
> --
> Best regards,
>   Jesper Dangaard Brouer
>   MSc.CS, Principal Kernel Engineer at Red Hat
>   LinkedIn: http://www.linkedin.com/in/brouer
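(To make the quoted page_pool recycling description concrete, here is a
small user-space sketch of the same idea. It is not the actual
page_pool code: a mutex-protected ring stands in for the shared
ptr_ring, __thread arrays stand in for per-CPU data, and every name
here, e.g. pp_page and RECYCLE_BATCH, is invented for illustration.)

#include <pthread.h>
#include <stdatomic.h>

#define RECYCLE_BATCH 16
#define RING_SIZE     256

struct pp_page {                        /* stand-in for struct page */
        atomic_int refcnt;
        char data[4096];
};

/* Shared pool: a mutex-protected ring standing in for the ptr_ring.
 * Overflow/underflow handling is omitted for brevity. */
static struct pp_page *ring[RING_SIZE];
static int ring_head, ring_tail;
static pthread_mutex_t ring_lock = PTHREAD_MUTEX_INITIALIZER;

/* Small per-CPU (here: per-thread) recycle array. */
static __thread struct pp_page *stash[RECYCLE_BATCH];
static __thread int stash_cnt;

static void ring_put_bulk(struct pp_page **pages, int n)
{
        pthread_mutex_lock(&ring_lock);
        for (int i = 0; i < n; i++)
                ring[ring_head++ % RING_SIZE] = pages[i];
        pthread_mutex_unlock(&ring_lock);
}

static int ring_get_bulk(struct pp_page **pages, int n)
{
        int got = 0;

        pthread_mutex_lock(&ring_lock);
        while (got < n && ring_tail != ring_head)
                pages[got++] = ring[ring_tail++ % RING_SIZE];
        pthread_mutex_unlock(&ring_lock);
        return got;
}

/* "Remote" consumer side: called when the stack is done with a page.
 * Only a read of the refcount here: if we are the last user, stash the
 * page locally and return it to the shared ring in bulk, instead of
 * doing one atomic op per page on a remote cacheline. */
void pp_page_release(struct pp_page *page)
{
        if (atomic_load(&page->refcnt) != 1)
                return;                 /* still in use: normal free path (omitted) */

        stash[stash_cnt++] = page;
        if (stash_cnt == RECYCLE_BATCH) {
                ring_put_bulk(stash, RECYCLE_BATCH);
                stash_cnt = 0;
        }
}

/* "Local" producer (driver RX) side: bulk refill, prefetching for write
 * so the cachelines are already owned when the pages are reused. */
int pp_refill(struct pp_page **batch, int n)
{
        int got = ring_get_bulk(batch, n);

        for (int i = 0; i < got; i++)
                __builtin_prefetch(batch[i], 1);        /* prefetchw-like hint */
        return got;
}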