On Thu, Aug 17, 2017 at 02:51:12PM +0300, Nikolay Aleksandrov wrote: > On 17/08/17 14:39, Nikolay Aleksandrov wrote: > > On 17/08/17 14:03, David Lamparter wrote: > >> On Wed, Aug 16, 2017 at 11:38:06PM +0300, Nikolay Aleksandrov wrote: [cut] > >>> and hitting the fast path for everyone in a few different places for a > >>> feature that the majority will not use does not sound acceptable to > >>> me. We've been trying hard to optimize it, trying to avoid additional > >>> cache lines, removing tests and keeping special cases to a minimum. > >> > >> skb->dst is on the same cacheline as skb->len. > >> fdb->md_dst is on the same cacheline as fdb->dst. > >> Both will be 0 in a lot of cases, so this should be two null checks on > >> data that is hot in the cache. Are you sure this is an actual problem? > >> > > > > Sure - no, I haven't benchmarked it, but I don't see skb->len being on > > the same cache line as _skb_refdst assuming 64 byte cache lines. > > I should've been clearer - that obviously depends on the kernel config, but > in order for them to be in the same line you need to disable either one of > conntrack, bridge_netfilter or xfrm, these are almost always enabled (at > least in all major distributions). Did I miscount somewhere? This is what I counted: offs size 00 16 next/prev/other union bits 16 8 sk 24 8 dev 32 32 cb (first 32 bytes) ---- boundary @ 64 64 16 cb (last 16 bytes) 80 8 _skb_refdst 88 8 destructor 96 8 (0) sp 104 8 (0) _nfct 112 8 (0) nf_bridge 120 4 len 124 4 data_len ---- boundary @ 128 128 2 mac_len 130 2 hdr_len -David