Re: [Bug #11308] tbench regression on each kernel release from 2.6.22 -> 2.6.28

Ingo Molnar <mingo@xxxxxxx> · Mon, 17 Nov 2008 17:11:35 +0100

* Eric Dumazet <dada1@xxxxxxxxxxxxx> wrote:

>> It all looks like pure old-fashioned straight overhead in the 
>> networking layer to me. Do we still touch the same global cacheline 
>> for every localhost packet we process? Anything like that would 
>> show up big time.
>
> Yes we do. I find it strange that we don't see dst_release() in your 
> NMI profile.
>
> I posted a patch (commit 5635c10d976716ef47ae441998aeae144c7e7387, 
> "net: make sure struct dst_entry refcount is aligned on 64 bytes", in 
> the net-next-2.6 tree) to properly align the struct dst_entry 
> refcounter and got a 4% speedup on tbench on my machine.
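
For readers unfamiliar with the technique, the alignment idea in the quoted patch can be sketched in userspace C. This is only an illustration, not the kernel's struct dst_entry: the struct names, the field names, the helper same_cache_line() and the 64-byte line size are all assumptions made for the example. The point is that a counter written on every packet is pushed onto its own cache-line boundary, so it no longer shares a line with the read-mostly fields in front of it.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

/* Assumed cache line size; the real value is CPU dependent. */
#define CACHE_LINE_SIZE 64

/* Hypothetical layout: the hot counter sits right next to read-mostly
 * fields, so every refcount update dirties the line they live on. */
struct entry_unaligned {
    void *next;            /* read-mostly */
    void *ops;             /* read-mostly */
    unsigned long flags;   /* read-mostly */
    int refcnt;            /* written for every packet */
    unsigned long lastuse; /* also written often */
};

/* Same fields, but the counter is forced onto a 64-byte boundary, so the
 * frequently written members start a cache line of their own. */
struct entry_aligned {
    void *next;
    void *ops;
    unsigned long flags;
    alignas(CACHE_LINE_SIZE) int refcnt;
    unsigned long lastuse;
};

/* Returns nonzero when two byte offsets fall in the same cache line,
 * assuming the object itself starts on a cache-line boundary. */
static inline int same_cache_line(size_t a, size_t b)
{
    return a / CACHE_LINE_SIZE == b / CACHE_LINE_SIZE;
}

/* Compile-time check that the counter really is line-aligned. */
static_assert(offsetof(struct entry_aligned, refcnt) % CACHE_LINE_SIZE == 0,
              "refcnt must start on a cache-line boundary");
```

The cost of this layout is padding: the aligned struct is larger than the unaligned one, which is the usual trade-off when separating hot, written fields from read-mostly ones.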

Ouch, +4% from a oneliner networking change? That's a _huge_ speedup 
compared to the things we were after in scheduler land. A lot of 
scheduler folks worked hard to squeeze the last 1-2% out of the 
scheduler fastpath (which was not trivial at all). The _full_ 
scheduler accounts for only about 7% of the total system overhead here 
on a 16-way box...

So why should we be handling this as anything but a plain networking 
performance regression/weakness? The localhost scalability bottleneck 
was reported a _long_ time ago.

	Ingo
--
To unsubscribe from this list: send the line "unsubscribe kernel-testers" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html
