Re: skb_release_head_state(): Re: [Bug #11308] tbench regression on each kernel release from 2.6.22 -> 2.6.28

Ingo Molnar <mingo@xxxxxxx> · Mon, 17 Nov 2008 22:38:05 +0100



* Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:

> On Mon, 17 Nov 2008, Ingo Molnar wrote:
> > 
> > this function _really_ hurts from a 16-bit op:
> > 
> > ffffffff8048943e:     6503 	66 c7 83 a8 00 00 00 	movw   $0x0,0xa8(%rbx)
> > ffffffff80489445:        0 	00 00 
> > ffffffff80489447:   174101 	5b                   	pop    %rbx
> 
> I don't think that is it, actually. The 16-bit store just before it 
> had a zero count, even though anything that executes the second one 
> will always execute the first one too.

yeah - look at the followup bits that identify the likely real source 
of that overhead:

>> _But_, the real overhead probably comes from:
>> 
>>  ffffffff804b7210:    10867 	48 8b 54 24 58       	mov    0x58(%rsp),%rdx
>> 
>> which is the next line, the ttl field:
>> 
>>  373             iph->ttl      = ip_select_ttl(inet, &rt->u.dst);
>> 
>> this shows that we are doing a hard cachemiss on the net-localhost 
>> route dst structure cacheline. We do a plain load instruction from 
>> it here and get a hefty cachemiss. (because 16 CPUs are banging on 
>> that single route)
>> 
>> And let's make sure we see this in perspective as well: that single 
>> cachemiss is _1.0 percent_ of the total tbench cost. (!) We could 
>> make the scheduler 10% slower straight away and it would have less 
>> of a real-life effect than this single iph->ttl field setting.
