Re: [Bugme-new] [Bug 15618] New: 2.6.18->2.6.32->2.6.33 huge regression in performance

Anton Starikov <ant.starikov@xxxxxxxxx> · Tue, 23 Mar 2010 20:42:12 +0100

I am attaching the callgraph here.

I also checked the kernel source: the code that was compiled is exactly what it should be after the patches.

Am I missing something?

Attachment: callg.txt.gz (GNU Zip compressed data)

On Mar 23, 2010, at 8:17 PM, Peter Zijlstra wrote:

> On Tue, 2010-03-23 at 20:14 +0100, Anton Starikov wrote:
>> On Mar 23, 2010, at 6:45 PM, Linus Torvalds wrote:
>> 
>>> 
>>> 
>>> On Tue, 23 Mar 2010, Ingo Molnar wrote:
>>>> 
>>>> It shows a very brutal amount of page fault invoked mmap_sem spinning 
>>>> overhead.
>>> 
>>> Isn't this already fixed? It's the same old "x86-64 rwsemaphores are using 
>>> the shit-for-brains generic version" thing, and it's fixed by
>>> 
>>> 	1838ef1 x86-64, rwsem: 64-bit xadd rwsem implementation
>>> 	5d0b723 x86: clean up rwsem type system
>>> 	59c33fa x86-32: clean up rwsem inline asm statements
>>> 
>>> NOTE! None of those are in 2.6.33 - they were merged afterwards. But they 
>>> are in 2.6.34-rc1 (and obviously current -git). So Anton would have to 
>>> compile his own kernel to test his load.
>> 
>> 
>> I applied the mentioned patches. Things didn't improve much.
>> 
>> before:
>> prog: Total exploration time 9.880 real 60.620 user 76.970 sys
>> 
>> after:
>> prog: Total exploration time 9.020 real 59.430 user 66.190 sys
>> 
>> perf report:
>> 
>>    38.58%             prog  [kernel]                                           [k] _spin_lock_irqsave
>>    37.42%             prog  ./prog                                             [.] DBSLLlookup_ret
>>     6.22%             prog  ./prog                                             [.] SuperFastHash
>>     3.65%             prog  /lib64/libc-2.11.1.so                              [.] __GI_memcpy
>>     2.09%             prog  ./anderson.6.dve2C                                 [.] get_successors
>>     1.75%             prog  [kernel]                                           [k] clear_page_c
>>     1.73%             prog  ./prog                                             [.] index_next_dfs
>>     0.71%             prog  [kernel]                                           [k] handle_mm_fault
>>     0.38%             prog  ./prog                                             [.] cb_hook
>>     0.33%             prog  ./prog                                             [.] get_local
>>     0.32%             prog  [kernel]                                           [k] page_fault
> 
> Could you verify with a callgraph profile what that spin_lock_irqsave()
> is? If those rwsem patches were successful, mmap_sem should no longer
> have a spinlock to contend on, in which case it might be another lock.
> 
> If not, something went wrong with backporting those patches.
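
For reference, the before/after timings quoted above can be turned into relative changes. This is only a minimal Python sketch of that arithmetic; the names `before`, `after` and `relative_change` are illustrative and do not come from the thread, and the numbers are copied from the quoted "Total exploration time" lines.

```python
# Timings in seconds, copied from the quoted "before" and "after" lines.
before = {"real": 9.880, "user": 60.620, "sys": 76.970}
after = {"real": 9.020, "user": 59.430, "sys": 66.190}


def relative_change(old, new):
    """Percent change from old to new; a negative value means less time."""
    return (new - old) / old * 100.0


# Hypothetical helper dict: percent change per timing category.
changes = {key: relative_change(before[key], after[key]) for key in before}

for key, pct in changes.items():
    print(f"{key:>4}: {before[key]:7.3f} -> {after[key]:7.3f}  ({pct:+.1f}%)")
```

With these numbers, system time drops by roughly 14% while wall-clock time drops by under 9%, which is consistent with the remark that things did not improve much.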