On 04/09/2014 18:30, Markus Stockhausen wrote:
> A perf record of the 1 writer test gives:
>
>   38.40%  swapper    [kernel.kallsyms]  [k] default_idle
>   13.14%  md0_raid5  [kernel.kallsyms]  [k] _raw_spin_unlock_irqrestore
>   13.05%  swapper    [kernel.kallsyms]  [k] tick_nohz_idle_enter
>   10.01%  iot        [raid456]          [k] raid5_unplug
>    9.06%  swapper    [kernel.kallsyms]  [k] tick_nohz_idle_exit
>    3.39%  md0_raid5  [kernel.kallsyms]  [k] __kernel_fpu_begin
>    1.67%  md0_raid5  [xor]              [k] xor_sse_2_pf64
>    0.87%  iot        [kernel.kallsyms]  [k] finish_task_switch
>
> I'm confused and clueless. In particular, I cannot see where the 10% overhead in raid5_unplug might come from. Any idea from someone with better insight?
I am no kernel developer, but I have read that the CPU time spent serving interrupts is often accounted to whichever process has the bad luck to be running when the interrupt arrives and steals the CPU. I read this in the context of top, htop, etc., which probably use a different accounting mechanism than perf, but maybe something similar is happening here, because 13% for _raw_spin_unlock_irqrestore looks too absurd to me. Most likely, as soon as interrupts are re-enabled by _raw_spin_unlock_irqrestore, the CPU goes off to serve an interrupt that was queued, and since this happens before _raw_spin_unlock_irqrestore returns, the time is accounted to that function, which is why it shows up so high.
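One way to test this theory (my own sketch, not something from the thread): perf can record the irq handler-entry tracepoint alongside ordinary cycle samples, which would show whether interrupts really fire in bursts right where _raw_spin_unlock_irqrestore re-enables them. The exact invocation below is an assumption about a reasonable command line; it needs root, so the sketch only prints it rather than running it:

```shell
#!/bin/sh
# Sketch: correlate cycle samples with interrupt arrivals. The workload
# ("sleep 10" system-wide) and event choice are assumptions, not the
# original poster's setup. Needs root, so we only print the commands.

CMD="perf record -a -g -e cycles -e irq:irq_handler_entry -- sleep 10"
echo "run as root: $CMD"
echo "then:        perf report --stdio  # look for irq handlers attributed under _raw_spin_unlock_irqrestore"
```

If the interrupt-accounting theory is right, the report should show handler entries clustering under the unlock path.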
OTOH, I would like to ask the kernel experts one thing, if I may: does anybody know a way to get a stack trace for a process that is currently running in kernel mode on a CPU right now, i.e. not stopped waiting in a queue? I know about /proc/pid/stack, but that shows 0xffffffffffffffff in such a case. Being able to do that would help to answer the above question too...
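For what it's worth, a sketch of a few approaches I believe work for an on-CPU task (the pid defaults to the current shell purely so the sketch runs anywhere; substitute the real target such as md0_raid5, and note that SysRq and perf need root):

```shell
#!/bin/sh
# Sketch: ways to see the kernel stack of a task that is on-CPU right now.
# PID defaults to this shell for demonstration; pass the real pid as $1.

PID=${1:-$$}

# 1. /proc/<pid>/stack works for *sleeping* tasks; for a task currently
#    executing on a CPU it prints 0xffffffffffffffff, as noted above.
[ -r "/proc/$PID/stack" ] && cat "/proc/$PID/stack"

# 2. SysRq 'l' writes a backtrace of every non-idle CPU, including the
#    task currently running on it, to the kernel log.
echo "as root: echo l > /proc/sysrq-trigger && dmesg | tail -40"

# 3. perf samples call graphs of a live task without stopping it.
echo "as root: perf record -g -p $PID -- sleep 10 && perf report --stdio"
```

The SysRq route is probably the closest answer to the question as asked, since it snapshots whatever each CPU is executing at that instant.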
Thanks
EW