Re: [PATCH V2 7/7] x86,rcu: use percpu rcu_preempt_depth

On 2019-11-04 19:41:20 [+0800], Lai Jiangshan wrote:
> > Is there a benchmark saying how much we gain from this?
> 
> Hello
> 
> Maybe I can write a tight loop for testing, but I don't
> think anyone will be interested in it.
> 
> I'm also trying to find some good real tests. I need
> some suggestions here.

There is rcutorture, but I don't know how much of a performance test it
is; Paul would know.

A microbenchmark is one thing. Are there any visible changes for userland
workloads such as building a kernel or running hackbench?

I don't dispute that incrementing a per-CPU variable is more efficient
than reading a per-CPU variable, adding an offset, and then incrementing
the value found there. I was just curious to see if there are any numbers
on it.
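
The two access patterns compared above can be sketched in userspace
roughly as follows. This is only a model, not kernel code: thread-local
storage stands in for per-CPU data (on x86-64 both are typically
addressed relative to a segment register), and every name here
(task_model, current_model, percpu_depth_model, lock_via_task,
lock_via_percpu, depths_agree_after) is hypothetical and made up for
illustration. It shows the shape of the two operations and says nothing
about how much faster one is than the other.

```c
/* Userspace model only; all identifiers are hypothetical. */
#include <assert.h>
#include <stddef.h>

struct task_model {
	long state;
	int rcu_read_lock_nesting;	/* depth kept inside the task */
};

/* Stand-in for the per-CPU pointer to the running task. */
static _Thread_local struct task_model *current_model;

/* Stand-in for a per-CPU nesting-depth counter. */
static _Thread_local int percpu_depth_model;

/* Pattern A: load the pointer, add the field offset, increment. */
static inline void lock_via_task(void)
{
	assert(current_model != NULL);
	current_model->rcu_read_lock_nesting++;
}

/* Pattern B: increment the per-thread counter directly. */
static inline void lock_via_percpu(void)
{
	percpu_depth_model++;
}

/* Apply both patterns n times; return 1 if both depths equal n. */
static int depths_agree_after(int n)
{
	struct task_model t = { 0, 0 };
	int ok;

	current_model = &t;
	percpu_depth_model = 0;
	for (int i = 0; i < n; i++) {
		lock_via_task();
		lock_via_percpu();
	}
	ok = (t.rcu_read_lock_nesting == n) && (percpu_depth_model == n);
	current_model = NULL;
	return ok;
}
```

Both helpers keep the same count; pattern A needs an extra dependent
load before the increment, which is the difference under discussion.
Whether that is measurable in a real workload is the open question.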

> > > No function call when using rcu_read_[un]lock().
> > > Single instruction for rcu_read_lock().
> > > 2 instructions for fast path of rcu_read_unlock().
> > 
> > I think these were not inlined due to the header requirements.
> 
> objdump -D -S kernel/workqueue.o shows (selected fractions):

That was not what I meant. Inlining the current rcu_read_lock() would
mean including the definition of struct task_struct (and everything it
pulls in) in the RCU headers, which doesn't work.
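
The header constraint can be illustrated with a small single-file model,
assuming a "header" part that only sees a forward declaration. None of
this is the real RCU or scheduler code; task_hdr_model,
model_read_lock(), model_read_lock_inline() and depth_hdr_model are
hypothetical names. With only an incomplete type visible, a function
that touches a struct field has to live out of line, while a standalone
counter can be incremented from an inline function.

```c
/* Single-file model of a header/source split; names are hypothetical. */
#include <assert.h>
#include <stddef.h>

/* ---- "header" part: only a forward declaration is visible ---- */
struct task_hdr_model;				/* incomplete type */

/* Needs the struct layout, so it can only be declared here. */
void model_read_lock(struct task_hdr_model *t);

/*
 * An inline version would not compile at this point, because the
 * field offset is unknown while the type is incomplete:
 *
 *	static inline void bad(struct task_hdr_model *t) { t->nesting++; }
 */

/* A standalone counter needs no struct layout and can be inlined. */
extern int depth_hdr_model;
static inline void model_read_lock_inline(void)
{
	depth_hdr_model++;
}

/* ---- "source" part: the full definition is available ---- */
struct task_hdr_model {
	long pad[4];
	int nesting;
};

int depth_hdr_model;

void model_read_lock(struct task_hdr_model *t)
{
	assert(t != NULL);
	t->nesting++;
}
```

In this model the out-of-line function plays the role of the current
non-inlined rcu_read_lock(), and the inline one plays the role of a
version that only needs a separately declared counter.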

> Best regards
> Lai

Sebastian


