Re: uverbs radix tree

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Sep 28, 2018 at 02:24:40PM -0700, Matthew Wilcox wrote:

> > > > I think the mistake here is the misleading use of srcu_dereference():
> > > > 
> > > >    srcu_dereference(
> > > > 		*slot,
> > > > 		&pbundle->bundle.ufile->device->disassociate_srcu);
> > > > 
> > > > That should just be rcu_dereference_protected(*slot, true), I think?
> > > 
> > > aka rcu_dereference_raw()?  Yes, that's probably better.
> > 
> > I've been following the docs in Documentation/RCU/lockdep.txt:
> > 
> >         rcu_dereference_raw(p):
> >                 Don't check.  (Use sparingly, if at all.)
> >         rcu_dereference_protected(p, c):
> >                 Use explicit check expression "c", and omit all barriers
> >                 and compiler constraints.  This is useful when the data
> >                 structure cannot change, for example, in code that is
> >                 invoked only by updaters.
> > 
> > In this case 'the data structure cannot change' as we are protected
> > from that by kref.
> > 
> > Do you know if the docs are wrong now and _raw() is now the right
> > choice?
> 
> This is further down the rcu rabbithole than I'm comfortable with.
> I see _raw as "I know what I'm doing" ... I don't think anyone foresaw
> people using 'true' as the condition to _protected.  I think they're
> equivalent and '_protected(x, true)' is just a more verbose way of
> spelling '_raw(x)'.

They are not the same..

_raw has barriers, while _protected does not.

ie _raw is the same as rcu_dereference except that it doesn't lockdep
check the caller is in a RCU critical region.

It looks like _protected is supposed to be used if the caller has
excluded writers (ie this case), and since we are excluding by control
flow not locking the 'true' is appropriate. 'git grep' shows it is
pretty common to pass true here, maybe even the most common usage..

> I wasn't expecting current users of the radix tree to help ;-)  And I was
> getting a head start on some of the harder conversions so I could send
> them to maintainers early.  But yes, my plan is to send the conversions
> to maintainers for merging after the base XArray code lands, not try and
> mass-convert radix trees to xarrays through my own tree.

Sure, let me know, I can help you out with rdma.
 
> Meanwhile I've come up with another problem which is that the plan is to
> make xarray nodes movable.  That means that *even if the xarray node is
> pinned* by your own code, the xarray may decide to relocate the entries
> if you're not holding the RCU read lock or the xarray lock.

This is Christoph's work?

Ah, I see, xarray has to hold a RCU critical section all over the
place and radix tree did not.. Hum, I think uverbs is OK with that
except for this one case.

> Do you really need the hint?  For a three level raidx tree, that's only
> 6 additional cachelines.  Prmtr optmstn s th rt f ll vl.

Sounds like it is some SMP barriers now that xarray requires RCU as
well as some function calls and atomics? Looped 10-20 times per system
call :( already hot cache lines seem like the least worry here.

The system call path is one of the things we consider a fast path in
the subsystem so care has been taken to keep things out of the
way. This code got a nice 6% bump when this was moved from the
open-coded tree scheme to radix tree, in part because we get this nice
fast 3 level lookup, compared to the deeper and very branch-heavy
version that was before.

However this code is still being renovated and we haven't yet put the
performance critical stuff into ioctl. So if the xarray can't handle
this then efficiently please just replace with a full lookup.

We can benchmark again and if it is bad we can can burn some memory to
pre-compute equivalent linear lookup tables under uverbs's control so
the usage don't trouble xarray.

Jason



[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Photo]     [Yosemite News]     [Yosemite Photos]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux