On Fri, Sep 28, 2018 at 02:24:40PM -0700, Matthew Wilcox wrote: > > > > I think the mistake here is the misleading use of srcu_dereference(): > > > > > > > > srcu_dereference( > > > > *slot, > > > > &pbundle->bundle.ufile->device->disassociate_srcu); > > > > > > > > That should just be rcu_dereference_protected(*slot, true), I think? > > > > > > aka rcu_dereference_raw()? Yes, that's probably better. > > > > I've been following the docs in Documentation/RCU/lockdep.txt: > > > > rcu_dereference_raw(p): > > Don't check. (Use sparingly, if at all.) > > rcu_dereference_protected(p, c): > > Use explicit check expression "c", and omit all barriers > > and compiler constraints. This is useful when the data > > structure cannot change, for example, in code that is > > invoked only by updaters. > > > > In this case 'the data structure cannot change' as we are protected > > from that by kref. > > > > Do you know if the docs are wrong now and _raw() is now the right > > choice? > > This is further down the rcu rabbithole than I'm comfortable with. > I see _raw as "I know what I'm doing" ... I don't think anyone foresaw > people using 'true' as the condition to _protected. I think they're > equivalent and '_protected(x, true)' is just a more verbose way of > spelling '_raw(x)'. They are not the same.. _raw has barriers, while _protected does not. ie _raw is the same as rcu_dereference except that it doesn't lockdep check the caller is in a RCU critical region. It looks like _protected is supposed to be used if the caller has excluded writers (ie this case), and since we are excluding by control flow not locking the 'true' is appropriate. 'git grep' shows it is pretty common to pass true here, maybe even the most common usage.. > I wasn't expecting current users of the radix tree to help ;-) And I was > getting a head start on some of the harder conversions so I could send > them to maintainers early. But yes, my plan is to send the conversions > to maintainers for merging after the base XArray code lands, not try and > mass-convert radix trees to xarrays through my own tree. Sure, let me know, I can help you out with rdma. > Meanwhile I've come up with another problem which is that the plan is to > make xarray nodes movable. That means that *even if the xarray node is > pinned* by your own code, the xarray may decide to relocate the entries > if you're not holding the RCU read lock or the xarray lock. This is Christoph's work? Ah, I see, xarray has to hold a RCU critical section all over the place and radix tree did not.. Hum, I think uverbs is OK with that except for this one case. > Do you really need the hint? For a three level raidx tree, that's only > 6 additional cachelines. Prmtr optmstn s th rt f ll vl. Sounds like it is some SMP barriers now that xarray requires RCU as well as some function calls and atomics? Looped 10-20 times per system call :( already hot cache lines seem like the least worry here. The system call path is one of the things we consider a fast path in the subsystem so care has been taken to keep things out of the way. This code got a nice 6% bump when this was moved from the open-coded tree scheme to radix tree, in part because we get this nice fast 3 level lookup, compared to the deeper and very branch-heavy version that was before. However this code is still being renovated and we haven't yet put the performance critical stuff into ioctl. So if the xarray can't handle this then efficiently please just replace with a full lookup. We can benchmark again and if it is bad we can can burn some memory to pre-compute equivalent linear lookup tables under uverbs's control so the usage don't trouble xarray. Jason