Re: [PATCH v2 19/20] IB/rdmavt, IB/qib, IB/hfi1: Make percpu refcount optional for user MRs

On Fri, Apr 07, 2017 at 09:12:34PM +0000, Marciniszyn, Mike wrote:
> > Umm.. This doesn't look like a refcount, it is a rwlock - why aren't you using
> > the optimized percpu_rwsem?
> > 
> 
> The refcount with a completion has been in qib and rdmavt for years
> without issue.

Doesn't change the fact that this isn't refcount behavior, it is a rwsem
with write lock on destroy. A proper refcount would destroy the object,
not call a completion.
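
To make the distinction concrete, here is a minimal userspace sketch of
plain refcount semantics, where whoever drops the last reference releases
the object and nobody waits. The names (struct obj, obj_get, obj_put) are
hypothetical and are not the rdmavt code; C11 atomics stand in for the
kernel primitives.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical object for illustration; not the rdmavt MR structure. */
struct obj {
    atomic_int refs;    /* starts at 1 for the creator's reference */
    bool *freed_flag;   /* set on release, only so a caller can observe it */
};

static struct obj *obj_create(bool *freed_flag)
{
    struct obj *o = malloc(sizeof(*o));

    if (!o)
        return NULL;
    atomic_init(&o->refs, 1);
    o->freed_flag = freed_flag;
    *freed_flag = false;
    return o;
}

/* Take an extra reference; the caller must already hold one. */
static void obj_get(struct obj *o)
{
    int old = atomic_fetch_add(&o->refs, 1);

    assert(old > 0);
}

/*
 * Drop a reference. The caller that drops the last one frees the
 * object. Returns true if this call released it. Nothing blocks
 * waiting for other holders to finish.
 */
static bool obj_put(struct obj *o)
{
    if (atomic_fetch_sub(&o->refs, 1) == 1) {
        *o->freed_flag = true;
        free(o);
        return true;
    }
    return false;
}
```

In the pattern under discussion, destroy instead sleeps on a completion
until the count drains, which is what a writer acquiring a rwsem does.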

Doing things properly using the common primitives makes stuff work
better, eg percpu_rwsem has sane lockdep.

> All this being said, we have encountered a use case where the MR is
> short lived and supports just one transaction.

Well, yes, that is a pretty common idiom in kernel workloads too..

> I have a prototype patch to pass a hint (no module parameter) to the
> user MR registration via the access flags.

Okay, so you'd have an IBV_MR_MULTI_THREADED flag to enable the RCU
optimization?

That seems sort of consistent with some of the other flags we've had
in the past (eg single threaded CQ polling optimization)

> I don't think a two order of magnitude improvement is a micro optimization.

The micro optimization was trying to optimize a rwlock with percpu and
RCU. The two order of magnitude slowdown on destroy and the new
need for tuning knobs are the penalty for that.

I doubt the percpu optimization was two orders of magnitude..

> So the RCU grace period is problematic in this context as well.

Of course, RCU is not designed to have these kinds of performance
characteristics. If you define destroy to be a hot path then you can't
use RCU here, the worst case RCU grace period times are potentially
quite big..

This is why you shouldn't have the RCU optimization on by default at
all.

Usually RCU grace period latency is solved by deferring the write side
to an async RCU grace period callback - why not do that instead of
adding a flag? It feels like destroy is a reasonable candidate for
that kind of trick.
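
A rough userspace sketch of that idea, assuming a single thread and a
made-up deferral queue: destroy queues a callback and returns at once,
and the actual teardown runs later. In the kernel the queueing step would
be call_rcu(); everything named here (struct deferred, defer_call,
grace_period_elapsed, struct mr) is hypothetical and only simulates the
grace period.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a callback head; not the kernel's rcu_head. */
struct deferred {
    struct deferred *next;
    void (*func)(struct deferred *d);
};

static struct deferred *pending;  /* callbacks waiting for the grace period */

/* Queue the callback and return immediately, without waiting. */
static void defer_call(struct deferred *d, void (*func)(struct deferred *d))
{
    assert(func != NULL);
    d->func = func;
    d->next = pending;
    pending = d;
}

/*
 * Simulates "a grace period has elapsed": run every callback queued
 * so far. Returns how many callbacks ran.
 */
static int grace_period_elapsed(void)
{
    struct deferred *d = pending;
    int n = 0;

    pending = NULL;
    while (d) {
        struct deferred *next = d->next;  /* read before func frees d */

        d->func(d);
        n++;
        d = next;
    }
    return n;
}

/* Hypothetical MR with the callback head embedded in it. */
struct mr {
    int lkey;
    struct deferred deferred;
};

static int mrs_freed;

/* Runs after the grace period: recover the MR and release it. */
static void mr_free_cb(struct deferred *d)
{
    struct mr *mr = (struct mr *)((char *)d - offsetof(struct mr, deferred));

    free(mr);
    mrs_freed++;
}

/* Destroy does not block; teardown is deferred to the callback. */
static void mr_destroy(struct mr *mr)
{
    defer_call(&mr->deferred, mr_free_cb);
}
```

The caller of mr_destroy() never pays the grace period latency; the cost
is that the memory stays allocated until the callback runs.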

Perhaps some kind of enhancement to percpu_rwsem such that it would
asynchronously call a function with the write side lock held? Looks
not too hard..

Jason


