Re: [PATCH 4/6] kvm tools: Add rwlock wrapper

On Fri, May 27, 2011 at 11:12:20AM +0200, Ingo Molnar wrote:
> 
> * Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx> wrote:
> 
> > > > > I'm CC'ing Paul and Mathieu as well for urcu.
> > 
> > I am hoping we can get better convergence between the user-level 
> > and kernel-level URCU implementations once I get SRCU merged into 
> > the TREE_RCU and TINY_RCU implementations. [...]
> 
> Yeah.
> 
> > [...]  But it is early days for user-level RCU implementations -- 
> > for example, the kernel-level implementations have deep 
> > dependencies on being able to lock themselves cheaply to a given 
> > CPU, which does not exist at user level.
> 
> Correct - this is why i suggested a plain copy first, then look back 
> how we (and whether we!) want to share logic.

OK, here is an approach that I rejected long ago due to its not handling
existing code bases nicely.  But it should work fine for user-space
applications that are willing to adapt themselves to RCU, so it is well
worth considering for this set of use cases.

The basic trick is to pretend that each user-level thread is its own CPU.
This is easiest if the application does not do any RCU activity from
signal handlers, though app-code signal handlers can be dealt with as
well, if needed.  (But I hate POSIX signals...)

Given this trick, the code that is currently invoked from the
scheduling-clock interrupt can be instead invoked from a per-thread
SIGALRM.

Given the current implementation in -tip, the RCU core processing can be
done from separate threads, but things must be tweaked because TREE_RCU
assumes that the RCU core processing for a given CPU happens on that same
CPU (which here becomes: in that same thread).  In other words, the rcuc0
kthread is hard-affinitied to CPU 0, the rcuc1 kthread to CPU 1, and so on.

One way to handle this would be to do the per-CPU kthread processing
in signal-handler context.  Then code segments that disable interrupts
(the __call_rcu() function comes immediately to mind) must instead block
the corresponding signals.  This can easily be abstracted so that common
code handles it.

We could handle dyntick-idle if there is a convenient way to get
notification when a thread blocks (as opposed to being preempted).
A number of strategies might work here; the first that comes to mind
is to notify only if the block is TASK_INTERRUPTIBLE, which indicates
a relatively long-term sleep.  This notification could call
rcu_enter_nohz() and friends.  So, is there a way to get notification
on TASK_INTERRUPTIBLE blocking and unblocking?

This is not a general-purpose solution (which is why I rejected it when
thinking along these lines some years ago), but it would be an interesting
way to share the in-kernel code.  And I believe that this approach would
be quite useful to a great number of user-level apps/tools/utilities
that were willing to live within its constraints.

The each-thread-is-a-CPU approach might seem limiting, but the current TREE_RCU
implementation would allow up to 4,194,304 threads on a 64-bit system
and up to 524,288 on a 32-bit system, which should prove sufficient for
most uses.  Famous last words...  But it would be easy to add a fifth
level of hierarchy if someone really does have a legitimate need for more
threads, which would bring us to 268,435,456 threads for 64-bit systems
and 16,777,216 threads for 32-bit systems.  And it is easy to add more
levels -- and the extra levels don't penalize people who don't need them.
With the current implementation, the maximum number of threads would
need to be specified at compile time, but again, this should be OK in
almost all cases.  Default to (say) 131,072 threads maximum and be happy.

> > But there seems to be an assumption that there should be only one 
> > URCU implementation, and I am not sure that this assumption holds.  
> 
> I'm not sure about that either. And since tools/kvm/ lives in the 
> kernel repo it would be a mortal sin [*] to not explore the code 
> sharing angle!!! :-)
> 
> If a reasonable amount of sharing of logic is possible without making 
> it painful for the kernel RCU code we could do other nice things like 
> change the RCU logic and test it in user-space first and run 
> user-space rcutorture on some really big cluster.

That would be cool -- also, it would make the Linux-kernel code
more accessible, because people could play with it in userspace,
single-stepping, setting breakpoints, and so on.

> > [ ... ]
> >
> > All that aside, one advantage of http://lttng.org/urcu is that it 
> > already exists, which allows prototyping to proceed immediately.  
> 
> it's offline right now:
> 
>  $ git clone git://git.lttng.org/urcu
>  Cloning into urcu...
>  fatal: The remote end hung up unexpectedly
> 
> One complication is that it's LGPL while tools/kvm/ is GPLv2. I guess 
> we could copy a suitable implementation into tools/kvm/rcu/?

That is another reason why I believe that an in-kernel-tree version
of URCU is not a replacement for the variant that Mathieu is maintaining
(and that I am contributing to).  Mathieu's implementation can be used
by non-GPL applications and by applications that are not closely tied
to the Linux kernel.

So I really don't see a problem with having both of them around.

> Thanks,
> 
> 	Ingo
> 
> [*] punishable by death or eternal hacking of a Windows driver (i'd pick the former)

Ouch!!!

One of my college buddies left the field due to Windows taking over large
chunks of the embedded space in the 1990s.  So, like you, he definitely
rejected the latter.  ;-)

							Thanx, Paul
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

