Re: Real-time projects that could use userspace RCU

"Paul E. McKenney" <paulmck@xxxxxxxxxxxxxxxxxx> · Sun, 13 Jun 2010 17:38:03 -0700

On Sun, Jun 13, 2010 at 04:51:19PM -0400, Mathieu Desnoyers wrote:
> * Paul E. McKenney (paulmck@xxxxxxxxxxxxxxxxxx) wrote:
> > On Tue, May 11, 2010 at 10:43:37AM -0400, Mathieu Desnoyers wrote:
> > > * Paul E. McKenney (paulmck@xxxxxxxxxxxxxxxxxx) wrote:
> > > > On Tue, May 11, 2010 at 09:14:04AM -0400, Mathieu Desnoyers wrote:
> > > > > * John Kacur (jkacur@xxxxxxxxxx) wrote:
> > > > > > On Tue, May 11, 2010 at 2:21 PM, Mathieu Desnoyers
> > > > > > <mathieu.desnoyers@xxxxxxxxxxxx> wrote:
> > > > > > > Hi Thomas,
> > > > > > >
> > > > > > > Paul told me you were quite interested in the userspace RCU library when he told
> > > > > > > you about it (http://lttng.org/urcu). Do you have some userspace applications or
> > > > > > > libraries with real-time needs in mind that could use it ? We could help moving
> > > > > > > them to liburcu. The wait-free read-side is, as you certainly know, a
> > > > > > > characteristic of RCU that can be very useful to RT applications.
> > > > > > >
> > > > > > > [CCing linux-rt-users, as it seems appropriate to ask them too.]
> > > > > > >
> > > > > > > Thanks,
> > > > > > 
> > > > > > Do you have any kind of benchmarks? If you had something appropriate
> > > > > > we could add it to the rt-tests suite (which includes cyclictest). Not
> > > > > > only would this provide an objective measure, but it could also act as
> > > > > > a reference implementation for userspace programmers.
> > > > > 
> > > > > Yes, the library already has its set of benchmark test programs. The results can
> > > > > be found in http://lttng.org/pub/thesis/desnoyers-dissertation-2009-12.pdf
> > > > > section 6.5. It shows that RCU read-side is a few orders of magnitude faster
> > > > > than lock-based approaches and scales linearly with the number of cores.
> > > > > 
> > > > > The same PDF, sections 7.6.2 and 7.6.3, presents the architecture-level modeling
> > > > > of the RCU mb algorithm in Promela, along with the formal proof by model
> > > > > checking for both correctness and progress (the read-side is proven wait-free).
> > > > > 
> > > > > > 
> > > > > > See here.
> > > > > > git clone git://git.kernel.org/pub/scm/linux/kernel/git/clrkwllms/rt-tests.git
> > > > > > 
> > > > > > cyclictest is the original program written by Thomas, maintained by
> > > > > > > Clark Williams now. Most, but not all, of the additional tests are
> > > > > > modelled after this program, so you might want to have a look at that
> > > > > > if you're not already familiar with it.
> > > > > 
> > > > > Thanks for the pointer, I did know about cyclictest, but not the others. Since
> > > > > the read-side does not involve the OS nor blocking, I wonder which of these
> > > > > tests would be even a near-match though.
> > > > 
> > > > Why not add mutual-exclusion tests, including locking, per-thread locking,
> > > > reader-writer locking, and RCU?  The figure of merit would be maximum
> > > > latency rather than throughput, but the existing userspace-rcu tests should
> > > > be pretty close.
> > > > 
> > > 
> > > Do you mean adding our RCU tests to the rt-tests.git tree or adding more
> > > information in our own tests ? Also, the maximum latency is quite dependent on
> > > the rest of the workload running on the system, so we might have to generate
> > > such a workload while the test runs to give an interesting and accurate view of
> > > the maximum latency.
> > > 
> > > Maybe running one (or many) of the already existing rt-tests in parallel would
> > > do.
> > > 
> > > Thoughts ?
> > 
> > My thought was a variant of our existing RCU tests.  Something like:
> > 
> > 	clock_gettime();
> > 	pthread_mutex_lock();
> > 	clock_gettime();
> > 	/* compute latency, accumulate average and maximum */
> > 
> > The test thread would need to have real-time priority.  Then print the
> > maximums for various mechanisms.
> > 
> > Does this seem like a reasonable approach?
> 
> (sorry for delayed answer, I've been deeply focusing on ring buffer
> implementation lately)
> 
> Hrm, the only thing I'm afraid of is that RT latency tests are quite different
> from throughput benchmarks. Basically, for an RCU throughput benchmark, just
> creating a few simple tests is fine. However, in the case of RT behavior
> testing, creating this test thread is just one part of the equation. Quickly
> reproducing conditions that can lead to priority inversions in an automated way
> should also be part of the picture.
> 
> In addition, we might want to measure the overall time it takes to get the lock,
> execute the C.S. and release the lock, rather than just assuming that only the
> "lock" part matters. Some "clever" token-based fair locking scheme can have a
> more evolved unlock primitive for instance.

Good point, we would need to test the full picture as well as all of
the pieces.
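
As a rough illustration of measuring both the piece and the full picture,
here is a minimal sketch that times pthread_mutex_lock() alone and the
whole lock / critical section / unlock sequence, accumulating average
and maximum for each. All names (lat_stats, measure_mutex, and so on)
are hypothetical and come from neither rt-tests nor liburcu. It assumes
CLOCK_MONOTONIC and leaves out the real-time priority setup and the
competing workload discussed above, both of which a real test would need.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical accumulator: sample count, sum and maximum in nanoseconds. */
struct lat_stats {
	uint64_t count;
	uint64_t sum_ns;
	uint64_t max_ns;
};

/* Nanoseconds elapsed from *a to *b; assumes *b is not earlier than *a. */
static uint64_t ts_diff_ns(const struct timespec *a, const struct timespec *b)
{
	int64_t sec = (int64_t)b->tv_sec - (int64_t)a->tv_sec;
	int64_t nsec = (int64_t)b->tv_nsec - (int64_t)a->tv_nsec;

	return (uint64_t)(sec * 1000000000LL + nsec);
}

/* Fold one sample into the running sum and maximum. */
static void lat_record(struct lat_stats *s, uint64_t ns)
{
	s->count++;
	s->sum_ns += ns;
	if (ns > s->max_ns)
		s->max_ns = ns;
}

static uint64_t lat_avg_ns(const struct lat_stats *s)
{
	return s->count ? s->sum_ns / s->count : 0;
}

/*
 * "acquire" receives the time spent in pthread_mutex_lock() alone.
 * "total" receives lock + critical section + unlock.  The middle
 * clock_gettime() call sits inside the critical section, so its cost
 * is included in "total".
 */
static void measure_mutex(pthread_mutex_t *lock, unsigned long loops,
			  struct lat_stats *acquire, struct lat_stats *total)
{
	struct timespec t0, t1, t2;
	volatile unsigned long shared = 0;	/* stand-in critical section */
	unsigned long i;

	assert(loops > 0);
	for (i = 0; i < loops; i++) {
		clock_gettime(CLOCK_MONOTONIC, &t0);
		pthread_mutex_lock(lock);
		clock_gettime(CLOCK_MONOTONIC, &t1);
		shared++;
		pthread_mutex_unlock(lock);
		clock_gettime(CLOCK_MONOTONIC, &t2);
		lat_record(acquire, ts_diff_ns(&t0, &t1));
		lat_record(total, ts_diff_ns(&t0, &t2));
	}
}
```

A driver would call measure_mutex() from a thread running at real-time
priority while other threads contend for the lock, then print max_ns
and lat_avg_ns() for each mechanism. The same loop shape should carry
over to reader-writer locks and RCU read-side sections by swapping the
lock and unlock calls.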

							Thanx, Paul