On Sun, Jun 13, 2010 at 04:51:19PM -0400, Mathieu Desnoyers wrote:
> * Paul E. McKenney (paulmck@xxxxxxxxxxxxxxxxxx) wrote:
> > On Tue, May 11, 2010 at 10:43:37AM -0400, Mathieu Desnoyers wrote:
> > > * Paul E. McKenney (paulmck@xxxxxxxxxxxxxxxxxx) wrote:
> > > > On Tue, May 11, 2010 at 09:14:04AM -0400, Mathieu Desnoyers wrote:
> > > > > * John Kacur (jkacur@xxxxxxxxxx) wrote:
> > > > > > On Tue, May 11, 2010 at 2:21 PM, Mathieu Desnoyers
> > > > > > <mathieu.desnoyers@xxxxxxxxxxxx> wrote:
> > > > > > > Hi Thomas,
> > > > > > >
> > > > > > > Paul told me you were quite interested in the userspace RCU library when he told
> > > > > > > you about it (http://lttng.org/urcu). Do you have some userspace applications or
> > > > > > > libraries with real-time needs in mind that could use it ? We could help moving
> > > > > > > them to liburcu. The wait-free read-side is, as you certainly know, a
> > > > > > > characteristic of RCU that can be very useful to RT applications.
> > > > > > >
> > > > > > > [CCing linux-rt-users, as it seems appropriate to ask them too.]
> > > > > > >
> > > > > > > Thanks,
> > > > > >
> > > > > > Do you have any kind of benchmarks? If you had something appropriate
> > > > > > we could add it to the rt-tests suite (which includes cyclictest). Not
> > > > > > only would this provide an objective measure, but it could also act as
> > > > > > a reference implementation for userspace programmers.
> > > > >
> > > > > Yes, the library already has its set of benchmark test programs. The results can
> > > > > be found in http://lttng.org/pub/thesis/desnoyers-dissertation-2009-12.pdf
> > > > > section 6.5. It shows that RCU read-side is a few orders of magnitude faster
> > > > > than lock-based approaches and scales linearly with the number of cores.
> > > > >
> > > > > The same PDF, sections 7.6.2 and 7.6.3, presents the architecture-level modeling
> > > > > of the RCU mb algorithm in Promela, along with the formal proof by model
> > > > > checking for both correctness and progress (the read-side is proven wait-free).
> > > > >
> > > > > >
> > > > > > See here.
> > > > > > git clone git://git.kernel.org/pub/scm/linux/kernel/git/clrkwllms/rt-tests.git
> > > > > >
> > > > > > cyclictest is the original program written by Thomas, maintained by
> > > > > > Clark Williams now. Most - but not all, of the additional tests are
> > > > > > modelled after this program, so you might want to have a look at that
> > > > > > if you're not already familiar with it.
> > > > >
> > > > > Thanks for the pointer, I did know about cyclictest, but not the others. Since
> > > > > the read-side does not involve the OS nor blocking, I wonder which of these
> > > > > tests would be even a near-match though.
> > > >
> > > > Why not add mutual-exclusion tests, including locking, per-thread locking,
> > > > reader-writer locking, and RCU? The figure of merit would be maximum
> > > > latency rather than throughput, but the existing userspace-rcu tests should
> > > > be pretty close.
> > > >
> > >
> > > Do you mean adding our RCU tests to the rt-tests.git tree or adding more
> > > information in our own tests ? Also, the maximum latency is quite dependent on
> > > the rest of the workload running on the system, so we might have to generate
> > > such a workload while the test runs to give an interesting and accurate view of
> > > the maximum latency.
> > >
> > > Maybe running one (or many) of the already existing rt-tests in parallel would
> > > do.
> > >
> > > Thoughts ?
> >
> > My thought was a variant of our existing RCU tests. Something like:
> >
> > 	clock_gettime();
> > 	pthread_mutex_lock();
> > 	clock_gettime();
> > 	/* compute latency, accumulate average and maximum */
> >
> > The test thread would need to have real-time priority.
> > Then print the maximums for various mechanisms.
> >
> > Does this seem like a reasonable approach?
>
> (sorry for delayed answer, I've been deeply focusing on ring buffer
> implementation lately)
>
> Hrm, the only thing I'm afraid of is that RT latency testing is quite different
> from throughput benchmarks. Basically, for RCU throughput benchmark, just
> creating a few simple tests is fine. However, in the case of RT behavior
> testing, creating this test thread is just one part of the equation. Quickly
> reproducing conditions that can lead to priority inversions in an automated way
> should also be part of the picture.
>
> In addition, we might want to measure the overall time it takes to get the lock,
> execute the C.S. and release the lock, rather than just assuming that only the
> "lock" part matters. Some "clever" token-based fair locking scheme can have a
> more evolved unlock primitive for instance.

Good point, we would need to test the full picture as well as all of the
pieces.

							Thanx, Paul
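
For illustration only, here is a minimal sketch of the measurement loop
discussed above, assuming a plain pthread mutex as the mechanism under test.
It records both the "lock only" latency from the quoted pseudocode and the
full lock + C.S. + unlock time. Every name in it (lat_stats, now_ns,
lat_record, lat_avg, lat_print, try_set_rt_priority, measure_mutex) is
hypothetical and does not come from liburcu or rt-tests.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <inttypes.h>
#include <pthread.h>
#include <sched.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical accumulator: sample count, sum and maximum in nanoseconds. */
struct lat_stats {
	uint64_t count;
	uint64_t sum_ns;
	uint64_t max_ns;
};

/* Read the monotonic clock as a single nanosecond value. */
uint64_t now_ns(void)
{
	struct timespec ts;
	int rc = clock_gettime(CLOCK_MONOTONIC, &ts);

	assert(rc == 0);
	(void)rc;
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Fold one sample into the running sum and maximum. */
void lat_record(struct lat_stats *s, uint64_t ns)
{
	s->count++;
	s->sum_ns += ns;
	if (ns > s->max_ns)
		s->max_ns = ns;
}

/* Integer average; 0 when no sample has been recorded. */
uint64_t lat_avg(const struct lat_stats *s)
{
	return s->count ? s->sum_ns / s->count : 0;
}

void lat_print(const char *name, const struct lat_stats *s)
{
	printf("%s: avg %" PRIu64 " ns, max %" PRIu64 " ns\n",
	       name, lat_avg(s), s->max_ns);
}

/*
 * Best effort: SCHED_FIFO usually needs privileges, so the caller should
 * treat a non-zero return as "running without real-time priority".
 */
int try_set_rt_priority(int prio)
{
	struct sched_param sp = { .sched_priority = prio };

	return sched_setscheduler(0, SCHED_FIFO, &sp);
}

/*
 * Time the lock acquisition alone (t1 - t0) and the whole
 * lock + critical section + unlock sequence (t2 - t0).
 */
void measure_mutex(pthread_mutex_t *m, unsigned long iters,
		   struct lat_stats *lock_only, struct lat_stats *full)
{
	for (unsigned long i = 0; i < iters; i++) {
		uint64_t t0 = now_ns();
		pthread_mutex_lock(m);
		uint64_t t1 = now_ns();
		/* The critical section under test would go here. */
		pthread_mutex_unlock(m);
		uint64_t t2 = now_ns();

		lat_record(lock_only, t1 - t0);
		lat_record(full, t2 - t0);
	}
}
```

A caller would invoke try_set_rt_priority(), run measure_mutex() on an
initialized mutex, then lat_print() both accumulators. As written, the loop
only exercises the uncontended case: the contending threads, the background
workload and the priority-inversion scenarios raised in the thread are not
covered and would have to be added around it.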