I've used the qsbr flavor of urcu with success. It works great. The docs and benchmark code make clear that RCU writers:

* must use `rcu_xchg_pointer()` or `rcu_assign_pointer()` to update the data structure
* must call `synchronize_rcu()`, which blocks until readers have left any critical section on the old copy, so that the old data structure can be cleaned up.

However, `synchronize_rcu()` is not giving me the write performance I want. But `tests/benchmark/test_urcu_defer.c` does. Cool!

Regrettably, the doc on `defer_rcu` just isn't clear to me. When `defer_rcu` runs the callback specified in its arguments, can I conclude, as with `synchronize_rcu()`, that no readers are in a critical section on the old data structure?

Is this the intended usage?

```
void deferCallBack(void *oldData) {
    Foo *old = (Foo *)oldData;
    // In this function I know for sure no RCU reader
    // is in a critical section on 'old'. I can free/mutate
    // it as needed
    ...
}

void rcuWriterLoop() {
    rcu_defer_register_thread();

    while (!done) {
        Foo *newCopy = ....

        // Prior to this line readers are not in a read critical section (CS)
        // or are in a CS on 'old'. On return readers are not in a CS or are
        // in a CS on newCopy only.
        Foo *old = rcu_xchg_pointer(&current, newCopy);

        // Clean up 'old' once readers can no longer be accessing it
        defer_rcu(deferCallBack, old);
    }

    rcu_defer_unregister_thread();
}
```
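
To pin down the semantics I'm asking about, here is a toy single-threaded model of what I'm hoping `defer_rcu` does. It does not use liburcu at all: every `toy_*` name is hypothetical, the "grace period" is just a reader counter, and whether the real `defer_rcu` behaves this way is exactly my question.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy model only: none of these names exist in liburcu. */
#define TOY_QUEUE_CAP 8

typedef void (*toy_cb)(void *);

struct toy_deferred {
    toy_cb fn;
    void *arg;
};

static struct toy_deferred toy_queue[TOY_QUEUE_CAP];
static size_t toy_queue_len = 0;
static int toy_active_readers = 0;  /* readers currently inside a critical section */
static int toy_freed_count = 0;     /* how many deferred frees have run */

static void toy_read_lock(void)   { toy_active_readers++; }
static void toy_read_unlock(void) { toy_active_readers--; }

/* Run queued callbacks, but only if no reader is inside a critical section.
 * Returns the number of callbacks executed (0 if a reader is still active). */
static int toy_flush(void)
{
    if (toy_active_readers > 0)
        return 0;
    int ran = 0;
    for (size_t i = 0; i < toy_queue_len; i++) {
        toy_queue[i].fn(toy_queue[i].arg);
        ran++;
    }
    toy_queue_len = 0;
    return ran;
}

/* Queue a callback without waiting. Returns -1 if the queue is full and
 * cannot be flushed because a reader is still active. */
static int toy_defer(toy_cb fn, void *arg)
{
    if (toy_queue_len == TOY_QUEUE_CAP && toy_flush() == 0)
        return -1;
    toy_queue[toy_queue_len].fn = fn;
    toy_queue[toy_queue_len].arg = arg;
    toy_queue_len++;
    return 0;
}

/* The guarantee I hope for: no reader is active when this runs. */
static void toy_free_cb(void *p)
{
    assert(toy_active_readers == 0);
    free(p);
    toy_freed_count++;
}

/* One writer update while a reader still holds the old copy. */
static int toy_demo(void)
{
    int *old = malloc(sizeof *old);
    if (old == NULL)
        return -1;
    *old = 42;

    toy_read_lock();                      /* a reader may still see 'old' */
    if (toy_defer(toy_free_cb, old) != 0) {
        toy_read_unlock();
        free(old);
        return -1;
    }
    int ran_early = toy_flush();          /* reader active: nothing may run */
    toy_read_unlock();
    int ran_late = toy_flush();           /* no readers left: callback runs */

    return ran_early * 10 + ran_late;
}
```

In this model the writer never blocks on the reader; the cleanup is simply postponed until a flush finds no active readers. That is the behavior I would like confirmed (or corrected) for the real `defer_rcu`.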