* Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx> wrote: > On Sun, May 29, 2011 at 06:00:00PM +0300, Avi Kivity wrote: > > On 05/29/2011 05:27 PM, Ingo Molnar wrote: > > >* Avi Kivity<avi@xxxxxxxxxx> wrote: > > > > > >> I don't understand how you expect per_cpu to work in userspace. As > > >> soon as you calculate the per-cpu address, it can be invalidated. > > >> It doesn't help that you get a signal; you've already calculated > > >> the address. > > > > > >I was thinking of some sort of transactional mechanism, a tightly > > >controlled set of assembly instructions updating the percpu fields, > > >where the migration event would be able to 'unroll' incomplete > > >modifications done to the 'wrong' percpu data structure. (It would be > > >rather complex and every percpu op would have to be an atomic because > > >there's always the chance that it's executed on the wrong CPU.) > > > > > >But note that we do not even need any notification if there's a > > >(local) lock on the percpu fields: > > > > > >It will work because it's statistically percpu the lock will not > > >SMP-bounce between CPUs generally so it will be very fast to > > >acquire/release it, and we get the full cache benefits of percpu > > >variables. > > > > > >The migration notification would still be useful to detect grace > > >periods at natural points - but as Paul pointed out polling it via > > >SIGALRM works as well. The two (migration and SIGALRM) could be > > >combined as well. > > > > I think it's way simpler to map cpu == thread. And in fact, when > > you run a Linux kernel in a kvm guest, that's what happens, since > > each vcpu _is_ a host thread. > > I have to agree with Avi here. If a stop_machine()-like approach is > going to work, the updates have to be very rare, so any additional > cache-nonlocality from having lots of threads should not be a problem. > Especially given that in this particular case, there are exactly as > many CPUs as threads anyway. The readers should only need to touch a > constant number of cache lines either way. > > Or am I missing something here? I was talking about the (still imaginery!) user-space tree-RCU code! :-) The stop_machine_run()-alike thing is for brlocks - for which Sasha sent patches already, see these patches on the kvm@xxxxxxxxxxxxxxx list: [PATCH 3/4] kvm tools: Add a brlock [PATCH 4/4] kvm tools: Use brlock in MMIO and IOPORT Wrt. brlocks, we'll keep them as simple as possible and indeed no involved tricks are needed AFAICS. read_lock() will be a compiler barrier(), that's as fast as it gets :-) Thanks, Ingo -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html