On Tue, Apr 11, 2017 at 01:34:49PM +0800, Paolo Bonzini wrote:
> kvm_arch_vcpu_should_kick() does cmpxchg, which already includes a
> memory barrier when it succeeds, so you need not add smp_mb() there.

When the cmpxchg() fails it only guarantees ACQUIRE semantics, meaning
the setting of the request may appear to happen after the cmpxchg()
completes. This would break our delicate two-variable (vcpu->requests,
vcpu->mode) memory barrier pattern, which prohibits a VCPU from entering
guest mode with a pending request and no IPI. IOW, on ARM we need an
explicit smp_mb() before the cmpxchg(); otherwise the pattern is
incomplete.

I think adding an smp_mb__before_atomic() should cover ARM and any other
relaxed memory model arches without impacting x86.

Thanks,
drew
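
For illustration, the two-variable pattern discussed above can be
sketched in userspace C11. This is only a minimal sketch under stated
assumptions, not the KVM code: struct vcpu_sketch and the three
functions are hypothetical names, and the C11 seq_cst fences merely
stand in for the kernel's smp_mb() and smp_mb__before_atomic().

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's vcpu->mode values. */
enum { OUTSIDE_GUEST_MODE, IN_GUEST_MODE, EXITING_GUEST_MODE };

/* Hypothetical reduction of a VCPU to the two variables in the pattern. */
struct vcpu_sketch {
	atomic_ulong requests;
	atomic_int mode;
};

static void vcpu_sketch_init(struct vcpu_sketch *v)
{
	atomic_init(&v->requests, 0UL);
	atomic_init(&v->mode, OUTSIDE_GUEST_MODE);
}

/*
 * Requester side: publish the request, then check the mode.
 * Returns true when the VCPU was in guest mode and needs a kick (IPI).
 */
static bool make_request_and_should_kick(struct vcpu_sketch *v, unsigned int req)
{
	int expected = IN_GUEST_MODE;

	atomic_fetch_or_explicit(&v->requests, 1UL << req, memory_order_relaxed);

	/*
	 * Full barrier before the compare-and-swap (assumed analogue of
	 * smp_mb__before_atomic()). It orders the request store before
	 * the mode read even when the compare-and-swap below fails.
	 */
	atomic_thread_fence(memory_order_seq_cst);

	/* Failure ordering is deliberately weak to mirror the concern. */
	return atomic_compare_exchange_strong_explicit(&v->mode, &expected,
						       EXITING_GUEST_MODE,
						       memory_order_seq_cst,
						       memory_order_relaxed);
}

/*
 * VCPU side: announce guest mode, then check for requests.
 * Returns true when it is safe to enter the guest.
 */
static bool try_enter_guest(struct vcpu_sketch *v)
{
	atomic_store_explicit(&v->mode, IN_GUEST_MODE, memory_order_relaxed);

	/* Full barrier between the mode store and the requests load. */
	atomic_thread_fence(memory_order_seq_cst);

	if (atomic_load_explicit(&v->requests, memory_order_relaxed) != 0) {
		/* A request is pending: back out instead of entering. */
		atomic_store_explicit(&v->mode, OUTSIDE_GUEST_MODE,
				      memory_order_relaxed);
		return false;
	}
	return true;
}
```

With a full barrier on both sides, at least one party must observe the
other's store: either the VCPU sees the request and backs out, or the
requester sees IN_GUEST_MODE and kicks. Dropping the fence in
make_request_and_should_kick() would leave the failed compare-and-swap
path without that ordering, which is the gap described above.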