On Fri, 23 Sep 2022 00:46:58 +0100,
Gavin Shan <gshan@xxxxxxxxxx> wrote:
> 
> Hi Peter,
> 
> On 9/23/22 7:38 AM, Peter Xu wrote:
> > On Thu, Sep 22, 2022 at 06:01:28PM +0100, Marc Zyngier wrote:
> >> The current implementation of the dirty ring has an implicit requirement
> >> that stores to the dirty ring from userspace must be:
> >> 
> >> - ordered with one another
> >> 
> >> - visible from another CPU executing a ring reset
> >> 
> >> While these implicit requirements work well for x86 (and any other
> >> TSO-like architecture), they do not work for more relaxed architectures
> >> such as arm64, where stores to different addresses can be freely
> >> reordered, and loads from these addresses may not observe writes from
> >> another CPU unless the required barriers (or acquire/release semantics)
> >> are used.
> >> 
> >> In order to start fixing this, upgrade the ring reset accesses:
> >> 
> >> - the kvm_dirty_gfn_harvested() helper now uses acquire semantics
> >>   so it is ordered after all previous writes, including that from
> >>   userspace
> >> 
> >> - the kvm_dirty_gfn_set_invalid() helper now uses release semantics
> >>   so that the next_slot and next_offset reads don't drift past
> >>   the entry invalidation
> >> 
> >> This is only a partial fix as the userspace side also needs upgrading.
> > 
> > Paolo has one fix, 4802bf910e ("KVM: dirty ring: add missing memory
> > barrier", 2022-09-01), which has already landed.
> > 
> > I think the other one, to reset it, was lost too. I just posted a patch:
> > 
> > https://lore.kernel.org/qemu-devel/20220922213522.68861-1-peterx@xxxxxxxxxx/
> > (link not yet available, but it should be soon)
> > 
> >> 
> >> Signed-off-by: Marc Zyngier <maz@xxxxxxxxxx>
> >> ---
> >>  virt/kvm/dirty_ring.c | 4 ++--
> >>  1 file changed, 2 insertions(+), 2 deletions(-)
> >> 
> >> diff --git a/virt/kvm/dirty_ring.c b/virt/kvm/dirty_ring.c
> >> index f4c2a6eb1666..784bed80221d 100644
> >> --- a/virt/kvm/dirty_ring.c
> >> +++ b/virt/kvm/dirty_ring.c
> >> @@ -79,12 +79,12 @@ static inline void kvm_dirty_gfn_set_invalid(struct kvm_dirty_gfn *gfn)
> >>  static inline void kvm_dirty_gfn_set_dirtied(struct kvm_dirty_gfn *gfn)
> >>  {
> >> -	gfn->flags = KVM_DIRTY_GFN_F_DIRTY;
> >> +	smp_store_release(&gfn->flags, KVM_DIRTY_GFN_F_DIRTY);
> > 
> > IIUC you meant kvm_dirty_gfn_set_invalid, as the comment says?
> > 
> > kvm_dirty_gfn_set_dirtied() has been guarded by smp_wmb() and AFAICT that's
> > already safe. Otherwise looks good to me.
> > 
> 
> If I'm understanding the full context, smp_store_release() also
> enforces ordering on 'gfn->flags' itself. It is needed by user space
> for the synchronization.

There are multiple things at play here:

- userspace needs a store-release when marking the flags 'harvested',
  so that the kernel, using a load-acquire, can observe this write
  (and avoid the roach-motel effect of a non-acquire load)

- the kernel needs a store-release when marking the flags 'invalid',
  preventing this write from occurring before the next_* fields have
  been sampled

A sketch of this acquire/release pairing follows at the end of this
message.

On the ring production side, there is a heavy-handed smp_wmb(), which
makes things pretty safe.

	M.

-- 
Without deviation from the norm, progress is not possible.
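
[Editor's note: for illustration, a minimal sketch of the acquire/release
pairing described above. The kernel-side helpers follow what the discussion
says the fix should look like in virt/kvm/dirty_ring.c; the userspace
harvest_one() loop and its collect() consumer are hypothetical, and use
GCC/Clang __atomic builtins rather than any particular VMM's wrappers.]

	/* Kernel side (sketch of the intended fix, per the discussion): */

	static inline void kvm_dirty_gfn_set_invalid(struct kvm_dirty_gfn *gfn)
	{
		/* Release: next_slot/next_offset are sampled before this store. */
		smp_store_release(&gfn->flags, 0);
	}

	static inline bool kvm_dirty_gfn_harvested(struct kvm_dirty_gfn *gfn)
	{
		/* Acquire: ordered after userspace's store-release of RESET. */
		return smp_load_acquire(&gfn->flags) & KVM_DIRTY_GFN_F_RESET;
	}

	/* Userspace side (hypothetical harvester, GCC/Clang builtins): */

	#include <stdint.h>
	#include <linux/kvm.h>	/* struct kvm_dirty_gfn, KVM_DIRTY_GFN_F_* */

	void collect(uint32_t slot, uint64_t offset);	/* hypothetical consumer */

	static void harvest_one(struct kvm_dirty_gfn *gfn)
	{
		/* Acquire: don't read slot/offset before observing DIRTY. */
		uint32_t flags = __atomic_load_n(&gfn->flags, __ATOMIC_ACQUIRE);

		if (!(flags & KVM_DIRTY_GFN_F_DIRTY))
			return;

		collect(gfn->slot, gfn->offset);

		/*
		 * Release: the slot/offset reads above complete before the
		 * kernel can observe RESET; this pairs with the load-acquire
		 * in kvm_dirty_gfn_harvested().
		 */
		__atomic_store_n(&gfn->flags, KVM_DIRTY_GFN_F_RESET,
				 __ATOMIC_RELEASE);
	}

Without the acquire on the userspace load, the slot/offset reads could be
speculated past the flags check on a relaxed architecture such as arm64;
without the release on the RESET store, the kernel's reset path could see
RESET before the entry had actually been consumed.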