On Tue, Dec 17, 2024, Maxim Levitsky wrote: > On Fri, 2024-12-13 at 17:07 -0800, Sean Christopherson wrote: > > Sync the new iteration to the guest prior to restarting the vCPU, otherwise > > it's possible for the vCPU to dirty memory for the next iteration using the > > current iteration's value. > > > > Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx> > > --- > > tools/testing/selftests/kvm/dirty_log_test.c | 4 ++-- > > 1 file changed, 2 insertions(+), 2 deletions(-) > > > > diff --git a/tools/testing/selftests/kvm/dirty_log_test.c b/tools/testing/selftests/kvm/dirty_log_test.c > > index cdae103314fc..41c158cf5444 100644 > > --- a/tools/testing/selftests/kvm/dirty_log_test.c > > +++ b/tools/testing/selftests/kvm/dirty_log_test.c > > @@ -859,9 +859,9 @@ static void run_test(enum vm_guest_mode mode, void *arg) > > */ > > if (++iteration == p->iterations) > > WRITE_ONCE(host_quit, true); > > - > > - sem_post(&sem_vcpu_cont); > > sync_global_to_guest(vm, iteration); > > + > > + sem_post(&sem_vcpu_cont); > > } > > > > pthread_join(vcpu_thread, NULL); > > > AFAIK, this patch doesn't 100% gurantee that this won't happen: > > The READ_ONCE that guest uses only guarntees no wierd compiler optimizations > are used. The guest can still read the iteration value to a register, get > #vmexit, after which the iteration will be increased and then write the old > value. Hmm, right, it's not 100% guaranteed because of the register caching angle. But it does guarantee that at most only write can retire with the previous iteration, and patch 1 from you addresses that issue, so I think this is solid? Assuming we end up going with the "collect everything for the current iteration", I'll expand the changelog to call out the dependency along with exactly what protection this does and does not provide > Is this worth to reorder this to decrease the chances of this happening? I am > not sure, as this will just make this problem rarer and thus harder to debug. > Currently the test just assumes that this can happen and deals with this. The test deals with it by effectively disabling verification. IMO, that's just hacking around a bug.