Hi Sean,

We ran into a kvm/selftests/rseq_test failure on ARM64, as shown below. I'm
not sure if you have a quick idea about the root cause. The issue can't be
reproduced 100% of the time; it only happens occasionally.

host# uname -r
5.19.0-rc6-gavin+
host# cat /proc/cpuinfo | grep processor | tail -n 1
processor	: 223
host# pwd
/home/gavin/sandbox/linux.main/tools/testing/selftests/kvm
host# for i in `seq 1 100`; do echo "--------> $i"; ./rseq_test; sleep 3; done
--------> 1
--------> 2
--------> 3
--------> 4
--------> 5
--------> 6
==== Test Assertion Failure ====
  rseq_test.c:265: rseq_cpu == cpu
  pid=3925 tid=3925 errno=4 - Interrupted system call
     1	0x0000000000401963: main at rseq_test.c:265 (discriminator 2)
     2	0x0000ffffb044affb: ?? ??:0
     3	0x0000ffffb044b0c7: ?? ??:0
     4	0x0000000000401a6f: _start at ??:?
  rseq CPU = 4, sched CPU = 27

Looking at tools/testing/selftests/kvm/rseq_test.c, it seems we assume that
the 'main' and 'migration_worker' threads are synchronized on the migration
state. However, I'm not sure that's correct. I think it's still possible
for a migration of the 'main' thread to happen between the calls to
sched_getcpu() and READ_ONCE()?

int main(int argc, char *argv[])
{
        :
        for (i = 0; !done; i++) {
                :
                do {
                        /*
                         * Drop bit 0 to force a mismatch if the count is odd,
                         * i.e. if a migration is in-progress.
                         */
                        snapshot = atomic_read(&seq_cnt) & ~1;

                        /*
                         * Ensure reading sched_getcpu() and rseq.cpu_id
                         * complete in a single "no migration" window, i.e. are
                         * not reordered across the seq_cnt reads.
                         */
                        smp_rmb();
                        cpu = sched_getcpu();
                        /* process migration may happen after sched_getcpu()? */
                        rseq_cpu = READ_ONCE(__rseq.cpu_id);
                        smp_rmb();
                } while (snapshot != atomic_read(&seq_cnt));
        }
}

Thanks,
Gavin

_______________________________________________
kvmarm mailing list
kvmarm@xxxxxxxxxxxxxxxxxxxxx
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm