Hi Sean,

We ran into a kvm/selftests/rseq_test failure on ARM64, as shown below. I'm
not sure if you have a quick idea about the root cause. The issue can't be
reproduced 100% of the time; it only happens occasionally.

host# uname -r
5.19.0-rc6-gavin+
host# cat /proc/cpuinfo | grep processor | tail -n 1
processor	: 223
host# pwd
/home/gavin/sandbox/linux.main/tools/testing/selftests/kvm
host# for i in `seq 1 100`; do echo "--------> $i"; ./rseq_test; sleep 3; done
--------> 1
--------> 2
--------> 3
--------> 4
--------> 5
--------> 6
==== Test Assertion Failure ====
  rseq_test.c:265: rseq_cpu == cpu
  pid=3925 tid=3925 errno=4 - Interrupted system call
     1	0x0000000000401963: main at rseq_test.c:265 (discriminator 2)
     2	0x0000ffffb044affb: ?? ??:0
     3	0x0000ffffb044b0c7: ?? ??:0
     4	0x0000000000401a6f: _start at ??:?
  rseq CPU = 4, sched CPU = 27

Looking at tools/testing/selftests/kvm/rseq_test.c, it seems we assume that
the 'main' and 'migration_worker' threads are synchronized on the migration
state. However, I'm not sure that's correct. I think it's still possible
for a migration of the 'main' thread to happen between the calls to
sched_getcpu() and READ_ONCE()?

int main(int argc, char *argv[])
{
        :
        for (i = 0; !done; i++) {
                :
                do {
                        /*
                         * Drop bit 0 to force a mismatch if the count is odd,
                         * i.e. if a migration is in-progress.
                         */
                        snapshot = atomic_read(&seq_cnt) & ~1;

                        /*
                         * Ensure reading sched_getcpu() and rseq.cpu_id
                         * complete in a single "no migration" window, i.e. are
                         * not reordered across the seq_cnt reads.
                         */
                        smp_rmb();
                        cpu = sched_getcpu();
                        /* process migration may happen after sched_getcpu()? */
                        rseq_cpu = READ_ONCE(__rseq.cpu_id);
                        smp_rmb();
                } while (snapshot != atomic_read(&seq_cnt));
        }
}

Thanks,
Gavin

_______________________________________________
kvmarm mailing list
kvmarm@xxxxxxxxxxxxxxxxxxxxx
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm