On Tue, Jul 19, 2022 at 10:08:30AM +0800, Gavin Shan wrote: > In rseq_test, there are two threads, which are vCPU thread and migration > worker separately. Unfortunately, the test has the wrong PID passed to > sched_setaffinity() in the migration worker. It forces migration on the > migration worker because zeroed PID represents the calling thread, which > is the migration worker itself. It means the vCPU thread is never enforced > to migration and it can migrate at any time, which eventually leads to > failure as the following logs show. > > host# uname -r > 5.19.0-rc6-gavin+ > host# # cat /proc/cpuinfo | grep processor | tail -n 1 > processor : 223 > host# pwd > /home/gavin/sandbox/linux.main/tools/testing/selftests/kvm > host# for i in `seq 1 100`; do \ > echo "--------> $i"; ./rseq_test; done > --------> 1 > --------> 2 > --------> 3 > --------> 4 > --------> 5 > --------> 6 > ==== Test Assertion Failure ==== > rseq_test.c:265: rseq_cpu == cpu > pid=3925 tid=3925 errno=4 - Interrupted system call > 1 0x0000000000401963: main at rseq_test.c:265 (discriminator 2) > 2 0x0000ffffb044affb: ?? ??:0 > 3 0x0000ffffb044b0c7: ?? ??:0 > 4 0x0000000000401a6f: _start at ??:? > rseq CPU = 4, sched CPU = 27 > > Fix the issue by passing correct parameter, TID of the vCPU thread, to > sched_setaffinity() in the migration worker. > > Fixes: 61e52f1630f5 ("KVM: selftests: Add a test for KVM_RUN+rseq to detect task migration bugs") > Suggested-by: Sean Christopherson <seanjc@xxxxxxxxxx> > Signed-off-by: Gavin Shan <gshan@xxxxxxxxxx> > Reviewed-by: Oliver Upton <oliver.upton@xxxxxxxxx> > --- > v4: Pick the code change as Sean suggested. > --- > tools/testing/selftests/kvm/rseq_test.c | 8 +++++--- > 1 file changed, 5 insertions(+), 3 deletions(-) > > diff --git a/tools/testing/selftests/kvm/rseq_test.c b/tools/testing/selftests/kvm/rseq_test.c > index 4158da0da2bb..2237d1aac801 100644 > --- a/tools/testing/selftests/kvm/rseq_test.c > +++ b/tools/testing/selftests/kvm/rseq_test.c > @@ -82,8 +82,9 @@ static int next_cpu(int cpu) > return cpu; > } > > -static void *migration_worker(void *ign) > +static void *migration_worker(void *__rseq_tid) > { > + pid_t rseq_tid = (pid_t)(unsigned long)__rseq_tid; > cpu_set_t allowed_mask; > int r, i, cpu; > > @@ -106,7 +107,7 @@ static void *migration_worker(void *ign) > * stable, i.e. while changing affinity is in-progress. > */ > smp_wmb(); > - r = sched_setaffinity(0, sizeof(allowed_mask), &allowed_mask); > + r = sched_setaffinity(rseq_tid, sizeof(allowed_mask), &allowed_mask); > TEST_ASSERT(!r, "sched_setaffinity failed, errno = %d (%s)", > errno, strerror(errno)); > smp_wmb(); > @@ -231,7 +232,8 @@ int main(int argc, char *argv[]) > vm = vm_create_default(VCPU_ID, 0, guest_code); > ucall_init(vm, NULL); > > - pthread_create(&migration_thread, NULL, migration_worker, 0); > + pthread_create(&migration_thread, NULL, migration_worker, > + (void *)(unsigned long)gettid()); > > for (i = 0; !done; i++) { > vcpu_run(vm, VCPU_ID); > -- > 2.23.0 Thanks for figuring this out Gavin and Sean! Reviewed-by: Andrew Jones <andrew.jones@xxxxxxxxx>