On Wed, 2022-11-02 at 12:24 +0800, Gavin Shan wrote: > Hi Robert, > > On 11/2/22 10:01 AM, Robert Hoo wrote: > > vDSO getcpu() has been in Kernel since 2.6.19, which we can assume > > generally available. > > Use vDSO getcpu() to reduce the overhead, so that vcpu thread > > stalls less > > therefore can have more odds to hit the race condition. > > > > It would be nice to provide more context to explain how the race > condition is caused. OK. How about this? ... hit the race condition that vcpu_run() inside need to handle pcpu migration triggered by sched_setaffinity() in migration thread. > > > Fixes: 0fcc102923de ("KVM: selftests: Use getcpu() instead of > > sched_getcpu() in rseq_test") > > Signed-off-by: Robert Hoo <robert.hu@xxxxxxxxxxxxxxx> > > --- > > tools/testing/selftests/kvm/rseq_test.c | 32 ++++++++++++++++++ > > ------- > > 1 file changed, 24 insertions(+), 8 deletions(-) > > > > diff --git a/tools/testing/selftests/kvm/rseq_test.c > > b/tools/testing/selftests/kvm/rseq_test.c > > index 6f88da7e60be..0b68a6b19b31 100644 > > --- a/tools/testing/selftests/kvm/rseq_test.c > > +++ b/tools/testing/selftests/kvm/rseq_test.c > > @@ -42,15 +42,29 @@ static void guest_code(void) > > } > > > > /* > > - * We have to perform direct system call for getcpu() because it's > > - * not available until glic 2.29. > > + * getcpu() was added in kernel 2.6.19. glibc support wasn't there > > + * until glibc 2.29. > > + * We can direct call it from vdso to ease gblic dependency. > > + * > > + * vdso manipulation code refers from > > selftests/x86/test_vsyscall.c > > */ > > -static void sys_getcpu(unsigned *cpu) > > -{ > > - int r; > > +typedef long (*getcpu_t)(unsigned *, unsigned *, void *); > > +static getcpu_t vdso_getcpu; > > > > - r = syscall(__NR_getcpu, cpu, NULL, NULL); > > - TEST_ASSERT(!r, "getcpu failed, errno = %d (%s)", errno, > > strerror(errno)); > > +static void init_vdso(void) > > +{ > > + void *vdso = dlopen("linux-vdso.so.1", RTLD_LAZY | RTLD_LOCAL | > > + RTLD_NOLOAD); > > + if (!vdso) > > + vdso = dlopen("linux-gate.so.1", RTLD_LAZY | RTLD_LOCAL > > | > > + RTLD_NOLOAD); > > + if (!vdso) > > + TEST_ASSERT(!vdso, "failed to find vDSO\n"); > > + > > + vdso_getcpu = (getcpu_t)dlsym(vdso, "__vdso_getcpu"); > > + if (!vdso_getcpu) > > + TEST_ASSERT(!vdso_getcpu, > > + "failed to find __vdso_getcpu in vDSO\n"); > > } > > > > As the comments say, vdso manipulation code comes from > selftests/x86/test_vsyscall.c. > I would guess 'linux-vdso.so.1' and 'linux-gate.so.1' are x86 > specific. If I'm correct, > the test case will fail on other architectures, including ARM64. > Ah, right, thanks. Fortunately ARM and x86 share same vDSO name, and we can define macros for variations. user ABI vDSO name ????????????????????????????? aarch64 linux-vdso.so.1 arm linux-vdso.so.1 ia64 linux-gate.so.1 mips linux-vdso.so.1 ppc/32 linux-vdso32.so.1 ppc/64 linux-vdso64.so.1 s390 linux-vdso32.so.1 s390x linux-vdso64.so.1 sh linux-gate.so.1 i386 linux-gate.so.1 x86-64 linux-vdso.so.1 x86/x32 linux-vdso.so.1 While unfortunately, looks like ARM vDSO doesn't have getcpu(). In that case, we might roll back to syscall(__NR_getcpu)? aarch64 functions The table below lists the symbols exported by the vDSO. symbol version -------------------------------------- __kernel_rt_sigreturn LINUX_2.6.39 __kernel_gettimeofday LINUX_2.6.39 __kernel_clock_gettime LINUX_2.6.39 __kernel_clock_getres LINUX_2.6.39 https://man7.org/linux/man-pages/man7/vdso.7.html > > static int next_cpu(int cpu) > > @@ -205,6 +219,8 @@ int main(int argc, char *argv[]) > > struct kvm_vcpu *vcpu; > > u32 cpu, rseq_cpu; > > > > + init_vdso(); > > + > > /* Tell stdout not to buffer its content */ > > setbuf(stdout, NULL); > > > > @@ -253,7 +269,7 @@ int main(int argc, char *argv[]) > > * across the seq_cnt reads. > > */ > > smp_rmb(); > > - sys_getcpu(&cpu); > > + vdso_getcpu(&cpu, NULL, NULL); > > rseq_cpu = rseq_current_cpu_raw(); > > smp_rmb(); > > } while (snapshot != atomic_read(&seq_cnt)); > > > > Thanks, > Gavin >