On Tue, Sep 07, 2021 at 09:14:54AM -0700, Raghavendra Rao Ananta wrote: > On Sun, Sep 5, 2021 at 11:39 PM Andrew Jones <drjones@xxxxxxxxxx> wrote: > > > > On Fri, Sep 03, 2021 at 01:53:27PM -0700, Raghavendra Rao Ananta wrote: > > > On Fri, Sep 3, 2021 at 4:05 AM Andrew Jones <drjones@xxxxxxxxxx> wrote: > > > > > > > > On Wed, Sep 01, 2021 at 09:14:12PM +0000, Raghavendra Rao Ananta wrote: > > > > > Since the timer stack (hardware and KVM) is per-CPU, there > > > > > are potential chances for races to occur when the scheduler > > > > > decides to migrate a vCPU thread to a different physical CPU. > > > > > Hence, include an option to stress-test this part as well by > > > > > forcing the vCPUs to migrate across physical CPUs in the > > > > > system at a particular rate. > > > > > > > > > > Originally, the bug for the fix with commit 3134cc8beb69d0d > > > > > ("KVM: arm64: vgic: Resample HW pending state on deactivation") > > > > > was discovered using arch_timer test with vCPU migrations and > > > > > can be easily reproduced. > > > > > > > > > > Signed-off-by: Raghavendra Rao Ananta <rananta@xxxxxxxxxx> > > > > > --- > > > > > .../selftests/kvm/aarch64/arch_timer.c | 108 +++++++++++++++++- > > > > > 1 file changed, 107 insertions(+), 1 deletion(-) > > > > > > > > > > diff --git a/tools/testing/selftests/kvm/aarch64/arch_timer.c b/tools/testing/selftests/kvm/aarch64/arch_timer.c > > > > > index 1383f33850e9..de246c7afab2 100644 > > > > > --- a/tools/testing/selftests/kvm/aarch64/arch_timer.c > > > > > +++ b/tools/testing/selftests/kvm/aarch64/arch_timer.c > > > > > @@ -14,6 +14,8 @@ > > > > > * > > > > > * The test provides command-line options to configure the timer's > > > > > * period (-p), number of vCPUs (-n), and iterations per stage (-i). > > > > > + * To stress-test the timer stack even more, an option to migrate the > > > > > + * vCPUs across pCPUs (-m), at a particular rate, is also provided. > > > > > * > > > > > * Copyright (c) 2021, Google LLC. > > > > > */ > > > > > @@ -24,6 +26,8 @@ > > > > > #include <pthread.h> > > > > > #include <linux/kvm.h> > > > > > #include <linux/sizes.h> > > > > > +#include <linux/bitmap.h> > > > > > +#include <sys/sysinfo.h> > > > > > > > > > > #include "kvm_util.h" > > > > > #include "processor.h" > > > > > @@ -41,12 +45,14 @@ struct test_args { > > > > > int nr_vcpus; > > > > > int nr_iter; > > > > > int timer_period_ms; > > > > > + int migration_freq_ms; > > > > > }; > > > > > > > > > > static struct test_args test_args = { > > > > > .nr_vcpus = NR_VCPUS_DEF, > > > > > .nr_iter = NR_TEST_ITERS_DEF, > > > > > .timer_period_ms = TIMER_TEST_PERIOD_MS_DEF, > > > > > + .migration_freq_ms = 0, /* Turn off migrations by default */ > > > > > > > > I'd rather we enable good tests like these by default. > > > > > > > Well, that was my original idea, but I was concerned about the ease > > > for diagnosing > > > things since it'll become too noisy. And so I let it as a personal > > > preference. But I can > > > include it back and see how it goes. > > > > Break the tests into two? One run without migration and one with. If > > neither work, then we can diagnose the one without first, possibly > > avoiding the need to diagnose the one with. > > > Right. I guess that's where the test's migration switch can come in handy. > Let's turn migration ON by default. If we see a failure, we can turn > OFF migration > and run the tests. I'll try to include some verbose printing for diagnosability. > Sounds good. You can also consider using pr_debug if you feel the need to balance verbosity with diagnostics. Thanks, drew