On Tue, 2022-03-08 at 11:31 +0100, Claudio Imbrenda wrote: > On Mon, 07 Mar 2022 14:03:45 -0500 > Eric Farman <farman@xxxxxxxxxxxxx> wrote: > > > On Mon, 2022-03-07 at 16:30 +0100, Claudio Imbrenda wrote: > > > On Thu, 3 Mar 2022 22:04:23 +0100 > > > Eric Farman <farman@xxxxxxxxxxxxx> wrote: > > > > > > > When stopping a CPU, kvm-unit-tests serializes/waits for > > > > everything > > > > to finish, in order to get a consistent result whenever those > > > > functions are used. > > > > > > > > But to test the SIGP STOP itself, these additional measures > > > > could > > > > mask other problems. For example, did the STOP work, or is the > > > > CPU > > > > still operating? > > > > > > > > Let's create a non-waiting SIGP STOP and use it here, to ensure > > > > that > > > > the CPU is correctly stopped. A smp_cpu_stopped() call will > > > > still > > > > be used to see that the SIGP STOP has been processed, and the > > > > state > > > > of the CPU can be used to determine whether the test > > > > passes/fails. > > > > > > > > Signed-off-by: Eric Farman <farman@xxxxxxxxxxxxx> > > > > --- > > > > lib/s390x/smp.c | 25 +++++++++++++++++++++++++ > > > > lib/s390x/smp.h | 1 + > > > > s390x/smp.c | 10 ++-------- > > > > 3 files changed, 28 insertions(+), 8 deletions(-) > > > > > > > > diff --git a/lib/s390x/smp.c b/lib/s390x/smp.c > > > > index 368d6add..84e536e8 100644 > > > > --- a/lib/s390x/smp.c > > > > +++ b/lib/s390x/smp.c > > > > @@ -119,6 +119,31 @@ int smp_cpu_stop(uint16_t idx) > > > > return rc; > > > > } > > > > > > > > +/* > > > > + * Functionally equivalent to smp_cpu_stop(), but without the > > > > + * elements that wait/serialize matters itself. > > > > + * Used to see if KVM itself is serialized correctly. > > > > + */ > > > > +int smp_cpu_stop_nowait(uint16_t idx) > > > > +{ > > > > + /* refuse to work on the boot CPU */ > > > > + if (idx == 0) > > > > + return -1; > > > > + > > > > + spin_lock(&lock); > > > > + > > > > + /* Don't suppress a CC2 with sigp_retry() */ > > > > + if (smp_sigp(idx, SIGP_STOP, 0, NULL)) { > > > > + spin_unlock(&lock); > > > > + return -1; > > > > + } > > > > + > > > > + cpus[idx].active = false; > > > > + spin_unlock(&lock); > > > > + > > > > + return 0; > > > > +} > > > > + > > > > int smp_cpu_stop_store_status(uint16_t idx) > > > > { > > > > int rc; > > > > diff --git a/lib/s390x/smp.h b/lib/s390x/smp.h > > > > index 1e69a7de..bae03dfd 100644 > > > > --- a/lib/s390x/smp.h > > > > +++ b/lib/s390x/smp.h > > > > @@ -44,6 +44,7 @@ bool smp_sense_running_status(uint16_t idx); > > > > int smp_cpu_restart(uint16_t idx); > > > > int smp_cpu_start(uint16_t idx, struct psw psw); > > > > int smp_cpu_stop(uint16_t idx); > > > > +int smp_cpu_stop_nowait(uint16_t idx); > > > > int smp_cpu_stop_store_status(uint16_t idx); > > > > int smp_cpu_destroy(uint16_t idx); > > > > int smp_cpu_setup(uint16_t idx, struct psw psw); > > > > diff --git a/s390x/smp.c b/s390x/smp.c > > > > index 50811bd0..11c2c673 100644 > > > > --- a/s390x/smp.c > > > > +++ b/s390x/smp.c > > > > @@ -76,14 +76,8 @@ static void test_restart(void) > > > > > > > > static void test_stop(void) > > > > { > > > > - smp_cpu_stop(1); > > > > - /* > > > > - * The smp library waits for the CPU to shut down, but > > > > let's > > > > - * also do it here, so we don't rely on the library > > > > - * implementation > > > > - */ > > > > - while (!smp_cpu_stopped(1)) {} > > > > - report_pass("stop"); > > > > + smp_cpu_stop_nowait(1); > > > > > > can it happen that the SIGP STOP order is accepted, but the > > > target > > > CPU > > > is still running (and not even busy)? > > > > Of course. A SIGP that's processed by userspace (which is many of > > them) > > injects a STOP IRQ back to the kernel, which means the CPU might > > not be > > stopped for some time. But... > > > > > > > > > + report(smp_cpu_stopped(1), "stop"); > > > > > > e.g. can this ^ check race with the actual stopping of the CPU? > > > > ...the smp_cpu_stopped() routine now loops on the CC2 that SIGP > > SENSE > > returns because of that pending IRQ. If SIGP SENSE returns CC0/1, > > then > > the CPU can correctly be identified stopped/operating, and the test > > can > > correctly pass/fail. > > my question was: is it possible architecturally that there is a > window > where the STOP order is accepted, but a SENSE on the target CPU still > successfully returns that the CPU is running? Not to my knowledge. The "Conditions Determining Response" section of POPS (v12, page 4-95; also below) specifically states that the SIGP SENSE will return CC2 when a SIGP STOP order is outstanding. In our implementation, the stop IRQ will be injected before the STOP order gets accepted, such that a SIGP SENSE would see it immediately. """ A previously issued start, stop, restart, stop- and-store-status, set-prefix, store-status-at- address order, or store-additional-status-at- address has been accepted by the addressed CPU, and execution of the func- tion requested by the order has not yet been completed. ... If the currently specified order is sense, external call, emergency signal, start, stop, restart, stop and store status, set prefix, store status at address, set architecture, set multithreading, or store additional status at address, then the order is rejected, and condition code 2 is set. """ > > in other words: is it specified architecturally that, once an order > is > accepted for a target CPU, that CPU can't accept any other order (and > will return CC2), including SENSE, until the order has been completed > successfully? The POPS quote I placed above excludes store-extended-status-address, conditional-emergency-signal, and sense-running-status orders as returning a CC2. That's something Janosch and I have chatted about offline, but don't have an answer to at this time. Besides that, any other order would get a CC2 until the STOP has been completed. > > > > > > > > } > > > > > > > > static void test_stop_store_status(void)