On 02.04.20 17:12, David Hildenbrand wrote: > On 02.04.20 13:02, Christian Borntraeger wrote: >> make sure that sigp sense running status returns a sane value for >> stopped CPUs. To avoid potential races with the stop being processed we >> wait until sense running status is first 0. >> >> Signed-off-by: Christian Borntraeger <borntraeger@xxxxxxxxxx> >> --- >> lib/s390x/smp.c | 2 +- >> lib/s390x/smp.h | 2 +- >> s390x/smp.c | 13 +++++++++++++ >> 3 files changed, 15 insertions(+), 2 deletions(-) >> >> diff --git a/lib/s390x/smp.c b/lib/s390x/smp.c >> index 5ed8b7b..492cb05 100644 >> --- a/lib/s390x/smp.c >> +++ b/lib/s390x/smp.c >> @@ -58,7 +58,7 @@ bool smp_cpu_stopped(uint16_t addr) >> return !!(status & (SIGP_STATUS_CHECK_STOP|SIGP_STATUS_STOPPED)); >> } >> >> -bool smp_cpu_running(uint16_t addr) >> +bool smp_sense_running_status(uint16_t addr) >> { >> if (sigp(addr, SIGP_SENSE_RUNNING, 0, NULL) != SIGP_CC_STATUS_STORED) >> return true; >> diff --git a/lib/s390x/smp.h b/lib/s390x/smp.h >> index a8b98c0..639ec92 100644 >> --- a/lib/s390x/smp.h >> +++ b/lib/s390x/smp.h >> @@ -40,7 +40,7 @@ struct cpu_status { >> int smp_query_num_cpus(void); >> struct cpu *smp_cpu_from_addr(uint16_t addr); >> bool smp_cpu_stopped(uint16_t addr); >> -bool smp_cpu_running(uint16_t addr); >> +bool smp_sense_running_status(uint16_t addr); >> int smp_cpu_restart(uint16_t addr); >> int smp_cpu_start(uint16_t addr, struct psw psw); >> int smp_cpu_stop(uint16_t addr); >> diff --git a/s390x/smp.c b/s390x/smp.c >> index 79cdc1f..b4b1ff2 100644 >> --- a/s390x/smp.c >> +++ b/s390x/smp.c >> @@ -210,6 +210,18 @@ static void test_emcall(void) >> report_prefix_pop(); >> } >> >> +static void test_sense_running(void) >> +{ >> + report_prefix_push("sense_running"); >> + /* make sure CPU is stopped */ >> + smp_cpu_stop(1); >> + /* wait for stop to succeed. */ >> + while(smp_sense_running_status(1)); >> + report(!smp_sense_running_status(1), "CPU1 sense claims not running"); >> + report_prefix_pop(); >> +} >> + >> + >> /* Used to dirty registers of cpu #1 before it is reset */ >> static void test_func_initial(void) >> { >> @@ -319,6 +331,7 @@ int main(void) >> test_store_status(); >> test_ecall(); >> test_emcall(); >> + test_sense_running(); >> test_reset(); >> test_reset_initial(); >> smp_cpu_destroy(1); >> > > TBH, I am still not sure if this is completely free of races. > > Assume CPU 1 is in handle_stop() > > if (!kvm_s390_user_cpu_state_ctrl(vcpu->kvm)) > kvm_s390_vcpu_stop(vcpu); > // CPU 1: gets scheduled out. > // CPU 0: while(smp_sense_running_status(1)); finishes > // CPU 1: gets scheduled in to return to user space > return -EOPNOTSUPP; > // CPU 0: report(!smp_sense_running_status(1), "CPU1 sense claims not > running"); fails > > SIGP SENSE RUNNING is simply racy as hell and doesn't give you any > guarantees. Which is good enough for some performance improvements > (e.g., spinlocks). > > Now, I can queue this, but I wouldn't be surprised if we see random > failures at one point. Which would speak for Janoschs variant. Loop until non running at least once and then report success?