On Mon, Jan 06, 2020 at 11:02:16AM +0000, Alexandru Elisei wrote: > Hi, > > On 1/3/20 3:31 PM, Andrew Jones wrote: > > On Thu, Jan 02, 2020 at 06:11:21PM +0000, Andre Przywara wrote: > >> On Tue, 31 Dec 2019 16:09:37 +0000 > >> Alexandru Elisei <alexandru.elisei@xxxxxxx> wrote: > >> > >> Hi, > >> > >>> The psci test performs a series of CPU_ON/CPU_OFF cycles for CPU 1. This is > >>> done by setting the entry point for the CPU_ON call to the physical address > >>> of the C function cpu_psci_cpu_die. > >>> > >>> The compiler is well within its rights to use the stack when generating > >>> code for cpu_psci_cpu_die. > >> I am a bit puzzled: Is this an actual test failure at the moment? Or just a potential problem? Because I see it using the stack pointer in the generated code in lib/arm/psci.o. But the psci test seems to pass. Or is that just because the SP is somehow not cleared, because of some KVM implementation specifics? > > I think the test just doesn't care that the CPU is in an infinite > > exception loop. Indeed we should probably put the CPU into an > > infinite loop instead of attempting to call PSCI die with it, as > > the status of a dying or dead CPU may conflict with our expected > > status of the test. > > I don't like this idea, I'll explain below why. > > >> One more thing below ... > >> > >>> However, because no stack initialization has > >>> been done, the stack pointer is zero, as set by KVM when creating the VCPU. > >>> This causes a data abort without a change in exception level. The VBAR_EL1 > >>> register is also zero (the KVM reset value for VBAR_EL1), the MMU is off, > >>> and we end up trying to fetch instructions from address 0x200. > >>> > >>> At this point, a stage 2 instruction abort is generated which is taken to > >>> KVM. KVM interprets this as an instruction fetch from an I/O region, and > >>> injects a prefetch abort into the guest. Prefetch abort is a synchronous > >>> exception, and on guest return the VCPU PC will be set to VBAR_EL1 + 0x200, > >>> which is... 0x200. The VCPU ends up in an infinite loop causing a prefetch > >>> abort while fetching the instruction to service the said abort. > >>> > >>> cpu_psci_cpu_die is basically a wrapper over the HVC instruction, so > >>> provide an assembly implementation for the function which will serve as the > >>> entry point for CPU_ON. > >>> > >>> Signed-off-by: Alexandru Elisei <alexandru.elisei@xxxxxxx> > >>> --- > >>> arm/cstart.S | 7 +++++++ > >>> arm/cstart64.S | 7 +++++++ > >>> arm/psci.c | 5 +++-- > >>> 3 files changed, 17 insertions(+), 2 deletions(-) > >>> > >>> diff --git a/arm/cstart.S b/arm/cstart.S > >>> index 2c81d39a666b..dfef48e4dbb2 100644 > >>> --- a/arm/cstart.S > >>> +++ b/arm/cstart.S > >>> @@ -7,6 +7,7 @@ > >>> */ > >>> #define __ASSEMBLY__ > >>> #include <auxinfo.h> > >>> +#include <linux/psci.h> > >>> #include <asm/thread_info.h> > >>> #include <asm/asm-offsets.h> > >>> #include <asm/pgtable-hwdef.h> > >>> @@ -139,6 +140,12 @@ secondary_entry: > >>> blx r0 > >>> b do_idle > >>> > >>> +.global asm_cpu_psci_cpu_die > >>> +asm_cpu_psci_cpu_die: > >>> + ldr r0, =PSCI_0_2_FN_CPU_OFF > >>> + hvc #0 > >>> + b . > >> I am wondering if this implementation is actually too simple. Both the current implementation and the kernel clear at least the first three arguments to 0. > >> I failed to find a requirement for doing this (nothing in the SMCCC or the PSCI spec), but I guess it would make sense when looking at forward compatibility. > >> > >> At the very least it's a change in behaviour (ignoring the missing printf). > >> So shall we just clear r1, r2 and r3 here? (Same for arm64 below) > > If we were to keep this function, then I agree we should zero the > > registers, but as I said above, I think the proper fix for this issue is > > Regarding zero'ing unused arguments, I've explained my point of view in my reply > to Andre. > > > to just not call PSCI die. Rather we should drop into an infinite loop, > > which also doesn't use the stack. Maybe something like this will work > > > > diff --git a/arm/psci.c b/arm/psci.c > > index 5c1accb6cea4..74c179d4976c 100644 > > --- a/arm/psci.c > > +++ b/arm/psci.c > > @@ -79,10 +79,14 @@ static void cpu_on_secondary_entry(void) > > cpumask_set_cpu(cpu, &cpu_on_ready); > > while (!cpu_on_start) > > cpu_relax(); > > - cpu_on_ret[cpu] = psci_cpu_on(cpus[1], __pa(cpu_psci_cpu_die)); > > + cpu_on_ret[cpu] = psci_cpu_on(cpus[1], __pa(halt)); > > cpumask_set_cpu(cpu, &cpu_on_done); > > } > > > > +/* > > + * This test expects CPU1 to not have previously been boot, > > + * and when this test completes CPU1 will be stuck in halt. > > + */ > > static bool psci_cpu_on_test(void) > > { > > bool failed = false; > > @@ -104,7 +108,7 @@ static bool psci_cpu_on_test(void) > > cpu_on_start = 1; > > smp_mb(); > > > > - cpu_on_ret[0] = psci_cpu_on(cpus[1], __pa(cpu_psci_cpu_die)); > > + cpu_on_ret[0] = psci_cpu_on(cpus[1], __pa(halt)); > > cpumask_set_cpu(0, &cpu_on_done); > > > > while (!cpumask_full(&cpu_on_done)) > > Trying to turn a CPU on and off concurrently from separate threads seems like a > nifty test to me, and good for catching possible races. With your version, 2 > threads are enough to get all possible results: success and CPU already on. > > Besides that, if my memory serves me right, this exact test was the reason for a > refactoring of the VCPU reset code, in commit 03fdfb269009 ("KVM: arm64: Don't > write junk to sysregs on reset"). It wasn't because there was a CPU_OFF involved. Two simultaneous CPU_ON's was enough. The first, successful CPU_ON reset the target VCPU causing the second CPU_ON to fail to look it up by MPIDR, resulting in PSCI_RET_INVALID_PARAMS. > > My preference would be to keep calling CPU_ON and CPU_OFF repeatedly. > Your analysis showed we never called CPU_OFF, because CPU1 got stuck in an exception loop. So we haven't actually done a repeated ON/OFF test yet. That's not a bad idea, but I also like a test that ensures a single successful CPU_ON, like the patch below would ensure if we halted instead of CPU_OFF'ed. It should be possible to have both tests if we do the CPU_OFF test first. Thanks, drew diff --git a/arm/psci.c b/arm/psci.c index 5c1accb6cea4..8a24860bde28 100644 --- a/arm/psci.c +++ b/arm/psci.c @@ -86,7 +86,7 @@ static void cpu_on_secondary_entry(void) static bool psci_cpu_on_test(void) { bool failed = false; - int cpu; + int cpu, ret_success = 0; cpumask_set_cpu(1, &cpu_on_ready); cpumask_set_cpu(1, &cpu_on_done); @@ -113,7 +113,12 @@ static bool psci_cpu_on_test(void) for_each_present_cpu(cpu) { if (cpu == 1) continue; - if (cpu_on_ret[cpu] != PSCI_RET_SUCCESS && cpu_on_ret[cpu] != PSCI_RET_ALREADY_ON) { + if (cpu_on_ret[cpu] == PSCI_RET_SUCCESS) { + if (++ret_success > 1) { + report_info("more than one cpu_on success"); + failed = true; + } + } else if (cpu_on_ret[cpu] != PSCI_RET_ALREADY_ON) { report_info("unexpected cpu_on return value: caller=CPU%d, ret=%d", cpu, cpu_on_ret[cpu]); failed = true; }