Re: Calxeda Midway crashes on boot with KVM on 3.14-rc

Andre Przywara <osp@xxxxxxxxx> · Thu, 20 Feb 2014 16:13:03 +0100

On Thu, 20 Feb 2014 10:54:53 +0000
Marc Zyngier <marc.zyngier@xxxxxxx> wrote:

> On 20/02/14 10:14, Marc Zyngier wrote:
> > Hi Lorenzo,
> > 
> > On 20/02/14 09:47, Lorenzo Pieralisi wrote:
> >> On Wed, Feb 19, 2014 at 09:34:21PM +0000, Rob Herring wrote:
> >>> On Wed, Feb 19, 2014 at 9:41 AM, Lorenzo Pieralisi
> >>
> >> [...]
> >>
> >>>> diff --git a/drivers/cpuidle/cpuidle-calxeda.c
> >>>> b/drivers/cpuidle/cpuidle-calxeda.c index 6e51114..5034f7a 100644
> >>>> --- a/drivers/cpuidle/cpuidle-calxeda.c
> >>>> +++ b/drivers/cpuidle/cpuidle-calxeda.c
> >>>> @@ -41,9 +41,12 @@ static int calxeda_pwrdown_idle(struct
> >>>> cpuidle_device *dev, struct cpuidle_driver *drv,
> >>>>                                 int index)
> >>>>  {
> >>>> +       int ret;
> >>>> +
> >>>>         cpu_pm_enter();
> >>>> -       cpu_suspend(0, calxeda_idle_finish);
> >>>> -       cpu_pm_exit();
> >>>> +       ret = cpu_suspend(0, calxeda_idle_finish);
> >>>> +       if (!ret)
> >>>> +               cpu_pm_exit();
> >>>
> >>> It seems a little strange that the enter does not have to be
> >>> balanced with an exit call. Couldn't the enter tear down things
> >>> that need to be re-enabled?
> >>
> >> That's why I mentioned it is not a proper fix, just wanted to
> >> check my assumptions. There are multiple solutions to this
> >> problem, but the only one which is proper IMHO is for KVM to
> >> either "reset registers" on cpu_pm_enter (CPU_PM_ENTER in
> >> hyp_init_cpu_pm_notifier), or to avoid executing the hyp init code
> >> if a proper registers settings is detected when the cpu_pm_exit is
> >> triggered (CPU_PM_EXIT).
> >>
> >> Is there a way for KVM to simply detect it has nothing to do (ie,
> >> detect hyp mode is already initialized and it has nothing to do) ?
> >>
> >> There is no way in the kernel to guarantee the processor has been
> >> reset, I could infer that from return paths (ie cpu_suspend
> >> returning 0) but that's still not correct.
> > 
> > We really need to distinguish two cases:
> > - CPU is coming back with its registers intact: nothing to do.
> > - CPU is coming back after having being reset and is coming back
> > from the secondary boot path: we need to reinstall the KVM vectors
> > and stack.
> > 
> >>From EL1, there is absolutely no chance we can guess what is the
> >>state
> > of EL2. So I guess there is only two solutions:
> > - Get an indication that the CPU was reset or not: can we actually
> > get that information? I have the idea of a per-cpu variable
> > indicating if we're coming from reset of not, but that feels
> > tricky. A hypercall indicating the state of the HYP mode (free:
> > re-init, engaged: do nothing) feels slightly better.
> > - Always deactivate KVM on PM_ENTER: bloody expensive (requires
> > switching EL2 to some form of idmap, deactivate the MMU, reinstall
> > the original vectors), specially if we already know that this is
> > useless.
> > 
> > So I'd favour the first solution. Thoughts?
> 
> A quick and dirty, fully untested patch:
> 
> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
> index 1d8248e..bd18bb8 100644
> --- a/arch/arm/kvm/arm.c
> +++ b/arch/arm/kvm/arm.c
> @@ -878,7 +878,8 @@ static int hyp_init_cpu_pm_notifier(struct
> notifier_block *self, unsigned long cmd,
>  				    void *v)
>  {
> -	if (cmd == CPU_PM_EXIT) {
> +	if (cmd == CPU_PM_EXIT &&
> +	    __hyp_get_vectors() == hyp_default_vectors) {
>  		cpu_init_hyp_mode(NULL);
>  		return NOTIFY_OK;
>  	}
> diff --git a/arch/arm/kvm/interrupts.S b/arch/arm/kvm/interrupts.S
> index ddc1553..25e8a47 100644
> --- a/arch/arm/kvm/interrupts.S
> +++ b/arch/arm/kvm/interrupts.S
> @@ -363,6 +363,10 @@ hyp_hvc:
>  host_switch_to_hyp:
>  	pop	{r0, r1, r2}
>  
> +	cmp	r0, #-1
> +	mrceq	p15, 4, r0, c12, c0, 0	@ get HVBAR
> +	beq	eret_host
> +
>  	push	{lr}
>  	mrs	lr, SPSR
>  	push	{lr}
> @@ -378,6 +382,7 @@ THUMB(	orr	lr, #1)
>  	pop	{lr}
>  	msr	SPSR_csxf, lr
>  	pop	{lr}
> +eret_host:
>  	eret
>  
>  guest_trap:
> 
> 
> 
> Andre, can you please back out Lorenzo's patch and test
> this one instead?

Marc, Lorenzo:

So both patches work. I had to disable my debug messages, though - they
are so many that they are blocking progress. So I cannot really say
what's going on now, but definitely still many suspends/resumes, but
the system looks fine and KVM seems to work.

I looked into the PSCI firmware code, indeed it does not do anything
except a wfi and returning -3 if only this core should be suspended.
There is code to power-down the whole cluster (all 4 cores) through the
management processor (including cache disabling and stuff), but there
is no real power-gating or even state/cache changes if just a single
core should be handled. In fact I wonder if this is wrong in the
firmware since it returns to lr instead to the address in r2 (even for
state type 1).

Rob, can you confirm this?

Regards,
Andre.
_______________________________________________
kvmarm mailing list
kvmarm@xxxxxxxxxxxxxxxxxxxxx
https://lists.cs.columbia.edu/cucslists/listinfo/kvmarm