On 05/06/2013 07:03:14 PM, Benjamin Herrenschmidt wrote:
> On Mon, 2013-05-06 at 18:53 -0500, Scott Wood wrote:
> >
> > > Ie. The last stage of entry will hard enable, so they should be
> > > soft-enabled too... if not, latency trackers will consider the
> > > whole guest periods as "interrupt disabled"...
> >
> > OK... I guess we already have that problem on 32-bit as well?
>
> 32-bit doesn't do lazy disable, so the situation is a lot easier
> there.

Right, but it still currently enters the guest with interrupts marked
as disabled, so we'd have the same latency tracker issue.
Another problem is that hard_irq_disable() doesn't call
trace_hardirqs_off()... We might want to fix that:

static inline void hard_irq_disable(void)
{
	__hard_irq_disable();
	if (get_paca()->soft_enabled)
		trace_hardirqs_off();
	get_paca()->soft_enabled = 0;
	get_paca()->irq_happened |= PACA_IRQ_HARD_DIS;
}

Is it possible there are places that assume the current behavior?
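
As a rough user-space illustration of the change proposed above (not kernel code), the sketch below models the two paca fields with a plain struct and counts calls to a stand-in for trace_hardirqs_off(). All model_ names and the flag value are assumptions made for this example. The only point is that the trace hook fires on the soft-enabled to soft-disabled transition and stays quiet on repeated calls.

```c
/* User-space model only; names and values are illustrative assumptions. */
#define MODEL_IRQ_HARD_DIS 0x01

struct model_paca {
	unsigned char soft_enabled;
	unsigned char irq_happened;
};

/* Start in the "interrupts soft-enabled, nothing pending" state. */
static struct model_paca model_paca = { .soft_enabled = 1, .irq_happened = 0 };

/* Counts how often the stand-in trace hook ran. */
static int model_trace_off_calls;

static void model_trace_hardirqs_off(void)
{
	model_trace_off_calls++;
}

static void model_hard_irq_disable(void)
{
	/* The real function would clear MSR[EE] here via __hard_irq_disable(). */
	if (model_paca.soft_enabled)
		model_trace_hardirqs_off();	/* only on the enabled -> disabled edge */
	model_paca.soft_enabled = 0;
	model_paca.irq_happened |= MODEL_IRQ_HARD_DIS;
}
```

Calling model_hard_irq_disable() twice in a row bumps the counter once, which is the behavior a latency tracer would need in order to see a single "off" event.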

> > We also don't want PACA_IRQ_HARD_DIS to be cleared the way
> > prep_irq_for_idle() does, because that's what lets the
> > local_irq_enable() do the hard-enabling after we exit the guest.
>
> Then set it again. Don't leave the kernel in a state where
> soft_enabled is 1 and irq_happened is non-zero. It might work in the
> specific KVM case we are looking at now because we know we are coming
> back via KVM exit and putting things right again but it's fragile,
> somebody will come back and break it, etc...

KVM is a pretty special case -- at least on booke, it's required that
all exits from guest state go through the KVM exception code. I think
it's less likely that that changes, than something breaks in the code
to fix up lazy ee state (especially since we've already seen the latter
happen).
I'll give it a shot, though.

> If necessary, create (or improve existing) helpers that do the right
> state adjustment. The cost of a couple of byte stores is negligible,
> I'd rather you make sure everything remains in sync at all times.

My concern was mainly about complexity -- it seemed simpler to just say
that during guest execution, the CPU is in a special state that is not
visible to anything that cares about lazy EE. The fact that EE can
actually be *off* and we still take the interrupt supports its
specialness. :-)
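
For reference, the "keep everything in sync" approach discussed above could be sketched in user space roughly as follows. This is a model under stated assumptions, not the eventual kernel helpers: the sketch_ names, the struct, and the flag value are hypothetical. The entry step is loosely patterned on what prep_irq_for_idle() is described as doing (clear PACA_IRQ_HARD_DIS and mark interrupts soft-enabled), and the exit step "sets it again" so that the later local_irq_enable() would still do the hard-enable.

```c
#include <stdbool.h>

/* User-space model only; names and values are illustrative assumptions. */
#define SKETCH_IRQ_HARD_DIS 0x01

struct sketch_paca {
	unsigned char soft_enabled;
	unsigned char irq_happened;
};

/* The state to avoid: soft_enabled == 1 while irq_happened is non-zero. */
static bool sketch_lazy_ee_consistent(const struct sketch_paca *p)
{
	return !(p->soft_enabled && p->irq_happened);
}

/*
 * Hypothetical guest-entry step, assumed to run with EE hard-disabled.
 * Returns false (leaving the state untouched) if anything other than
 * the hard-disable marker is pending, so the caller can bail out.
 */
static bool sketch_prepare_guest_entry(struct sketch_paca *p)
{
	if (p->irq_happened & ~SKETCH_IRQ_HARD_DIS)
		return false;
	p->irq_happened &= ~SKETCH_IRQ_HARD_DIS;
	p->soft_enabled = 1;
	return true;
}

/*
 * Hypothetical guest-exit step: two byte stores that put the lazy EE
 * state back to "soft-disabled, hard-disabled".
 */
static void sketch_fixup_after_guest_exit(struct sketch_paca *p)
{
	p->soft_enabled = 0;
	p->irq_happened |= SKETCH_IRQ_HARD_DIS;
}
```

In this model sketch_lazy_ee_consistent() holds both after a successful entry step and after the exit fixup, so there is no window in which soft_enabled is 1 while irq_happened is non-zero.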
-Scott