Re: [PATCH 2/4] KVM: x86/xen: Compatibility fixes for shared runstate area

Sean Christopherson <seanjc@xxxxxxxxxx> · Wed, 23 Nov 2022 19:32:11 +0000

On Wed, Nov 23, 2022, David Woodhouse wrote:
> On Wed, 2022-11-23 at 18:21 +0000, Sean Christopherson wrote:
> > On Sat, Nov 19, 2022, David Woodhouse wrote:
> ...
> 		/* When invoked from kvm_sched_out() we cannot sleep */
> 		if (atomic)
> 			return;
> ...
> > > +		/*
> > > +		 * Use kvm_gpc_activate() here because if the runstate
> > > +		 * area was configured in 32-bit mode and only extends
> > > +		 * to the second page now because the guest changed to
> > > +		 * 64-bit mode, the second GPC won't have been set up.
> > > +		 */
> > > +		if (kvm_gpc_activate(v->kvm, gpc2, NULL, KVM_HOST_USES_PFN,
> > > +				     gpc1->gpa + user_len1, user_len2))
> > 
> > I believe kvm_gpc_activate() needs to be converted from write_lock_irq() to
> > write_lock_irqsave() for this to be safe.
> 
> Hm, not sure I concur. You're only permitted to call kvm_gpc_activate()
> in a context where you can sleep. Interrupts had better not be disabled
> already, and it had better not need write_lock_irqsave().
> 
> In this particular context, we do drop all locks before calling
> kvm_gpc_activate(), and we don't call it at all in the scheduling-out
> case when we aren't permitted to sleep.

Oh, duh, there's an "if (atomic)" check right above this.

> > Side topic, why do all of these flows disable IRQs?
> 
> The gpc->lock is permitted to be taken in IRQ context by the users of
> the GPC. For example we do this for the shared_info and vcpu_info when
> passing interrupts directly through to the guest via
> kvm_arch_set_irq_inatomic() → kvm_xen_set_evtchn_fast().
> 
> The code path is fairly convoluted but I do believe we get there
> directly from the VFIO IRQ handler, via the custom wakeup handler on
> the irqfd's waitqueue (irqfd_wakeup).

Yeah, I remember finding that flow.

> Now, perhaps a GPC user which knows that *it* is not going to use its
> GPC from IRQ context, could refrain from bothering to disable
> interrupts when taking its own gpc->lock. And that's probably true for
> the runstate area.

This is effectively what I was asking about.

> The generic GPC code still needs to disable interrupts when taking a
> gpc->lock though, unless we want to add yet another flag to control
> that behaviour.

Right.  Might be worth adding a comment at some point to call out that disabling
IRQs may not be strictly required for all users, but it's done for simplicity.
Ah, if/when we add kvm_gpc_lock(), that would be the perfect place to document
the behavior.

Thanks!