Re: [Qemu-devel] [RFC] create a single workqueue for each vm to update vm irq routing table

Gleb Natapov <gleb@xxxxxxxxxxxxx> · Thu, 28 Nov 2013 11:20:35 +0200



On Thu, Nov 28, 2013 at 09:14:22AM +0000, Zhanghaoyu (A) wrote:
> >>>>>      No, this would be exactly the same code that is running now:
> >>>>>
> >>>>>              mutex_lock(&kvm->irq_lock);
> >>>>>              old = kvm->irq_routing;
> >>>>>              kvm_irq_routing_update(kvm, new);
> >>>>>              mutex_unlock(&kvm->irq_lock);
> >>>>>
> >>>>>              synchronize_rcu();
> >>>>>              kfree(old);
> >>>>>              return 0;
> >>>>>
> >>>>>      Except that the kfree would run in the call_rcu kernel thread instead of
> >>>>>      the vcpu thread.  But the vcpus already see the new routing table after
> >>>>>      the rcu_assign_pointer that is in kvm_irq_routing_update.
> >>>>>
> >>>>> I understood the proposal was also to eliminate the 
> >>>>> synchronize_rcu(), so while new interrupts would see the new 
> >>>>> routing table, interrupts already in flight could pick up the old one.
> >>>> Isn't that always the case with RCU?  (See my answer above: "the 
> >>>> vcpus already see the new routing table after the rcu_assign_pointer 
> >>>> that is in kvm_irq_routing_update").
> >>> With synchronize_rcu(), you have the additional guarantee that any 
> >>> parallel accesses to the old routing table have completed.  Since we 
> >>> also trigger the irq from rcu context, you know that after
> >>> synchronize_rcu() you won't get any interrupts to the old destination 
> >>> (see kvm_set_irq_inatomic()).
> >> We do not have this guaranty for other vcpus that do not call 
> >> synchronize_rcu(). They may still use outdated routing table while a 
> >> vcpu or iothread that performed table update sits in synchronize_rcu().
> >>
> >
> >Consider this guest code:
> >
> >   write msi entry, directing the interrupt away from this vcpu
> >   nop
> >   memset(&idt, 0, sizeof(idt));
> >
> >Currently, this code will never trigger a triple fault.  With the change to call_rcu(), it may.
> >
> >Now it may be that the guest does not expect this to work (PCI writes are posted; and interrupts can be delayed indefinitely by the pci fabric), 
> >but we don't know if there's a path that guarantees the guest something that we're taking away with this change.
> 
> In native environment, if a CPU's LAPIC's IRR and ISR have been pending many interrupts, then OS perform zeroing this CPU's IDT before receiving interrupts,
> will the same problem happen?
> 
This is just an example. OS can ensure that there is no other pending interrupt by some other means.

--
			Gleb.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html