On Thu, Sep 28, 2017 at 10:07:13AM +0200, Benjamin Herrenschmidt wrote:
> On Thu, 2017-09-28 at 11:45 +1000, David Gibson wrote:
> > On Tue, Sep 26, 2017 at 04:47:04PM +1000, Sam Bobroff wrote:
> > > In KVM's XICS-on-XIVE emulation, kvmppc_xive_get_xive() returns the
> > > value of state->guest_server as "server". However, this value is not
> > > set by its counterpart kvmppc_xive_set_xive(). When the guest uses
> > > this interface to migrate interrupts away from a CPU that is going
> > > offline, it sees all interrupts as belonging to CPU 0, so they are
> > > left assigned to (now) offline CPUs.
> > >
> > > This patch removes the guest_server field from the state, and returns
> > > act_server in its place (that is, the CPU actually handling the
> > > interrupt, which may differ from the one requested).
> > >
> > > Fixes: 5af50993850a ("KVM: PPC: Book3S HV: Native usage of the XIVE
> > > interrupt controller")
> > > Cc: stable@xxxxxxxxxxxxxxx
> > > Signed-off-by: Sam Bobroff <sam.bobroff@xxxxxxxxxxx>
> > > ---
> > > The other obvious way to patch this would be to set state->guest_server in
> > > kvmppc_xive_set_xive() and that does also work because act_server is usually
> > > equal to guest_server.
> > >
> > > However, in the cases where guest_server differed from act_server, the guest
> > > would only move IRQs correctly if it got act_server (the CPU actually handling
> > > the interrupt) here. So, that approach seemed better.
> >
> > Paolo, again this is a pretty urgent fix for KVM on Power and Paulus
> > is away. We're hoping BenH will ack shortly (he's the logical
> > technical reviewer), after which can you merge this direct into the
> > KVM staging tree? (RHBZ 1477391, and we suspect several more are
> > related).
>
> Acked-by: Benjamin Herrenschmidt <benh@xxxxxxxxxxxxxxxxxxx>
>
> As a subsequent cleanup we should probably rename act_server to server.
>
> Note: We know of a remaining theoretical race that isn't fixed yet with
> CPU unplug. If an interrupt is already in the queue of the CPU calling
> xics_migrate_irqs_away (guest), then that irq never gets pulled out of
> that queue and thus the bug this patch is fixing will re-occur.
>
> Fix isn't trivial, I'm working on it, though I'm tempted to make some
> assumptions about how Linux does things to keep it (much) simpler.
>
> I'll elaborate later (at Kernel Recipes right now)

Paolo,

Here's BenH's ack. Again, this is a pretty important fix for us, and
Paulus is away. Can you take this into the KVM tree, please?

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson