Since there are two issues competing here, there's no reason to reorder these. On Wed, Nov 8, 2017 at 1:24 PM, Steve Rutherford <srutherford@xxxxxxxxxx> wrote: > If an eoi on one cpu for irq x races with a guest reconfiguring the > redir entry that points to irq x, the reconfiguration might writeback > the high remote irr value that it read out in the first place. This > would leave the remote irr stuck high since there is no pending eoi > waiting to clear that. > > On Wed, Nov 8, 2017 at 1:52 AM, Nikita Leshchenko > <nikita.leshchenko@xxxxxxxxxx> wrote: >> >>> On 8 Nov 2017, at 2:16, Steve Rutherford <srutherford@xxxxxxxxxx> wrote: >>> >>> This is cleaner. Thanks for doing this. (Note that this is racy >>> without the read only remote irr change, so I have a slight preference >>> that swap the order of these patches.) >>> Reviewed-by: Steve Rutherford <srutherford@xxxxxxxxxx> >>> >> Thanks for your review. >> >> Can you explain why this is racy? >> >> I think that swapping the order of patches 4 (clearing remote irr) and >> 5 (making remote irr read only) will break Xen (and other guests that >> do explicit EOI) between the patches. Even though the current code is >> not semantically correct regarding remote irr behavior, it still makes >> explicit EOI work. If we swap the patches and apply patch 5 first, >> explicit EOI will break until we apply patch 4. >> >> Nikita >>> On Sun, Nov 5, 2017 at 7:30 PM, Wanpeng Li <kernellwp@xxxxxxxxx> wrote: >>>> 2017-11-05 21:52 GMT+08:00 Nikita Leshenko <nikita.leshchenko@xxxxxxxxxx>: >>>>> Some OSes (Linux, Xen) use this behavior to clear the Remote IRR bit for >>>>> IOAPICs without an EOI register. They simulate the EOI message manually >>>>> by changing the trigger mode to edge and then back to level, with the >>>>> entry being masked during this. >>>>> >>>>> QEMU implements this feature in commit ed1263c363c9 >>>>> ("ioapic: clear remote irr bit for edge-triggered interrupts") >>>>> >>>>> As a side effect, this commit removes an incorrect behavior where Remote >>>>> IRR was cleared when the redirection table entry was rewritten. This is not >>>>> consistent with the manual and also opens an opportunity for a strange >>>>> behavior when a redirection table entry is modified from an interrupt >>>>> handler that handles the same entry: The modification will clear the >>>>> Remote IRR bit even though the interrupt handler is still running. >>>>> >>>>> Signed-off-by: Nikita Leshenko <nikita.leshchenko@xxxxxxxxxx> >>>>> Reviewed-by: Liran Alon <liran.alon@xxxxxxxxxx> >>>>> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx> >>>> >>>> Reviewed-by: Wanpeng Li <wanpeng.li@xxxxxxxxxxx> >>>> >>>>> --- >>>>> arch/x86/kvm/ioapic.c | 11 ++++++++++- >>>>> 1 file changed, 10 insertions(+), 1 deletion(-) >>>>> >>>>> diff --git a/arch/x86/kvm/ioapic.c b/arch/x86/kvm/ioapic.c >>>>> index 6df150eaaa78..163d340ee5f8 100644 >>>>> --- a/arch/x86/kvm/ioapic.c >>>>> +++ b/arch/x86/kvm/ioapic.c >>>>> @@ -304,8 +304,17 @@ static void ioapic_write_indirect(struct kvm_ioapic *ioapic, u32 val) >>>>> } else { >>>>> e->bits &= ~0xffffffffULL; >>>>> e->bits |= (u32) val; >>>>> - e->fields.remote_irr = 0; >>>>> } >>>>> + >>>>> + /* >>>>> + * Some OSes (Linux, Xen) assume that Remote IRR bit will >>>>> + * be cleared by IOAPIC hardware when the entry is configured >>>>> + * as edge-triggered. This behavior is used to simulate an >>>>> + * explicit EOI on IOAPICs that don't have the EOI register. >>>>> + */ >>>>> + if (e->fields.trig_mode == IOAPIC_EDGE_TRIG) >>>>> + e->fields.remote_irr = 0; >>>>> + >>>>> mask_after = e->fields.mask; >>>>> if (mask_before != mask_after) >>>>> kvm_fire_mask_notifiers(ioapic->kvm, KVM_IRQCHIP_IOAPIC, index, mask_after); >>>>> -- >>>>> 2.13.3 >>>>> >>