On Fri, 07 May 2021 12:02:57 +0100,
Marc Zyngier <maz@xxxxxxxxxx> wrote:
> 
> On Fri, 07 May 2021 10:58:23 +0100,
> Shaokun Zhang <zhangshaokun@xxxxxxxxxxxxx> wrote:
> > 
> > Hi Marc,
> > 
> > Thanks for your quick reply.
> > 
> > On 2021/5/7 17:03, Marc Zyngier wrote:
> > > On Fri, 07 May 2021 06:57:04 +0100,
> > > Shaokun Zhang <zhangshaokun@xxxxxxxxxxxxx> wrote:
> > >>
> > >> [This letter comes from Nianyao Tang]
> > >>
> > >> Hi,
> > >>
> > >> Using GICv4/4.1 and msi capability, guest vf driver requires 3
> > >> vectors and enable msi, will lead to guest stuck.
> > > 
> > > Stuck how?
> > 
> > Guest serial does not response anymore and guest network shutdown.
> > 
> > > 
> > >> Qemu gets number of interrupts from Multiple Message Capable field
> > >> set by guest. This field is aligned to a power of 2(if a function
> > >> requires 3 vectors, it initializes it to 2).
> > > 
> > > So I guess this is a MultiMSI device with 4 vectors, right?
> > > 
> > 
> > Yes, it can support maximum of 32 msi interrupts, and vf driver only use 3 msi.
> > 
> > >> However, guest driver just sends 3 mapi-cmd to vits and 3 ite
> > >> entries is recorded in host. Vfio initializes msi interrupts using
> > >> the number of interrupts 4 provide by qemu. When it comes to the
> > >> 4th msi without ite in vits, in irq_bypass_register_producer,
> > >> producer and consumer will __connect fail, due to find_ite fail, and
> > >> do not resume guest.
> > > 
> > > Let me rephrase this to check that I understand it:
> > > - The device has 4 vectors
> > > - The guest only create mappings for 3 of them
> > > - VFIO calls kvm_vgic_v4_set_forwarding() for each vector
> > > - KVM doesn't have a mapping for the 4th vector and returns an error
> > > - VFIO disable this 4th vector
> > > 
> > > Is that correct? If yes, I don't understand why that impacts the guest
> > > at all. From what I can see, vfio_msi_set_vector_signal() just prints
> > > a message on the console and carries on.
> > > 
> > 
> > function calls:
> > --> vfio_msi_set_vector_signal
> > --> irq_bypass_register_producer
> > -->__connect
> > 
> > in __connect, add_producer finally calls kvm_vgic_v4_set_forwarding
> > and fails to get the 4th mapping. When add_producer fail, it does
> > not call cons->start, calls kvm_arch_irq_bypass_start and then
> > kvm_arm_resume_guest.
> 
> [+Eric, who wrote the irq_bypass infrastructure.]
> 
> Ah, so the guest is actually paused, not in a livelock situation
> (which is how I interpreted "stuck").
> 
> I think we should handle this case gracefully, as there should be no
> expectation that the guest will be using this interrupt. Given that
> VFIO seems to be pretty unfazed when a producer fails, I'm temped to
> do the same thing and restart the guest.
> 
> Also, __disconnect doesn't care about errors, so why should __connect
> have this odd behaviour?
> 
> Can you please try this? It is completely untested (and I think the
> del_consumer call is odd, which is why I've also dropped it).
> 
> Eric, what do you think?

Adding Zhu, Jason, MST to the party.

It all seems to be caused by this commit:

commit a979a6aa009f3c99689432e0cdb5402a4463fb88
Author: Zhu Lingshan <lingshan.zhu@xxxxxxxxx>
Date:   Fri Jul 31 14:55:33 2020 +0800

    irqbypass: do not start cons/prod when failed connect

    If failed to connect, there is no need to start consumer nor producer.

    Signed-off-by: Zhu Lingshan <lingshan.zhu@xxxxxxxxx>
    Suggested-by: Jason Wang <jasowang@xxxxxxxxxx>
    Link: https://lore.kernel.org/r/20200731065533.4144-7-lingshan.zhu@xxxxxxxxx
    Signed-off-by: Michael S. Tsirkin <mst@xxxxxxxxxx>

Zhu, I'd really like to understand why you think it is OK not to
restart consumers and producers when a connection has failed to be
established between the two?

In the case of KVM/arm64, this results in the guest being forever
suspended and never resumed. That's obviously not an acceptable
regression, as there are a number of benign reasons for a connect to
fail.

Thanks,

	M.

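To make the failure mode concrete, here is a minimal user-space sketch
of the stop/start pairing discussed above. It is not the kernel code:
struct consumer, cons_stop(), cons_start(), model_connect() and the
restart_on_failure flag are all hypothetical names invented for
illustration, and the "paused" field merely stands in for a halted
guest. It assumes, as described in this thread, that the consumer is
stopped before the connection attempt and is only resumed by its start
callback.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a consumer such as KVM. */
struct consumer {
	bool paused;              /* models a halted guest */
	int add_producer_result;  /* 0 on success, negative errno-style value on failure */
};

/* Hypothetical callbacks: stop pauses the consumer, start resumes it. */
void cons_stop(struct consumer *c)  { c->paused = true; }
void cons_start(struct consumer *c) { c->paused = false; }

/*
 * Model of a connect step (an assumption-laden sketch, not __connect itself).
 * With restart_on_failure == false, start is skipped when the connection
 * fails, so the consumer stays paused. With restart_on_failure == true,
 * every stop is paired with a start, whatever the outcome.
 */
int model_connect(struct consumer *c, bool restart_on_failure)
{
	int ret;

	cons_stop(c);
	ret = c->add_producer_result;   /* pretend the connection was attempted here */
	if (ret == 0 || restart_on_failure)
		cons_start(c);

	return ret;
}
```

With add_producer_result set to a negative value, calling
model_connect(&c, false) returns the error and leaves c.paused set,
which mirrors the reported hang. Calling model_connect(&c, true)
returns the same error but leaves the consumer running.
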
-- 
Without deviation from the norm, progress is not possible.
_______________________________________________
kvmarm mailing list
kvmarm@xxxxxxxxxxxxxxxxxxxxx
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm