Question: In a certain scenario, enabling GICv4/v4.1 may cause Guest hang when restarting the Guest

Hi everyone,

Here is a very valuable question about the direct injection of vLPIs.
Environment configuration:
1) A virtio_scsi device is passed through to a small-scale VM, 1U
2) Guest kernel 4.19, host kernel 5.10
3) GICv4/v4.1 is enabled
The Guest hangs in the BIOS phase when it is restarted, and
reports "Synchronous Exception at 0x280004654FF40".

Here's the analysis:
The virtio_scsi device has six queues. The virtio driver may request
one vector for each queue of the device, or it may request a single
vector that all queues share.
In the problem scenario:
1. The host driver (vDPA or VFIO) requests six vectors (LPIs) for the device.
2. The virtio driver requests only one vector (vLPI) in the guest.
3. Only one vgic_irq is allocated by the vgic driver. In the current vgic
 implementation, when MAPTI/MAPI is executed in the Guest, it is trapped
 to KVM. The vgic driver therefore allocates one vgic_irq per vector
 requested by the device drivers in the VM, which here is one.
4. kvm_vgic_v4_set_forwarding and its_map_vlpi are executed six times.
 vgic_irq->host_irq ends up equal to the last Linux interrupt ID (virq).
 The result is that six LPIs are mapped to one vLPI. All six LPIs of the
 device can send interrupts, and these interrupts are injected into the
 guest through the same vLPI.
5. When the Guest is restarted, kvm_vgic_v4_unset_forwarding is also
 executed six times, and multiple call traces are generated. Since
 there is only one vgic_irq, its_unmap_vlpi is executed only once.

	WARN_ON(!(irq->hw && irq->host_irq == virq));
	if (irq->hw) {
		atomic_dec(&irq->target_vcpu->arch.vgic_cpu.vgic_v3.its_vpe.vlpi_count);
		irq->hw = false;
		ret = its_unmap_vlpi(virq);
	}
6. In the BIOS phase after the Guest has restarted, the other five vectors
 continue to send interrupts. The BIOS cannot handle these interrupts, so
 the Guest hangs.
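
To make the sequence in steps 4 to 6 easier to follow, here is a minimal user-space sketch of it. It is only a toy model under stated assumptions, not kernel code: model_vgic_irq, lpi_mapped, model_set_forwarding, model_unset_forwarding and simulate_restart are hypothetical names, the host virqs are simply numbered 0..nvec-1, and the ITS is reduced to an array that records which host LPIs still have a vLPI mapping.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define MAX_LPIS 16

/* Hypothetical stand-in for the two vgic_irq fields that matter here. */
struct model_vgic_irq {
	bool hw;
	unsigned int host_irq;
};

/* Hypothetical stand-in for ITS state: host LPIs that have a vLPI mapping. */
static bool lpi_mapped[MAX_LPIS];

/* Models the set_forwarding path: every call maps one more host LPI. */
static void model_set_forwarding(struct model_vgic_irq *irq, unsigned int virq)
{
	lpi_mapped[virq] = true;	/* stands for its_map_vlpi() */
	irq->hw = true;
	irq->host_irq = virq;		/* overwritten on every call */
}

/* Models the unset_forwarding snippet quoted above; returns 1 if it would WARN. */
static int model_unset_forwarding(struct model_vgic_irq *irq, unsigned int virq)
{
	int warned = !(irq->hw && irq->host_irq == virq);

	if (irq->hw) {
		irq->hw = false;
		lpi_mapped[virq] = false;	/* stands for its_unmap_vlpi() */
	}
	return warned;
}

/* Host requests nvec vectors, guest uses one vgic_irq; returns LPIs left mapped. */
int simulate_restart(unsigned int nvec)
{
	struct model_vgic_irq irq = { false, 0 };
	unsigned int virq, still_mapped = 0;
	int warnings = 0;

	assert(nvec <= MAX_LPIS);
	for (virq = 0; virq < MAX_LPIS; virq++)
		lpi_mapped[virq] = false;

	for (virq = 0; virq < nvec; virq++)
		model_set_forwarding(&irq, virq);
	for (virq = 0; virq < nvec; virq++)
		warnings += model_unset_forwarding(&irq, virq);
	for (virq = 0; virq < nvec; virq++)
		still_mapped += lpi_mapped[virq];

	printf("vectors=%u warnings=%d still_mapped=%u\n",
	       nvec, warnings, still_mapped);
	return (int)still_mapped;
}
```

Calling simulate_restart(6) from a main() models the problem scenario: in this model only the first unset call unmaps anything, so five host LPIs stay mapped, which matches the five vectors that keep firing in step 6. simulate_restart(1) models the case where host and guest agree on one vector, and nothing is left mapped.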

This problem does not occur when the guest kernel is version 5.10, because
that kernel includes this patch:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c66d4bd110a1f

I think there are other scenarios where the virtual machine requests one
vector while the host requests multiple vectors, so there is still value
in fixing this problem at the hypervisor layer. I can think of two possible
modifications, but I am not sure whether they are feasible:
1) The vDPA or VFIO driver becomes aware of the behavior within the Guest
and only requests the same number of vectors.
2) Modify the vgic driver so that one vgic_irq can be bound to multiple LPIs.
But I understand that the semantics of vgic_irq->host_irq is that a vgic_irq
is bound 1:1 to the host-side LPI hwintid.

If you have other ideas, we can discuss them together.

Looking forward to your reply.
Thanks,
Kunkun Jiang












