On Wed, Nov 21, 2018 at 12:17:45PM +0000, Julien Thierry wrote: > > > On 21/11/18 11:06, Christoffer Dall wrote: > >Hi, > > > >On Wed, Nov 21, 2018 at 04:56:54PM +0800, peng.hao2@xxxxxxxxxx wrote: > >>>On 19/11/2018 09:10, Mark Rutland wrote: > >>>>On Sat, Nov 17, 2018 at 10:58:37AM +0800, peng.hao2@xxxxxxxxxx wrote: > >>>>>>On 16/11/18 00:23, peng.hao2@xxxxxxxxxx wrote: > >>>>>>>>Hi, > >>>>>>>>>When virtual machine starts, hang up. > >>>>>>>> > >>>>>>>>I take it you mean the *guest* hangs? Because it doesn't get a timer > >>>>>>>>interrupt? > >>>>>>>> > >>>>>>>>>The kernel version of guest > >>>>>>>>>is 4.16. Host support vgic_v3. > >>>>>>>> > >>>>>>>>Your host kernel is something recent, I guess? > >>>>>>>> > >>>>>>>>>It was mainly due to the incorrect vgic_irq's(intid=27) group value > >>>>>>>>>during injection interruption. when kvm_vgic_vcpu_init is called, > >>>>>>>>>dist is not initialized at this time. Unable to get vgic V3 or V2 > >>>>>>>>>correctly, so group is not set. > >>>>>>>> > >>>>>>>>Mmh, that shouldn't happen with (v)GICv3. Do you use QEMU (which > >>>>>>>>version?) or some other userland tool? > >>>>>>>> > >>>>>>> > >>>>>>>QEMU emulator version 3.0.50 . > >>>>>>> > >>>>>>>>>group is setted to 1 when vgic_mmio_write_group is invoked at some > >>>>>>>>>time. > >>>>>>>>>when irq->group=0 (intid=27), No ICH_LR_GROUP flag was set and > >>>>>>>>>interrupt injection failed. > >>>>>>>>> > >>>>>>>>>Signed-off-by: Peng Hao <peng.hao2@xxxxxxxxxx> > >>>>>>>>>--- > >>>>>>>>> virt/kvm/arm/vgic/vgic-v3.c | 2 +- > >>>>>>>>> 1 file changed, 1 insertion(+), 1 deletion(-) > >>>>>>>>> > >>>>>>>>>diff --git a/virt/kvm/arm/vgic/vgic-v3.c b/virt/kvm/arm/vgic/vgic-v3.c > >>>>>>>>>index 9c0dd23..d101000 100644 > >>>>>>>>>--- a/virt/kvm/arm/vgic/vgic-v3.c > >>>>>>>>>+++ b/virt/kvm/arm/vgic/vgic-v3.c > >>>>>>>>>@@ -198,7 +198,7 @@ void vgic_v3_populate_lr(struct kvm_vcpu *vcpu, > >>>>>>>>>struct vgic_irq *irq, int lr) if (vgic_irq_is_mapped_level(irq) && > >>>>>>>>>(val & ICH_LR_PENDING_BIT)) irq->line_level = false; > >>>>>>>>> > >>>>>>>>>- if (irq->group) > >>>>>>>>>+ if (model == KVM_DEV_TYPE_ARM_VGIC_V3) > >>>>>>>> > >>>>>>>>This is not the right fix, not only because it basically reverts the > >>>>>>>>GICv3 part of 87322099052 (KVM: arm/arm64: vgic: Signal IRQs using > >>>>>>>>their configured group). > >>>>>>>> > >>>>>>>>Can you try to work out why kvm_vgic_vcpu_init() is apparently called > >>>>>>>>before dist->vgic_model is set, also what value it has? > >>>>>>>>If I understand the code correctly, that shouldn't happen for a GICv3. > >>>>>>>> > >>>>>>>Even if the value of group is correctly assigned in kvm_vgic_vcpu_init, the group is then written 0 through vgic_mmio_write_group. > >>>>>>> If the interrupt comes at this time, the interrupt injection fails. > >>>>>> > >>>>>>Does that mean that the guest is configuring its interrupts as Group0? > >>>>>>That sounds wrong, Linux should configure all it's interrupts as > >>>>>>non-secure group1. > >>>>> > >>>>>no, I think that uefi dose this, not linux. > >>>>>1. kvm_vgic_vcpu_init > >>>>>2. vgic_create > >>>>>3. kvm_vgic_dist_init > >>>>>4.vgic_mmio_write_group: uefi as guest, write group=0 > >>>>>5.vgic_mmio_write_group: linux as guest, write group=1 > >>>> > >>>>Is this the same issue fixed by EDK2 commit: > >>>> > >>>>66127011a544b90e ("ArmPkg/ArmGicDxe ARM: fix encoding for GICv3 interrupt acknowledge") > >>>> > >>>>... where EDK2 would try to use IAR0 rather than IAR1? > >>>> > >>>>The commit messages notes this lead to a boot-time hang. > >>> > >>>I managed to trigger an issue with a really old EFI implementation that > >>>doesn't configure its interrupts as Group1, and yet tries to ACK its > >>>interrupts using the Group1 accessor. Guess what? It is not going to work. > >>> > >>>Commit c7fefb690661f2e38afcb8200bd318ecf38ab961 in the edk2 tree seems > >>>to be the fix (I only assume it does, I haven't actually checked). A > >>>recent build, as found in Debian Buster, works perfectly (tested with > >>>both QEMU v2.12 and tip of tree). > >>> > >>>Now, I really don't get what you're saying about Linux not getting > >>>interrupts. How do you get to booting Linux if EFI is not making any > >>>forward progress? Are you trying them independently? > >>> > >>I start linux with bypassing uefi, the print info is the same. > >>[507107.748908] vgic_mmio_write_group:## intid/27 group=0 > >>[507107.752185] vgic_mmio_write_group:## intid/27 group=0 > >>[507107.899566] vgic_mmio_write_group:## intid/27 group=1 > >>[507107.907370] vgic_mmio_write_group:## intid/27 group=1 > >>the command line is like this: > >>/home/qemu-patch/linshi/qemu/aarch64-softmmu/qemu-system-aarch64 -machine virt-3.1,accel=kvm,usb=off,dump-guest-core=off,gic-version=3 -kernel /home/kernelboot/vmlinuz-4.16.0+ -initrd /home/kernelboot/initramfs-4.16.0+.img -append root=/dev/mapper/cla-root ro crashkernel=auto rd.lvm.lv=cla/root rd.lvm.lv=cla/swap.UTF-8 -drive file=/home/centos74-ph/boot.qcow2,format=qcow2,if=none,id=drive-scsi0-0-0-0 -device scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0,bootindex=1 -vnc 0.0.0.0:0 -k en-us -device virtio-gpu-pci,id=video0,max_outputs=1,bus=pci.3,addr=0x0 -device pvpanic-mmio -msg timestamp=on > >> > >>This is strange. That's probably the Linux 4.16 kernel setting group to 0, and I'll continue to track in guest. > > > >Could you try the following patch: > > > >diff --git a/virt/kvm/arm/vgic/vgic-init.c b/virt/kvm/arm/vgic/vgic-init.c > >index c0c0b88af1d5..6fa858c7a5a6 100644 > >--- a/virt/kvm/arm/vgic/vgic-init.c > >+++ b/virt/kvm/arm/vgic/vgic-init.c > >@@ -231,13 +231,6 @@ int kvm_vgic_vcpu_init(struct kvm_vcpu *vcpu) > > irq->config = VGIC_CONFIG_LEVEL; > > } > >- /* > >- * GICv3 can only be created via the KVM_DEVICE_CREATE API and > >- * so we always know the emulation type at this point as it's > >- * either explicitly configured as GICv3, or explicitly > >- * configured as GICv2, or not configured yet which also > >- * implies GICv2. > >- */ > > if (dist->vgic_model == KVM_DEV_TYPE_ARM_VGIC_V3) > > irq->group = 1; > > else > >@@ -298,6 +291,16 @@ int vgic_init(struct kvm *kvm) > > if (ret) > > goto out; > >+ /* Initialize groups on CPUs created before the VGIC type was known */ > >+ kvm_for_each_vcpu(i, vcpu, kvm) { > >+ struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu; > >+ > >+ for (i = 0; i < VGIC_NR_PRIVATE_IRQS; i++) { > >+ struct vgic_irq *irq = &vgic_cpu->private_irqs[i]; > >+ irq->group = 1; > >+ } > >+ } > >+ > > if (vgic_has_its(kvm)) { > > ret = vgic_v4_init(kvm); > > if (ret) > > > > > > If the value of dist->vgic_model is not properly initialized at the time we > call kvm_vgic_vcpu_init is call, won't we still overwrite the irq->group > when we get there? I don't understand this question. When we get where? > (I still haven't seen why we could consider > dist->vgic_model is initialized at that point). Because there is no enforced ordering between creating VCPUs and creating the VGIC. > > Shouldn't we do the irq->group initialization somewhere in > kvm_arch_vcpu_ioctl_vcpu_init? (with some vgic_* function to encapsulate it > of course). At that point I believe we know that dist->vgic_model is > initialized. > See above. AFAICT we don't have a single point at which we can do everything because creation of the two components can be interleaved. We could hook into kvm_vcpu_first_run_init(), but then userspace can observe uninitialized values if it looks at the GIC state prior to running the VCPUs, which is also not great. In other words, I think the problem is that you can do: create_vcpu(0); create_vgic(v3); create_vcpu(2); Now you'll have vcpu0 have its private IRQs set to group 0, and you'll have vcpu1 have its private IRQs set to group 1 (prior to my patch). Am I missing something? Thanks, Christoffer _______________________________________________ kvmarm mailing list kvmarm@xxxxxxxxxxxxxxxxxxxxx https://lists.cs.columbia.edu/mailman/listinfo/kvmarm