On Mon, Jun 15, 2015 at 11:45:38AM +0100, Andre Przywara wrote:
> On 06/05/2015 05:41 PM, Will Deacon wrote:
> > On Thu, Jun 04, 2015 at 04:20:45PM +0100, Andre Przywara wrote:
> >> In PCI config space there is an interrupt line field (offset 0x3c),
> >> which is used to initially communicate the IRQ line number from
> >> firmware to the OS. _Hardware_ should never use this information,
> >> as the OS is free to write any information in there.
> >> But kvmtool uses this number when it triggers IRQs in the guest,
> >> which fails starting with Linux 3.19-rc1, where the PCI layer starts
> >> writing the virtual IRQ number in there.
> >>
> >> Fix that by storing the IRQ number in a separate field in
> >> struct virtio_pci, which is independent from the PCI config space
> >> and cannot be influenced by the guest.
> >> This fixes ARM/ARM64 guests using PCI with newer kernels.
> >>
> >> Signed-off-by: Andre Przywara <andre.przywara@xxxxxxx>
> >> ---
> >>  include/kvm/virtio-pci.h | 8 ++++++++
> >>  virtio/pci.c             | 9 ++++++---
> >>  2 files changed, 14 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/include/kvm/virtio-pci.h b/include/kvm/virtio-pci.h
> >> index c795ce7..b70cadd 100644
> >> --- a/include/kvm/virtio-pci.h
> >> +++ b/include/kvm/virtio-pci.h
> >> @@ -30,6 +30,14 @@ struct virtio_pci {
> >>  	u8 isr;
> >>  	u32 features;
> >>
> >> +	/*
> >> +	 * We cannot rely on the INTERRUPT_LINE byte in the config space once
> >> +	 * we have run guest code, as the OS is allowed to use that field
> >> +	 * as a scratch pad to communicate between driver and PCI layer.
> >> +	 * So store our legacy interrupt line number in here for internal use.
> >> +	 */
> >> +	u8 legacy_irq_line;
> >> +
> >>  	/* MSI-X */
> >>  	u16 config_vector;
> >>  	u32 config_gsi;
> >> diff --git a/virtio/pci.c b/virtio/pci.c
> >> index 7556239..e17e5a9 100644
> >> --- a/virtio/pci.c
> >> +++ b/virtio/pci.c
> >> @@ -141,7 +141,7 @@ static bool virtio_pci__io_in(struct ioport *ioport, struct kvm_cpu *vcpu, u16 p
> >>  		break;
> >>  	case VIRTIO_PCI_ISR:
> >>  		ioport__write8(data, vpci->isr);
> >> -		kvm__irq_line(kvm, vpci->pci_hdr.irq_line, VIRTIO_IRQ_LOW);
> >> +		kvm__irq_line(kvm, vpci->legacy_irq_line, VIRTIO_IRQ_LOW);
> >>  		vpci->isr = VIRTIO_IRQ_LOW;
> >>  		break;
> >>  	default:
> >> @@ -299,7 +299,7 @@ int virtio_pci__signal_vq(struct kvm *kvm, struct virtio_device *vdev, u32 vq)
> >>  		kvm__irq_trigger(kvm, vpci->gsis[vq]);
> >>  	} else {
> >>  		vpci->isr = VIRTIO_IRQ_HIGH;
> >> -		kvm__irq_trigger(kvm, vpci->pci_hdr.irq_line);
> >> +		kvm__irq_trigger(kvm, vpci->legacy_irq_line);
> >>  	}
> >>  	return 0;
> >>  }
> >> @@ -323,7 +323,7 @@ int virtio_pci__signal_config(struct kvm *kvm, struct virtio_device *vdev)
> >>  		kvm__irq_trigger(kvm, vpci->config_gsi);
> >>  	} else {
> >>  		vpci->isr = VIRTIO_PCI_ISR_CONFIG;
> >> -		kvm__irq_trigger(kvm, vpci->pci_hdr.irq_line);
> >> +		kvm__irq_trigger(kvm, vpci->legacy_irq_line);
> >>  	}
> >>
> >>  	return 0;
> >>  }
> >> @@ -422,6 +422,9 @@ int virtio_pci__init(struct kvm *kvm, void *dev, struct virtio_device *vdev,
> >>  	if (r < 0)
> >>  		goto free_msix_mmio;
> >>
> >> +	/* save the IRQ that device__register() has allocated */
> >> +	vpci->legacy_irq_line = vpci->pci_hdr.irq_line;
> >
> > I'd rather we used the container_of trick that we do for virtio-mmio
> > devices when assigning the irq in device__register. Then we can avoid
> > this line completely.
>
> Not completely sure I get what you mean, I take it you want to assign
> legacy_irq_line in pci__assign_irq() directly (where the IRQ number is
> allocated).
> But this function is PCI generic code and is used by the VESA
> framebuffer and the shmem device on x86 as well.
> For those devices
> dev_hdr is not part of a struct virtio_pci, so we can't do container_of
> to assign the legacy_irq_line here directly.
> Admittedly this fix should apply to the other two users as well, but
> VESA does not use interrupts and pci-shmem is completely broken anyway,
> so I didn't bother to fix it in this regard.
> Would it be justified to provide an IRQ number field in struct
> device_header to address all users?
>
> Or what am I missing here?

If VESA and shmem are broken, they should either be fixed or removed. If
you fix them, then we could have separate virtual buses for virtio-pci
and emulated-pci (or whatever you want to call it). We could also have a
separate bus for passthrough devices too.

However, that's quite a lot of work for a bug-fix, so I guess the
easiest thing is to extend your current hack to cover VESA and shmem
too.

Will