> From: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> Sent: Friday, December 10, 2021 8:13 PM
>
> >> 5) It's not possible for the kernel to reliably detect whether it is
> >>    running on bare metal or not. Yes we talked about heuristics, but
> >>    that's something I really want to avoid.
> >
> > How would the hypercall mechanism avoid such heuristics?
>
> The availability of IR remapping where the irqdomain which is provided
> by the remapping unit signals that it supports this new scheme:
>
>                     |--IO/APIC
>                     |--MSI
>      vector -- IR --|--MSI-X
>                     |--IMS
>
> while the current scheme is:
>
>                     |--IO/APIC
>      vector -- IR --|--PCI/MSI[-X]
>
> or
>
>                     |--IO/APIC
>      vector --------|--PCI/MSI[-X]
>
> So in the new scheme the IR domain will advertise new features which are
> not available on older kernels. The availability of these new features
> is the indicator for the interrupt subsystem and subsequently for PCI
> whether IMS is supported or not.
>
> Bootup either finds an IR unit or not. In the bare metal case that's the
> usual hardware/firmware detection. In the guest case it's the
> availability of vIR including the required hypercall protocol.

Given we have vIR already, there are three scenarios:

1) Bare metal: IR (no hypercall, for sure)
2) VM: vIR (no hypercall, today)
3) VM: vIR (hypercall, tomorrow)

IMS should be allowed only for 1) and 3).

But how to differentiate 2) from 1) if no guest heuristics?

btw I checked Qemu history to find vIR was introduced in 2016:

commit 1121e0afdcfa0cd40e36bd3acff56a3fac4f70fd
Author: Peter Xu <peterx@xxxxxxxxxx>
Date:   Thu Jul 14 13:56:13 2016 +0800

    x86-iommu: introduce "intremap" property

    Adding one property for intel-iommu devices to specify whether we should
    support interrupt remapping. By default, IR is disabled. To enable it,
    we should use (take Intel IOMMU as example):

      -device intel_iommu,intremap=on

    This property can be shared by Intel and future AMD IOMMUs.

    Signed-off-by: Peter Xu <peterx@xxxxxxxxxx>
    Reviewed-by: Michael S. Tsirkin <mst@xxxxxxxxxx>
    Signed-off-by: Michael S. Tsirkin <mst@xxxxxxxxxx>

>
> > Then Qemu needs to find out the GSI number for the vIRTE handle.
> > Again Qemu doesn't have such information since it doesn't know
> > which MSI[-X] entry points to this handle due to no trap.
> >
> > This implies that we may also need carry device ID, #msi entry, etc.
> > in the hypercall, so Qemu can associate the virtual routing info
> > to the right [irqfd, gsi].
> >
> > In your model the hypercall is raised by IR domain. Do you see
> > any problem of finding those information within IR domain?
>
> IR has the following information available:
>
>    Interrupt type
>     - MSI:   Device, index and number of vectors
>     - MSI-X: Device, index
>     - IMS:   Device, index
>
>    Target APIC/vector pair
>
> IMS: The index depends on the storage type:
>
>      For storage in device memory, e.g. IDXD, it's the array index.
>
>      For storage in system memory, the index is a software artifact.
>
> Does that answer your question?
>

Yes.

Thanks
Kevin
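
The three scenarios above can be sketched as a small model, to make the
question concrete. This is only an illustration, not kernel code: every
name in it (struct ir_observation, enum scenario, observe(), ims_wanted(),
same_observation()) is hypothetical, and it assumes that the only inputs
available without heuristics are "an IR/vIR unit was found" and "the
hypercall protocol was found".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical: what boot-time detection can see without heuristics. */
struct ir_observation {
	bool ir_present;        /* an IR or vIR unit was found */
	bool hypercall_present; /* the hypercall protocol was found */
};

/* Hypothetical labels for the three scenarios listed in the mail. */
enum scenario {
	BARE_METAL_IR,       /* 1) IR, no hypercall */
	VM_VIR_NO_HYPERCALL, /* 2) vIR, no hypercall (today) */
	VM_VIR_HYPERCALL,    /* 3) vIR, hypercall (tomorrow) */
};

/* What each scenario looks like from inside the kernel (assumed model). */
struct ir_observation observe(enum scenario s)
{
	struct ir_observation o = {
		.ir_present = true,
		.hypercall_present = (s == VM_VIR_HYPERCALL),
	};
	return o;
}

/* The desired policy: IMS only for scenarios 1) and 3). */
bool ims_wanted(enum scenario s)
{
	return s != VM_VIR_NO_HYPERCALL;
}

/* True if two scenarios cannot be told apart from the observation alone. */
bool same_observation(enum scenario a, enum scenario b)
{
	struct ir_observation x = observe(a);
	struct ir_observation y = observe(b);

	return x.ir_present == y.ir_present &&
	       x.hypercall_present == y.hypercall_present;
}
```

In this model same_observation(BARE_METAL_IR, VM_VIR_NO_HYPERCALL) is
true while ims_wanted() differs for the two, which is the gap the question
about differentiating 2) from 1) points at. Scenario 3) is distinguishable
because the hypercall protocol is visible.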