On 1/31/22 10:10 AM, Pierre Morel wrote:
On 1/14/22 21:38, Matthew Rosato wrote:Use the associated vfio feature ioctl to enable adapter event notificationand forwarding for devices when requested. This feature will be set up with or without firmware assist based upon the 'intassist' setting.It is a personal opinion but I do not like the term 'intassist'. Why not using 'gisa' ?
Well, I don't think 'gisa' is quite accurate, we're not toggling the delivery mechanism here, we're toggling who is responsible for doing the delivery (forwarding). The default (intassist=on) says firmware will do it if it can and if not, KVM will be tasked with delivery. intassist=off means KVM will always be told to do the delivery.
I would find it more relevant.If something for the profane admin then interrupt_assist or irq_assist or even int_assist.Or is it forwarding_assist ?
Yeah, perahps 'forwarding_assist' is the right phrase, as we are disabling firmware's ability to assist in the forwarding of guest interrupts.
Signed-off-by: Matthew Rosato <mjrosato@xxxxxxxxxxxxx> --- hw/s390x/s390-pci-bus.c | 24 ++++++++-- hw/s390x/s390-pci-inst.c | 54 +++++++++++++++++++++- hw/s390x/s390-pci-vfio.c | 79 ++++++++++++++++++++++++++++++++ include/hw/s390x/s390-pci-bus.h | 1 + include/hw/s390x/s390-pci-vfio.h | 20 ++++++++ 5 files changed, 173 insertions(+), 5 deletions(-) diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c index 66649af6e0..6ee70446ca 100644 --- a/hw/s390x/s390-pci-bus.c +++ b/hw/s390x/s390-pci-bus.c @@ -189,7 +189,10 @@ void s390_pci_sclp_deconfigure(SCCB *sccb) rc = SCLP_RC_NO_ACTION_REQUIRED; break; default: - if (pbdev->summary_ind) { + if (pbdev->interp) { + /* Interpreted devices were using interrupt forwarding */ + s390_pci_set_aif(pbdev, NULL, false, pbdev->intassist); + } else if (pbdev->summary_ind) { pci_dereg_irqs(pbdev); } if (pbdev->iommu->enabled) {@@ -981,6 +984,11 @@ static int s390_pci_interp_plug(S390pciState *s, S390PCIBusDevice *pbdev)return rc; } + rc = s390_pci_probe_aif(pbdev); + if (rc) { + return rc; + } + rc = s390_pci_update_passthrough_fh(pbdev); if (rc) { return rc;@@ -1076,6 +1084,7 @@ static void s390_pcihost_plug(HotplugHandler *hotplug_dev, DeviceState *dev, if (pbdev->interp && !s390_has_feat(S390_FEAT_ZPCI_INTERP)) { DPRINTF("zPCI interpretation facilities missing.\n");pbdev->interp = false; + pbdev->intassist = false; } if (pbdev->interp) { rc = s390_pci_interp_plug(s, pbdev);@@ -1090,11 +1099,13 @@ static void s390_pcihost_plug(HotplugHandler *hotplug_dev, DeviceState *dev,if (!pbdev->interp) { /* Do vfio passthrough but intercept for I/O */ pbdev->fh |= FH_SHM_VFIO; + pbdev->intassist = false; } } else { pbdev->fh |= FH_SHM_EMUL; /* Always intercept emulated devices */ pbdev->interp = false; + pbdev->intassist = false; } if (s390_pci_msix_init(pbdev) && !pbdev->interp) { @@ -1244,7 +1255,10 @@ static void s390_pcihost_reset(DeviceState *dev) /* Process all pending unplug requests */ QTAILQ_FOREACH_SAFE(pbdev, &s->zpci_devs, link, next) { if (pbdev->unplug_requested) { - if (pbdev->summary_ind) { + if (pbdev->interp) {+ /* Interpreted devices were using interrupt forwarding */+ s390_pci_set_aif(pbdev, NULL, false, pbdev->intassist); + } else if (pbdev->summary_ind) { pci_dereg_irqs(pbdev); } if (pbdev->iommu->enabled) {@@ -1382,7 +1396,10 @@ static void s390_pci_device_reset(DeviceState *dev)break; } - if (pbdev->summary_ind) { + if (pbdev->interp) { + /* Interpreted devices were using interrupt forwarding */ + s390_pci_set_aif(pbdev, NULL, false, pbdev->intassist); + } else if (pbdev->summary_ind) { pci_dereg_irqs(pbdev); } if (pbdev->iommu->enabled) { @@ -1428,6 +1445,7 @@ static Property s390_pci_device_properties[] = { DEFINE_PROP_S390_PCI_FID("fid", S390PCIBusDevice, fid), DEFINE_PROP_STRING("target", S390PCIBusDevice, target), DEFINE_PROP_BOOL("interp", S390PCIBusDevice, interp, true), + DEFINE_PROP_BOOL("intassist", S390PCIBusDevice, intassist, true),We allow to disable IRQ forwarding only for test or debug purpose don't we? Then shouldn't we make it explicit ?
Not sure I follow.This setting makes forwarding assist (intassist=on) an implicit default which can be subsequently overruled by an explicit 'intassist=off'.
So, the idea is that by default if interpretation is available and enabled we will also use the firmware assist to do interrupt forwarding -- This is the preferred setup (performance) and nothing needs to be explicitly specified. Explicitly specifying 'intassist=on' does nothing additional.
Conversely, if you want to disable the firmware forwarding (for as you say a test/debug scenario) only then must you explicitly specify 'intassist=off'.
Or were you instead suggesting to make the bool a default of false that must be explicity set to true? In that case, it would probably need a different name (e.g. implicit default is host_forwarding=off, override with host_forwarding=on and only when host_forwarding=on will we force host delivery of interrupts)
FWIW, I chose a default of 'true' because I preferred the idea of having to specify something negative like 'intassist=off' when forcing host delivery as this implies you are turning a feature off (because you are).
DEFINE_PROP_END_OF_LIST(), }; diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c index e9a0dc12e4..121e07cc41 100644 --- a/hw/s390x/s390-pci-inst.c +++ b/hw/s390x/s390-pci-inst.c @@ -1111,6 +1111,46 @@ static void fmb_update(void *opaque) timer_mod(pbdev->fmb_timer, t + pbdev->pci_group->zpci_group.mui); } +static int mpcifc_reg_int_interp(S390PCIBusDevice *pbdev, ZpciFib *fib) +{ + int rc; + + /* Interpreted devices must also use interrupt forwarding */ + rc = s390_pci_get_aif(pbdev, false, pbdev->intassist); + if (rc) { + DPRINTF("Bad interrupt forwarding state\n"); + return rc; + } + + rc = s390_pci_set_aif(pbdev, fib, true, pbdev->intassist); + if (rc) { + DPRINTF("Failed to enable interrupt forwarding\n"); + return rc; + } + + return 0; +} ++static int mpcifc_dereg_int_interp(S390PCIBusDevice *pbdev, ZpciFib *fib)+{ + int rc; + + /* Interpreted devices were using interrupt forwarding */ + rc = s390_pci_get_aif(pbdev, true, pbdev->intassist); + if (rc) { + DPRINTF("Bad interrupt forwarding state\n"); + return rc; + } + + rc = s390_pci_set_aif(pbdev, fib, false, pbdev->intassist); + if (rc) { + DPRINTF("Failed to disable interrupt forwarding\n"); + return rc; + } + + return 0; +} +int mpcifc_service_call(S390CPU *cpu, uint8_t r1, uint64_t fiba, uint8_t ar,uintptr_t ra) {@@ -1165,7 +1205,12 @@ int mpcifc_service_call(S390CPU *cpu, uint8_t r1, uint64_t fiba, uint8_t ar,switch (oc) { case ZPCI_MOD_FC_REG_INT: - if (pbdev->summary_ind) { + if (pbdev->interp) { + if (mpcifc_reg_int_interp(pbdev, &fib)) { + cc = ZPCI_PCI_LS_ERR; + s390_set_status_code(env, r1, ZPCI_MOD_ST_SEQUENCE); + } + } else if (pbdev->summary_ind) { cc = ZPCI_PCI_LS_ERR; s390_set_status_code(env, r1, ZPCI_MOD_ST_SEQUENCE); } else if (reg_irqs(env, pbdev, fib)) {@@ -1174,7 +1219,12 @@ int mpcifc_service_call(S390CPU *cpu, uint8_t r1, uint64_t fiba, uint8_t ar,} break; case ZPCI_MOD_FC_DEREG_INT: - if (!pbdev->summary_ind) { + if (pbdev->interp) { + if (mpcifc_dereg_int_interp(pbdev, &fib)) { + cc = ZPCI_PCI_LS_ERR; + s390_set_status_code(env, r1, ZPCI_MOD_ST_SEQUENCE); + } + } else if (!pbdev->summary_ind) { cc = ZPCI_PCI_LS_ERR; s390_set_status_code(env, r1, ZPCI_MOD_ST_SEQUENCE); } else { diff --git a/hw/s390x/s390-pci-vfio.c b/hw/s390x/s390-pci-vfio.c index 2cab3a9e89..73f3b3ed19 100644 --- a/hw/s390x/s390-pci-vfio.c +++ b/hw/s390x/s390-pci-vfio.c@@ -149,6 +149,85 @@ int s390_pci_update_passthrough_fh(S390PCIBusDevice *pbdev)return 0; } +int s390_pci_probe_aif(S390PCIBusDevice *pbdev) +{ + VFIOPCIDevice *vdev = VFIO_PCI(pbdev->pdev); + struct vfio_device_feature feat = { + .argsz = sizeof(struct vfio_device_feature),+ .flags = VFIO_DEVICE_FEATURE_PROBE + VFIO_DEVICE_FEATURE_ZPCI_AIF+ }; + + return ioctl(vdev->vbasedev.fd, VFIO_DEVICE_FEATURE, &feat); +} + +int s390_pci_set_aif(S390PCIBusDevice *pbdev, ZpciFib *fib, bool enable, + bool assist) +{ + VFIOPCIDevice *vdev = VFIO_PCI(pbdev->pdev); + struct vfio_device_zpci_aif *data; + int size = sizeof(struct vfio_device_feature) + sizeof(*data); + g_autofree struct vfio_device_feature *feat = g_malloc0(size); + + feat->argsz = size;+ feat->flags = VFIO_DEVICE_FEATURE_SET + VFIO_DEVICE_FEATURE_ZPCI_AIF;+ + data = (struct vfio_device_zpci_aif *)&feat->data; + if (enable) { + data->flags = VFIO_DEVICE_ZPCI_FLAG_AIF_FLOAT; + if (!pbdev->intassist) { + data->flags |= VFIO_DEVICE_ZPCI_FLAG_AIF_HOST; + } + /* Fill in the guest fib info */ + data->ibv = fib->aibv; + data->sb = fib->aisb; + data->noi = FIB_DATA_NOI(fib->data); + data->isc = FIB_DATA_ISC(fib->data); + data->sbo = FIB_DATA_AISBO(fib->data); + } else { + data->flags = 0; + } + + return ioctl(vdev->vbasedev.fd, VFIO_DEVICE_FEATURE, feat); +} + +int s390_pci_get_aif(S390PCIBusDevice *pbdev, bool enable, bool assist) +{ + VFIOPCIDevice *vdev = VFIO_PCI(pbdev->pdev); + struct vfio_device_zpci_aif *data; + int size = sizeof(struct vfio_device_feature) + sizeof(*data); + g_autofree struct vfio_device_feature *feat = g_malloc0(size); + int rc; + + feat->argsz = size;+ feat->flags = VFIO_DEVICE_FEATURE_GET + VFIO_DEVICE_FEATURE_ZPCI_AIF;+ + rc = ioctl(vdev->vbasedev.fd, VFIO_DEVICE_FEATURE, feat); + if (rc) { + return rc; + } + + /* Determine if current interrupt settings match the host */ + data = (struct vfio_device_zpci_aif *)&feat->data; + if (enable && (!(data->flags & VFIO_DEVICE_ZPCI_FLAG_AIF_FLOAT))) { + rc = -EINVAL;+ } else if (!enable && (data->flags & VFIO_DEVICE_ZPCI_FLAG_AIF_FLOAT)) {+ rc = -EINVAL; + } + + /*+ * When enabled for interrupts, the assist and forced host-delivery are+ * mututally exclusive + */assist is unclear for me int_assist , forwarding_assist ?
Yes, my wording here is confusing. Here, 'enable' is meant to represent whether forwarding (AIF) is currently enabled. And then 'assist' is the intassist (forwarding_assist) setting. And my use of 'mutually exclusive' is just plain confusing and should be removed.
I think that otherwise the code is logically correct -- I'll re-visit this and change the variable names and/or improve the comments. To be clear, what we are trying to verify here is:
1) if qemu believes AIF is currently disabled, then kernel should also show the same (AIF_FLOAT is off)
2) if qemu believes AIF is already enabled, kernel should also show the same (AIF_FLOAT is on). Additionally, if qemu believes host delivery is forced (instassist=off) then kernel should also show this (AIF_HOST is on). This is where my really confusing use of 'mutally exclusive' comes in to play -- the AIF_HOST kernel state is an inversion of the QEMU state, so the values should never match -- if intassist=off then AIF_HOST should be on. If intassist=on then AIF_HOST should be off. But really the comment should just read that we are ensuring that kernel space indicates what we expect for the current state.