Since commit abafbc551fdd ("vfio-pci: Invalidate mmaps and block MMIO
access on disabled memory"), VFIO rejects guest MMIO access when the
PCI_COMMAND_MEMORY (MSE) bit is off. This must not apply to VFs, which
do not implement the bit; that case was fixed in commit ebfa440ce38b
("vfio/pci: Fix SR-IOV VF handling with MMIO blocking"). Furthermore,
on s390, where we always run with at least a bare-metal hypervisor
(LPAR), PCI_COMMAND_MEMORY, unlike Device/Vendor IDs and BARs, is not
emulated when VFs are passed through to the OS independently.

Based upon Bjorn's most recent comment [1], I investigated the notion of
setting is_virtfn=1 for VFs that are passed through to Linux and not
linked to a parent PF (referred to as a 'detached VF' in my prior post).
However, we rapidly run into the question of how to treat an is_virtfn
device with no linked PF. The issue gets more complicated when the guest
kernel has a passed-through VF but is built with CONFIG_PCI_IOV=n: in
many locations the is_virtfn checks are then ifdef'd out altogether and
the device is assumed to be an independent PCI function.

VFIO decides whether to require or emulate a PCI feature (in this case
PCI_COMMAND_MEMORY) based upon the knowledge it has about the device,
including implicit expectations of what is or is not emulated below VFIO
(e.g. is it safe to read vendor/id from config space?). Our firmware
layer attempts similar behavior by emulating things such as
vendor/id/BAR access; without these, an unlinked VF would not be usable.
But what is or is not emulated by the layer below may differ depending
on which entity provides the emulation (vfio, LPAR, some other
hypervisor).

So, the proposal here aims to fix the immediate issue of s390
pass-through VFs suddenly becoming unusable by vfio. It uses a dev_flags
bit to identify a feature that we know is hardwired to 0 for any VF
(PCI_COMMAND_MEMORY), and decouples the need for emulating
PCI_COMMAND_MEMORY from the is_virtfn flag.

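As a rough illustration of the dev_flags model (not the actual patches),
the decoupled check could look something like the userspace sketch
below. The flag name and value, the struct, and the helper are all
hypothetical stand-ins; only the PCI_COMMAND_MEMORY bit value is the
real one.

```c
#include <stdbool.h>
#include <stdint.h>

#define PCI_COMMAND_MEMORY 0x2	/* Memory Space Enable bit in PCI_COMMAND */

/* Hypothetical flag name and value, for illustration only. */
#define DEMO_DEV_FLAGS_NO_COMMAND_MEMORY (1u << 0)

/* Hypothetical, stripped-down stand-in for struct pci_dev. */
struct demo_pci_dev {
	unsigned int dev_flags;	/* per-device quirk/feature flags */
	bool is_virtfn;		/* deliberately not consulted below */
	uint16_t command;	/* value read from PCI_COMMAND */
};

/*
 * Decide whether MMIO access may proceed. A device flagged as not
 * implementing MSE is treated as always memory-enabled, regardless
 * of is_virtfn; otherwise the MSE bit in the command register decides.
 */
bool demo_memory_enabled(const struct demo_pci_dev *dev)
{
	if (dev->dev_flags & DEMO_DEV_FLAGS_NO_COMMAND_MEMORY)
		return true;

	return (dev->command & PCI_COMMAND_MEMORY) != 0;
}
```

With this shape, a detached VF (is_virtfn=0, flag set, MSE reads as 0)
passes the check, while an ordinary function still needs MSE set.
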
The exact scope of is_virtfn and physfn for bare-metal vs guest
scenarios, and identifying what features are / are not emulated by the
lower-level hypervisors, is a much bigger discussion independent of this
limited proposal.

Changes from v3:
- Propose a dev_flags model for the MSE bit
- Set the bit for typical iov linking
- Also set the bit for s390 VFs (linked and unlinked)
- Modify vfio-pci to look at the dev_flags bit instead of is_virtfn

[1]: https://marc.info/?l=linux-pci&m=159856041930022&w=2

Matthew Rosato (3):
  PCI/IOV: Mark VFs as not implementing MSE bit
  s390/pci: Mark all VFs as not implementing MSE bit
  vfio/pci: Decouple MSE bit checks from is_virtfn

 arch/s390/pci/pci_bus.c            |  5 +++--
 drivers/pci/iov.c                  |  1 +
 drivers/vfio/pci/vfio_pci_config.c | 20 +++++++++++++-------
 include/linux/pci.h                |  2 ++
 4 files changed, 19 insertions(+), 9 deletions(-)

-- 
1.8.3.1