RE: [PATCH 4/4 RESEND] PCI: hv: Propagate coherence from VMbus device to PCI device

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



From: Robin Murphy <robin.murphy@xxxxxxx> Sent: Friday, March 18, 2022 3:58 AM
> 
> On 2022-03-18 05:12, Michael Kelley (LINUX) wrote:
> > From: Robin Murphy <robin.murphy@xxxxxxx> Sent: Thursday, March 17, 2022
> 10:15 AM
> >>
> >> On 2022-03-17 16:25, Michael Kelley via iommu wrote:
> >>> PCI pass-thru devices in a Hyper-V VM are represented as a VMBus
> >>> device and as a PCI device.  The coherence of the VMbus device is
> >>> set based on the VMbus node in ACPI, but the PCI device has no
> >>> ACPI node and defaults to not hardware coherent.  This results
> >>> in extra software coherence management overhead on ARM64 when
> >>> devices are hardware coherent.
> >>>
> >>> Fix this by propagating the coherence of the VMbus device to the
> >>> PCI device.  There's no effect on x86/x64 where devices are
> >>> always hardware coherent.
> >>>
> >>> Signed-off-by: Michael Kelley <mikelley@xxxxxxxxxxxxx>
> >>> ---
> >>>    drivers/pci/controller/pci-hyperv.c | 17 +++++++++++++----
> >>>    1 file changed, 13 insertions(+), 4 deletions(-)
> >>>
> >>> diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c
> >>> index ae0bc2f..14276f5 100644
> >>> --- a/drivers/pci/controller/pci-hyperv.c
> >>> +++ b/drivers/pci/controller/pci-hyperv.c
> >>> @@ -49,6 +49,7 @@
> >>>    #include <linux/refcount.h>
> >>>    #include <linux/irqdomain.h>
> >>>    #include <linux/acpi.h>
> >>> +#include <linux/dma-map-ops.h>
> >>>    #include <asm/mshyperv.h>
> >>>
> >>>    /*
> >>> @@ -2142,9 +2143,9 @@ static void hv_pci_remove_slots(struct hv_pcibus_device
> >> *hbus)
> >>>    }
> >>>
> >>>    /*
> >>> - * Set NUMA node for the devices on the bus
> >>> + * Set NUMA node and DMA coherence for the devices on the bus
> >>>     */
> >>> -static void hv_pci_assign_numa_node(struct hv_pcibus_device *hbus)
> >>> +static void hv_pci_assign_properties(struct hv_pcibus_device *hbus)
> >>>    {
> >>>    	struct pci_dev *dev;
> >>>    	struct pci_bus *bus = hbus->bridge->bus;
> >>> @@ -2167,6 +2168,14 @@ static void hv_pci_assign_numa_node(struct
> >> hv_pcibus_device *hbus)
> >>>    				     numa_map_to_online_node(
> >>>    					     hv_dev->desc.virtual_numa_node));
> >>>
> >>> +		/*
> >>> +		 * On ARM64, propagate the DMA coherence from the VMbus device
> >>> +		 * to the corresponding PCI device. On x86/x64, these calls
> >>> +		 * have no effect because DMA is always hardware coherent.
> >>> +		 */
> >>> +		dev_set_dma_coherent(&dev->dev,
> >>> +			dev_is_dma_coherent(&hbus->hdev->device));
> >>
> >> Eww... if you really have to do this, I'd prefer to see a proper
> >> hv_dma_configure() helper implemented and wired up to
> >> pci_dma_configure(). Although since it's a generic property I guess at
> >> worst pci_dma_configure could perhaps propagate coherency from the host
> >> bridge to its children by itself in the absence of any other firmware
> >> info. And it's built-in so could use arch_setup_dma_ops() like everyone
> >> else.
> >>
> >
> > I'm not seeing an existing mechanism to provide a "helper" or override
> > of pci_dma_configure().   Could you elaborate?  Or is this something
> > that needs to be created?
> 
> I mean something like the diff below (other #includes omitted for
> clarity). Essentially if VMBus has its own way of describing parts of
> the system, then for those parts it's nice if it could fit into the same
> abstractions we use for firmware-based system description.

OK, got it.  Adding the VMbus special case into pci_dma_configure()
would work.  But after investigating further, let me throw out two
other possible solutions as well.

1)  It's easy to make the top-level VMbus node in the DSDT be
the ACPI companion for the pci_host bridge.  The function
pcibios_root_bridge_prepare() will do this if the pci-hyperv.c
driver sets sysdata->parent to that top-level VMbus node.  Then
pci_dma_configure()works as is.  Also, commit 7d40c0f70 that
special cases pcibios_root_bridge_prepare() for Hyper-V becomes
unnecessary.   I've coded this approach and it seems to work, but
I don't know all the implications of making that VMbus node be
the ACPI companion, potentially for multiple pci_host bridges.

2)  Since there's no vIOMMU and we don't know how it will be
described if there should be one in the future, it's a bit premature
to try to set things up "correctly" now to handle it.  A simple
approach is to set  dma_default_coherent to true if the top-level
VMbus node in the DSDT indicates coherent.  No other changes are
needed.  If there's a vIOMMU at some later time, then we add the
use of arch_setup_dma_ops() for each device.

Any thoughts on these two approaches vs. adding the VMbus
special case into pci_dma_configure()?  My preference would be
to avoid adding that special case if one of the other two approaches
is reasonable.

Michael

> 
> Cheers,
> Robin.
> 
> ----->8-----
> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> index 588588cfda48..7d92ccad1569 100644
> --- a/drivers/pci/pci-driver.c
> +++ b/drivers/pci/pci-driver.c
> @@ -20,6 +20,7 @@
>   #include <linux/of_device.h>
>   #include <linux/acpi.h>
>   #include <linux/dma-map-ops.h>
> +#include <linux/hyperv.h>
>   #include "pci.h"
>   #include "pcie/portdrv.h"
> 
> @@ -1602,6 +1603,8 @@ static int pci_dma_configure(struct device *dev)
>   		struct acpi_device *adev = to_acpi_device_node(bridge->fwnode);
> 
>   		ret = acpi_dma_configure(dev, acpi_get_dma_attr(adev));
> +	} else if (is_vmbus_dev(bridge)) {
> +		ret = hv_dma_configure(dev, device_get_dma_attr(bridge));
>   	}
> 
>   	pci_put_host_bridge_device(bridge);
> diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h
> index f565a8938836..d1d4dd3d5a3a 100644
> --- a/include/linux/hyperv.h
> +++ b/include/linux/hyperv.h
> @@ -1764,4 +1764,19 @@ static inline unsigned long virt_to_hvpfn(void *addr)
>   #define HVPFN_DOWN(x)	((x) >> HV_HYP_PAGE_SHIFT)
>   #define page_to_hvpfn(page)	(page_to_pfn(page) *
> NR_HV_HYP_PAGES_IN_PAGE)
> 
> +static inline bool is_vmbus_dev(struct device *dev)
> +{
> +	/*
> +	 * dev->bus == &hv_bus would break when the caller is built-in
> +	 * and CONFIG_HYPERV=m, so look for it by name instead.
> +	 */
> +	return !strcmp(dev->bus->name, "vmbus");
> +}
> +
> +static inline int hv_dma_configure(struct device *dev, enum
> dev_dma_attr attr)
> +{
> +	arch_setup_dma_ops(dev, 0, 0, NULL, attr == DEV_DMA_COHERENT);
> +	return 0;
> +}
> +
>   #endif /* _HYPERV_H */




[Index of Archives]     [Linux IBM ACPI]     [Linux Power Management]     [Linux Kernel]     [Linux Laptop]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Video 4 Linux]     [Device Mapper]     [Linux Resources]

  Powered by Linux