Re: [PATCH v2 0/2] iommu/intel: Quirk non-compliant PCIe-to-PCI bridges

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, 2013-07-08 at 13:34 -0600, Bjorn Helgaas wrote:
> On Mon, Jul 08, 2013 at 11:07:20AM -0600, Alex Williamson wrote:
> > Joerg,
> > 
> > Where do we stand on this series?  You had a concern that the heuristic
> > used in patch 1/ could be dangerous.  The suggestion for detecting the
> > issue was actually from Bjorn who replied with his rationale.  Do you
> > want to go in the direction of a fixed whitelist or do you agree that
> > even if the heuristic breaks it provides better behavior than what we
> > have now?  Thanks,
> 
> I'm trying to take a step back and look at the overall design, not
> these specific patches.
> 
> IOMMUs translate addresses based on their source, i.e., a PCIe
> requester ID.  This is made more complicated by the fact that some
> bridges "take ownership" (change the requester ID as they forward
> transactions upstream), as well as the fact that conventional PCI has
> no requester ID at all.  And some broken devices apparently generate
> DMA requests using the requester ID of another device.
> 
> We currently deal with this using the pci_find_upstream_pcie_bridge()
> and pci_get_dma_source() interfaces, but I think there's too much
> assembly required by their users.  pci_find_upstream_pcie_bridge()
> callers normally loop through all the bridges between the "upstream
> PCIe bridge" and the device, checking for bridges that might take
> ownership.  They probably also ought to use pci_get_dma_source() to
> account for the broken devices, but most callers don't.
> 
> Most of this is PCI-specific stuff that should be of interest to all
> IOMMU drivers, and the overall structure of calls and looping should
> be the same for all of them, so it would be nice to factor it out
> somehow.
> 
> The attached patch is guaranteed not to even compile; it's just to
> make this idea more concrete.  The basic idea is that since the IOMMU
> driver wants to perform some action for each possible requester ID the
> IOMMU might see, PCI could provide an iterator
> ("pci_for_each_requester_id()") to do that.
> 
> Bjorn
> 
> 
> commit afad51492c6672b96c2b0735600d5695e30f7180
> Author: Bjorn Helgaas <bhelgaas@xxxxxxxxxx>
> Date:   Wed Jul 3 16:04:26 2013 -0600
> 
>     pci-add-for-each-requester-id
> 
> diff --git a/drivers/pci/search.c b/drivers/pci/search.c
> index d0627fa..380eb03 100644
> --- a/drivers/pci/search.c
> +++ b/drivers/pci/search.c
> @@ -17,6 +17,89 @@
>  DECLARE_RWSEM(pci_bus_sem);
>  EXPORT_SYMBOL_GPL(pci_bus_sem);
>  
> +#define PCI_REQUESTER_ID(dev)	(((dev)->bus->number << 8) | (dev)->devfn)
> +#define PCI_BRIDGE_REQUESTER_ID(bridge)	((bridge)->subordinate->number << 8)
> +
> +static inline bool pci_is_pcix(struct pci_dev *dev)
> +{
> +	return !!pci_pcix_cap(dev);	/* XXX not implemented */
> +}
> +
> +static bool pci_bridge_may_take_ownership(struct pci_dev *bridge)
> +{
> +	/*
> +	 * A PCIe to PCI/PCI-X bridge may take ownership per PCIe Bridge
> +	 * Spec v1.0, sec 2.3.
> +	 */
> +	if (pci_is_pcie(bridge) &&
> +	    pci_pcie_type(bridge) == PCI_EXP_TYPE_PCI_BRIDGE)
> +		return true;

Assume we still need a quirk here, PCIe-to-PCI bridges without a PCIe
capability are exactly what causes the current bug.

> +
> +	/*
> +	 * A PCI-X to PCI-X bridge need not take ownership because there
> +	 * are requester IDs on the secondary PCI-X bus.  However, if a PCI
> +	 * device is added on the secondary bus, the bridge must revert to
> +	 * being a PCI-X to PCI bridge, and it then *would* take ownership.
> +	 * Assuming a PCI-X to PCI-X bridge takes ownership means we can
> +	 * tolerate a future hot-add without having to change existing
> +	 * IOMMU mappings.
> +	 */
> +	if (pci_is_pcix(bridge))
> +		return true;
> +
> +	return false;
> +}
> +
> +static struct pci_dev *pci_bridge_to_dev(struct pci_bus *bus,
> +					 struct pci_dev *dev)
> +{
> +	struct pci_dev *bridge;
> +	struct pci_bus *child_bus;
> +	u8 secondary, subordinate, busn = dev->bus->number;
> +
> +	if (dev->bus == bus)
> +		return dev;
> +
> +	/*
> +	 * There may be several devices on "bus".  Find the one that is a
> +	 * bridge leading to "dev".
> +	 */
> +	list_for_each_entry(bridge, &bus->devices, bus_list) {
> +		child_bus = bridge->subordinate;
> +		if (child_bus) {
> +			secondary = child_bus->busn_res.start;
> +			subordinate = child_bus->busn_res.end;
> +			if (secondary <= busn && busn <= subordinate)
> +				return bridge;
> +		}
> +	}
> +	return NULL;
> +}
> +
> +int pci_for_each_requester_id(struct pci_dev *bridge, struct pci_dev *dev,
> +			      int (*fn)(struct pci_dev *, void *),


u16 requester_id

> +			      void *data)
> +{
> +	int ret;
> +
> +	dev = pci_get_dma_source(dev);	/* XXX ref count screwup */

It's unfortunate that our dma quirk still seems to only get called for
the end device where we could have bridges that need it too.

> +
> +	while (bridge != dev) {


I'm having a hard time understanding this function since bridge is never
defined in any of the below sample code.  Would bridge be the root port
above a device?  And what about devices connected to the RC?  Seems like
we'd need to traverse up the bus first, so there's a bit of missing
magic.

> +		if (pci_bridge_may_take_ownership(bridge)) {
> +			ret = fn(dev, PCI_BRIDGE_REQUESTER_ID(bridge), data);
> +			if (ret)
> +				return ret;
> +		}
> +		bridge = pci_bridge_to_dev(bridge->subordinate, dev);
> +	}
> +
> +	if (pci_is_pcie(dev) || pci_is_pcix(dev))
> +		return fn(dev, PCI_REQUESTER_ID(dev), data);
> +
> +	/* Conventional PCI has no requester ID */
> +	return 0;
> +}
> +
>  /*
>   * find the upstream PCIe-to-PCI bridge of a PCI device
>   * if the device is PCIE, return NULL
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 3a24e4f..cbcc82f 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -794,6 +794,8 @@ static inline struct pci_dev *pci_get_bus_and_slot(unsigned int bus,
>  }
>  struct pci_dev *pci_get_class(unsigned int class, struct pci_dev *from);
>  int pci_dev_present(const struct pci_device_id *ids);
> +void pci_for_each_requester_id(struct pci_dev *bridge, struct pci_dev *dev,
> +			       int (*fn)(struct pci_dev *, void *), void *data);
>  
>  int pci_bus_read_config_byte(struct pci_bus *bus, unsigned int devfn,
>  			     int where, u8 *val);
> 
> commit 6e770d7bf4ef95207df4ac4c42ef4406ff7bc54e
> Author: Bjorn Helgaas <bhelgaas@xxxxxxxxxx>
> Date:   Wed Jul 3 12:02:15 2013 -0600
> 
>     intel-iommu-upstream-pcie

I assume this one would is also going to apply dma quirks to dma_ops,
which I'm all for.

> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> index b91564d..bdb6e6a 100644
> --- a/drivers/iommu/intel-iommu.c
> +++ b/drivers/iommu/intel-iommu.c
> @@ -1669,79 +1669,68 @@ static int domain_context_mapping_one(struct dmar_domain *domain, int segment,
>  	return 0;
>  }
>  
> +struct domain_context_info {
> +	struct dmar_domain *domain;
> +	struct pci_dev *dev;
> +	int translation;
> +	struct intel_iommu *iomumu;

s/iomumu/iommu/

> +	bool mapped;
> +};
> +
> +static int context_mapping(struct pci_dev *dev, u16 requester_id, void *data)
> +{
> +	struct domain_context_info *dev_info = data;
> +	int segment = pci_domain_nr(data->dev);

dev_info->dev

> +	u8 bus = data->requester_id >> 8;

s/data->//

> +	u8 devfn = data->requester_id & 0xff;

s/data->//

> +
> +	return domain_context_mapping_one(data->domain, segment,
> +					  bus, devfn, data->translation);
> +}
> +
>  static int domain_context_mapping(struct dmar_domain *domain,
>  				  struct pci_dev *pdev, int translation)
>  {
> -	int ret;
> -	struct pci_dev *tmp, *parent;
> +	struct domain_context_info data;
> +	struct pci_dev *bridge;
>  
> -	ret = domain_context_mapping_one(domain, pci_domain_nr(pdev->bus),
> -					 pdev->bus->number, pdev->devfn,
> -					 translation);
> -	if (ret)
> -		return ret;
> +	data.domain = domain;
> +	data.dev = pdev;
> +	data.translation = translation;
> +	bridge = xx;

Darn, I was hoping you were going to reveal where bridge comes from.

> +	return pci_for_each_requester_id(bridge, pdev, context_mapping, &data);
> +}
>  
> -	/* dependent device mapping */
> -	tmp = pci_find_upstream_pcie_bridge(pdev);
> -	if (!tmp)
> -		return 0;
> -	/* Secondary interface's bus number and devfn 0 */
> -	parent = pdev->bus->self;
> -	while (parent != tmp) {
> -		ret = domain_context_mapping_one(domain,
> -						 pci_domain_nr(parent->bus),
> -						 parent->bus->number,
> -						 parent->devfn, translation);
> -		if (ret)
> -			return ret;
> -		parent = parent->bus->self;
> -	}
> -	if (pci_is_pcie(tmp)) /* this is a PCIe-to-PCI bridge */
> -		return domain_context_mapping_one(domain,
> -					pci_domain_nr(tmp->subordinate),
> -					tmp->subordinate->number, 0,
> -					translation);
> -	else /* this is a legacy PCI bridge */
> -		return domain_context_mapping_one(domain,
> -						  pci_domain_nr(tmp->bus),
> -						  tmp->bus->number,
> -						  tmp->devfn,
> -						  translation);
> +static int is_context_mapped(struct pci_dev *dev, u16 requester_id, void *data)
> +{
> +	struct domain_context_info *dev_info = data;
> +	u8 bus = data->requester_id >> 8;
> +	u8 devfn = data->requester_id & 0xff;
> +	bool ret;
> +
> +	ret = device_context_mapped(dev_info->iommu, bus, devfn);
> +	if (!ret)
> +		dev_info->mapped = false;
> +
> +	return 0;
>  }
>  
>  static bool domain_context_mapped(struct pci_dev *pdev)
>  {
> -	bool ret;
> -	struct pci_dev *tmp, *parent;
> +	struct domain_context_info data;
>  	struct intel_iommu *iommu;
> +	struct pci_dev *bridge;
>  
>  	iommu = device_to_iommu(pci_domain_nr(pdev->bus), pdev->bus->number,
>  				pdev->devfn);
>  	if (!iommu)
>  		return false;
>  
> -	ret = device_context_mapped(iommu, pdev->bus->number, pdev->devfn);
> -	if (!ret)
> -		return false;
> -	/* dependent device mapping */
> -	tmp = pci_find_upstream_pcie_bridge(pdev);
> -	if (!tmp)
> -		return true;
> -	/* Secondary interface's bus number and devfn 0 */
> -	parent = pdev->bus->self;
> -	while (parent != tmp) {
> -		ret = device_context_mapped(iommu, parent->bus->number,
> -					    parent->devfn);
> -		if (!ret)
> -			return false;
> -		parent = parent->bus->self;
> -	}
> -	if (pci_is_pcie(tmp))
> -		return device_context_mapped(iommu, tmp->subordinate->number,
> -					     0);
> -	else
> -		return device_context_mapped(iommu, tmp->bus->number,
> -					     tmp->devfn);
> +	data.iommu = iommu;
> +	data.mapped = true;
> +	bridge = xx;
> +	pci_for_each_requester_id(bridge, pdev, is_context_mapped, &data);
> +	return data.mapped;
>  }
>  
>  /* Returns a number of VTD pages, but aligned to MM page size */
> @@ -1961,6 +1950,27 @@ static struct dmar_domain *find_domain(struct pci_dev *pdev)
>  	return NULL;
>  }
>  
> +static int find_domain2(struct pci_dev *dev, u16 requester_id, void *data)
> +{
> +	struct domain_context_info *dev_info = data;
> +	int segment = pci_domain_nr(data->dev);
> +	u8 bus = data->requester_id >> 8;
> +	u8 devfn = data->requester_id & 0xff;
> +	unsigned long flags;
> +	struct device_domain_info *info;
> +
> +	spin_lock_irqsave(&device_domain_lock, flags);
> +	list_for_each_entry(info, &device_domain_list, global) {
> +		if (info->segment == segment &&
> +		    info->bus == bus && info->devfn == devfn) {
> +			data->domain = info->domain;
> +			break;
> +		}
> +	}
> +	spin_unlock_irqrestore(&device_domain_lock, flags);
> +	return 0;
> +}
> +
>  /* domain is initialized */
>  static struct dmar_domain *get_domain_for_dev(struct pci_dev *pdev, int gaw)
>  {
> @@ -1968,11 +1978,11 @@ static struct dmar_domain *get_domain_for_dev(struct pci_dev *pdev, int gaw)
>  	struct intel_iommu *iommu;
>  	struct dmar_drhd_unit *drhd;
>  	struct device_domain_info *info, *tmp;
> -	struct pci_dev *dev_tmp;
>  	unsigned long flags;
>  	int bus = 0, devfn = 0;
>  	int segment;
>  	int ret;
> +	struct domain_context_info data;
>  
>  	domain = find_domain(pdev);
>  	if (domain)
> @@ -1980,29 +1990,13 @@ static struct dmar_domain *get_domain_for_dev(struct pci_dev *pdev, int gaw)
>  
>  	segment = pci_domain_nr(pdev->bus);
>  
> -	dev_tmp = pci_find_upstream_pcie_bridge(pdev);
> -	if (dev_tmp) {
> -		if (pci_is_pcie(dev_tmp)) {
> -			bus = dev_tmp->subordinate->number;
> -			devfn = 0;
> -		} else {
> -			bus = dev_tmp->bus->number;
> -			devfn = dev_tmp->devfn;
> -		}
> -		spin_lock_irqsave(&device_domain_lock, flags);
> -		list_for_each_entry(info, &device_domain_list, global) {
> -			if (info->segment == segment &&
> -			    info->bus == bus && info->devfn == devfn) {
> -				found = info->domain;
> -				break;
> -			}
> -		}
> -		spin_unlock_irqrestore(&device_domain_lock, flags);
> -		/* pcie-pci bridge already has a domain, uses it */
> -		if (found) {
> -			domain = found;
> -			goto found_domain;
> -		}
> +	data.domain = NULL;
> +	data.dev = pdev;
> +	bridge = xx;
> +	pci_for_each_requester_id(bridge, pdev, find_domain2, &data);
> +	if (data.domain) {
> +		domain = data.domain;
> +		goto found_domain;
>  	}
>  
>  	domain = alloc_domain();
> @@ -3734,31 +3728,29 @@ int __init intel_iommu_init(void)
>  	return 0;
>  }
>  
> +static int detach_requester(struct pci_dev *dev, u16 requester_id, void *data)
> +{
> +	struct domain_context_info *dev_info = data;
> +	struct intel_iommu *iommu = data->iommu;
> +	u8 bus = data->requester_id >> 8;
> +	u8 devfn = data->requester_id & 0xff;
> +
> +	iommu_detach_dev(iommu, bus, devfn);
> +	return 0;
> +}
> +
>  static void iommu_detach_dependent_devices(struct intel_iommu *iommu,
>  					   struct pci_dev *pdev)
>  {
> -	struct pci_dev *tmp, *parent;
> +	struct domain_context_info data;
>  
>  	if (!iommu || !pdev)
>  		return;
>  
>  	/* dependent device detach */
> -	tmp = pci_find_upstream_pcie_bridge(pdev);
> -	/* Secondary interface's bus number and devfn 0 */
> -	if (tmp) {
> -		parent = pdev->bus->self;
> -		while (parent != tmp) {
> -			iommu_detach_dev(iommu, parent->bus->number,
> -					 parent->devfn);
> -			parent = parent->bus->self;
> -		}
> -		if (pci_is_pcie(tmp)) /* this is a PCIe-to-PCI bridge */
> -			iommu_detach_dev(iommu,
> -				tmp->subordinate->number, 0);
> -		else /* this is a legacy PCI bridge */
> -			iommu_detach_dev(iommu, tmp->bus->number,
> -					 tmp->devfn);
> -	}
> +	data.iommu = iommu;
> +	bridge = xx;
> +	pci_for_each_requester_id(bridge, pdev, detach_requester, &data);
>  }
>  
>  static void domain_remove_one_dev_info(struct dmar_domain *domain,
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [DMA Engine]     [Linux Coverity]     [Linux USB]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Greybus]

  Powered by Linux