Re: [PATCH 1/4] Intel pci: Remove Host Bridge devices from identity mapping

* Mike Travis (travis@xxxxxxx) wrote:
> Chris Wright wrote:
> >* Mike Travis (travis@xxxxxxx) wrote:
> >>    When the IOMMU is being used, each request for a DMA mapping requires
> >>    the intel_iommu code to look for some space in the DMA mapping table.
> >>    For most drivers this occurs for each transfer.
> >>
> >>    When there are many outstanding DMA mappings [as seems to be the case
> >>    with the 10GigE driver], the table grows large and the search for
> >>    space becomes increasingly time consuming.  Performance for the
> >>    10GigE driver drops to about 10% of its capacity on a UV system
> >>    when the CPU count is large.
> >
> >That's pretty poor.  I've seen large overheads, but when that big it was
> >also related to issues in the 10G driver.  Do you have profile data
> >showing this as the hotspot?
> 
> Here's one from our internal bug report:
> 
> Here is a profile from a run with iommu=on  iommu=pt  (no forcedac)

OK, I was actually interested in the !pt case, but this is still
useful: the iova lookup is a separate issue from the identity_mapping()
case.

> uv48-sys was receiving and uv-debug sending.
> ksoftirqd/640 was running at approx. 100% cpu utilization.
> I had pinned the nttcp process on uv48-sys to cpu 64.
> 
> # Samples: 1255641
> #
> # Overhead        Command  Shared Object  Symbol
> # ........  .............  .............  ......
> #
>     50.27%  ksoftirqd/640  [kernel]       [k] _spin_lock
>     27.43%  ksoftirqd/640  [kernel]       [k] iommu_no_mapping

> ...
>      0.48%  ksoftirqd/640  [kernel]       [k] iommu_should_identity_map
>      0.45%  ksoftirqd/640  [kernel]       [k] ixgbe_alloc_rx_buffers    [ixgbe]

Note, ixgbe has had rx dma mapping issues (that's why I wondered what
was causing the massive slowdown under !pt mode).

<snip>
> I tracked this time down to identity_mapping() in this loop:
> 
>       list_for_each_entry(info, &si_domain->devices, link)
>               if (info->dev == pdev)
>                       return 1;
> 
> I didn't get the exact count, but there were approx 11,000 PCI devices
> on this system.  And this function was called for every page request
> in each DMA request.

Right, so this is the list traversal (and wow, a lot of PCI devices).
Did you try a smarter data structure? (While there's room for another
bit in pci_dev, the bit is more about iommu implementation details than
anything at the pci level).

Alternatively, the domain_dev_info is already cached in the archdata of
the device struct, so you should be able to just reference that directly
instead of walking the list.

Didn't think it through completely, but perhaps something as simple as:

	return pdev->dev.archdata.iommu == si_domain;
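
To make the difference concrete, here is a rough userspace sketch that
contrasts the list walk with the cached-pointer check.  All of the
struct and function names below are hypothetical stand-ins, not the
real intel-iommu definitions, and it assumes the cached pointer is the
domain itself.  If the cache actually holds a per-device info record,
the comparison would need one more dereference to reach the domain.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; NOT the real ones. */
struct domain;

struct device {
	struct domain *iommu_domain;	/* stand-in for dev.archdata.iommu */
};

struct dev_info {
	struct device *dev;
	struct dev_info *next;		/* stand-in for the list_head link */
};

struct domain {
	struct dev_info *devices;	/* devices attached to this domain */
};

/* Attach a device: link it into the domain list and cache the domain. */
static void domain_attach(struct domain *d, struct dev_info *info,
			  struct device *dev)
{
	assert(d && info && dev);
	info->dev = dev;
	info->next = d->devices;
	d->devices = info;
	dev->iommu_domain = d;
}

/* O(n) in attached devices: the shape of the loop quoted above. */
static int identity_mapping_walk(const struct domain *si,
				 const struct device *dev)
{
	const struct dev_info *info;

	for (info = si->devices; info; info = info->next)
		if (info->dev == dev)
			return 1;
	return 0;
}

/* O(1): compare the pointer cached in the device at attach time. */
static int identity_mapping_cached(const struct domain *si,
				   const struct device *dev)
{
	return dev->iommu_domain == si;
}
```

Both helpers should give the same answer for any device as long as the
cached pointer is kept in sync on attach and detach (detach is not
shown here).  With roughly 11,000 devices on the list, the walk does up
to that many comparisons per call, while the cached check does one.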

thanks,
-chris