Re: Partial BAR Address Allocation

Hi Robin,

On 2/23/2017 6:40 AM, Robin Murphy wrote:
> On 22/02/17 23:39, Bjorn Helgaas wrote:
>> [+cc Joerg, iommu list]
>>
>> On Wed, Feb 22, 2017 at 03:44:53PM -0500, Sinan Kaya wrote:
>>> On 2/22/2017 1:44 PM, Bjorn Helgaas wrote:
>>>> There is no way for a driver to say "I only need this memory BAR and
>>>> not the other ones."  The reason is because the PCI_COMMAND_MEMORY bit
>>>> enables *all* the memory BARs; there's no way to enable memory BARs
>>>> selectively.  If we enable memory BARs and one of them is unassigned,
>>>> that unassigned BAR is enabled, and the device will respond at
>>>> whatever address the register happens to contain, and that may cause
>>>> conflicts.
>>>>
>>>> I'm not sure this answers your question.  Do you want to get rid of
>>>> 32-bit BAR addresses because your host bridge doesn't have a window to
>>>> 32-bit PCI addresses?  It's typical for a bridge to support a window
>>>> to the 32-bit PCI space as well as one to the 64-bit PCI space.  Often
>>>> it performs address translation for the 32-bit window so it doesn't
>>>> have to be in the 32-bit area on the CPU side, e.g., you could have
>>>> something like this where we have three host bridges and the 2-4GB
>>>> space on each PCI root bus is addressable:
>>>>
>>>>   pci_bus 0000:00: root bus resource [mem 0x1080000000-0x10ffffffff] (bus address [0x80000000-0xffffffff])
>>>>   pci_bus 0001:00: root bus resource [mem 0x1180000000-0x11ffffffff] (bus address [0x80000000-0xffffffff])
>>>>   pci_bus 0002:00: root bus resource [mem 0x1280000000-0x12ffffffff] (bus address [0x80000000-0xffffffff])
>>>
>>> The problem is that according to PCI specification BAR addresses and
>>> DMA addresses cannot overlap.
>>>
>>> From PCI-to-PCI Bridge Arch. spec.: "A bridge forwards PCI memory
>>> transactions from its primary interface to its secondary interface
>>> (downstream) if a memory address is in the range defined by the
>>> Memory Base and Memory Limit registers (when the base is less than
>>> or equal to the limit) as illustrated in Figure 4-3. Conversely, a
>>> memory transaction on the secondary interface that is within this
>>> address range will not be forwarded upstream to the primary
>>> interface."
>>>
>>> To be specific, if your DMA address happens to be in
>>> [0x80000000-0xffffffff] and root port's aperture includes this
>>> range, the DMA will never make it to the system memory.
>>>
>>> Lorenzo and Robin took some steps to carve out PCI addresses out of
>>> DMA addresses in IOMMU drivers by using iova_reserve_pci_windows()
>>> function.
>>>
>>> However, I see that we are still exposed when the operating system
>>> doesn't have any IOMMU driver and is using the SWIOTLB for instance. 
>>
>> Hmmm.  I guess SWIOTLB assumes there's no address translation in the
>> DMA direction, right?
> 
> Not entirely - it does rely on arch-provided dma_to_phys() and
> phys_to_dma() helpers which are free to accommodate such translations in
> a device-specific manner. On arm64 we use these to account for
> dev->dma_pfn_offset describing a straightforward linear offset, but
> unless one constant offset would apply to all possible outbound windows
> I'm not sure that's much help here.

Yeah, that won't help. This is a PCI-only problem. An arch-layer solution
would shift the DMA range of every peripheral in the SoC by one fixed offset.
That is mostly useful when the entire DDR starts at some non-zero offset.
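
For illustration, here is a minimal sketch of what such a single linear
offset amounts to. It is not the kernel's dma_to_phys()/phys_to_dma() code;
the struct, the function names and the DDR base in the note below are made
up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a per-device linear DMA offset. */
struct dev_dma {
    uint64_t dma_offset;    /* CPU physical address minus bus address */
};

/* CPU physical address -> address the device puts on the bus. */
static uint64_t sketch_phys_to_dma(const struct dev_dma *d, uint64_t phys)
{
    assert(phys >= d->dma_offset);  /* below the offset there is no bus address */
    return phys - d->dma_offset;
}

/* Bus address seen by the device -> CPU physical address. */
static uint64_t sketch_dma_to_phys(const struct dev_dma *d, uint64_t dma)
{
    return dma + d->dma_offset;
}
```

With a hypothetical DDR base of 0x8000000000 used as the offset, CPU address
0x8000001000 would appear on the bus as 0x1000. The same offset applies to
every address, so it cannot describe a translated range and an untranslated
range at the same time.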

Even then, PCI usually has several ranges: one translated range like this,
to provide some space below 4GB, and another untranslated range for true
64-bit cards.

>>>>   pci_bus 0002:00: root bus resource [mem 0x1280000000-0x12ffffffff] (bus address [0x80000000-0xffffffff])

We have to emulate some range in the first 4GB to make PCI cards happy.
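
To make the conflict concrete, here is a rough sketch of the check a DMA
address would have to pass. It models the root port aperture the way the
bridge spec quoted earlier describes Memory Base/Limit: an address inside
the window is not forwarded upstream. The window values come from the root
bus resource line above; the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One host bridge window: CPU-side start, PCI bus-side start, and size. */
struct bridge_window {
    uint64_t cpu_start;
    uint64_t bus_start;
    uint64_t size;
};

/* True if bus_addr lies inside the window's bus address range. */
static bool dma_addr_in_window(const struct bridge_window *w, uint64_t bus_addr)
{
    assert(w->size > 0);
    return bus_addr >= w->bus_start && bus_addr - w->bus_start < w->size;
}

/* A DMA address can only reach system memory if no window claims it. */
static bool dma_addr_reaches_memory(const struct bridge_window *w, size_t n,
                                    uint64_t bus_addr)
{
    for (size_t i = 0; i < n; i++)
        if (dma_addr_in_window(&w[i], bus_addr))
            return false;
    return true;
}
```

For the window above (bus addresses 0x80000000-0xffffffff), a DMA address of
0x80000000 would be claimed by the window, while 0x100000000 would go
upstream. The sketch checks a single address; a real check would have to
cover the whole buffer.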

> 
>>  If there's no address translation in the PIO
>> direction, PCI bus BAR addresses are identical to the CPU-side
>> addresses.  In that case, there's no conflict because we already have
>> to assign BARs so they never look like a system memory address.
>>
>> But if there *is* address translation in the PIO direction, we can
>> have conflicts because the bridge can translate CPU-side PIO accesses
>> to arbitrary PCI bus addresses.
>>
>>> The FW solution I'm looking at requires carving out some part of the
>>> DDR from before OS boot so that OS doesn't reclaim that area for
>>> DMA.
>>
>> If you want to reach system RAM, I guess you need to make sure you
>> only DMA to bus addresses outside the host bridge windows, as you said
>> above.  DMA inside the windows would be handled as peer-to-peer DMA.
>>
>>> I'm not very happy with this solution. I'm also surprised that there
>>> is no generic solution in the kernel that takes care of this for all root
>>> ports regardless of IOMMU driver presence.
>>
>> The PCI core isn't really involved in allocating DMA addresses,
>> although there definitely is the connection with PCI-to-PCI bridge
>> windows that you mentioned.  I added IOMMU guys, who would know a lot
>> more than I do.
> 
> To me, having the bus addresses of windows shadow assigned physical
> addresses sounds mostly like a broken system configuration. Can the
> firmware not reprogram them elsewhere, or is the entire bottom 4GB of
> the physical memory map occupied by system RAM?

I think your suggestion goes in the same direction: FW moves things
around so that there is a hole in the first 4GB that the OS doesn't
see and that PCI has exclusive access to.
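
As a rough sketch of what that carve-out means for the memory map, assuming
there is no address translation in the DMA direction (so the hole in RAM sits
at the same addresses as the window's bus range). The types and the function
name are hypothetical, not an existing FW or kernel interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Half-open address range [start, end). */
struct range {
    uint64_t start;
    uint64_t end;
};

/*
 * Split a RAM range around a hole reserved for PCI bus addresses.
 * Writes the remaining pieces (at most two) to out[] and returns
 * how many pieces were written.
 */
static size_t carve_hole(struct range ram, struct range hole, struct range out[2])
{
    size_t n = 0;

    assert(ram.start < ram.end && hole.start < hole.end);

    /* No overlap: the RAM range is reported unchanged. */
    if (hole.end <= ram.start || hole.start >= ram.end) {
        out[n++] = ram;
        return n;
    }
    /* Piece below the hole, if any. */
    if (hole.start > ram.start)
        out[n++] = (struct range){ ram.start, hole.start };
    /* Piece above the hole, if any. */
    if (hole.end < ram.end)
        out[n++] = (struct range){ hole.end, ram.end };
    return n;
}
```

With RAM at [0, 0x200000000) and a hole at [0x80000000, 0x100000000), the OS
would be handed [0, 0x80000000) and [0x100000000, 0x200000000), and the 2GB
in between would be unavailable to it.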

I was looking to see whether there is a better solution, via some ACPI
table entry like PNP0C02, that tells the OS which range PCI drivers are
not allowed to touch while leaving it usable for something else.

The problem with a UEFI reserved region is that it prohibits the
region from being used for anything besides PCI. That region
is gone forever.

Another solution, as you suggested, is to move the DDR around so that
I don't need reserved regions.

Thanks for the suggestions,
Sinan

> 
> Robin.
> 
>>
>> Bjorn
>> _______________________________________________
>> iommu mailing list
>> iommu@xxxxxxxxxxxxxxxxxxxxxxxxxx
>> https://lists.linuxfoundation.org/mailman/listinfo/iommu
>>
> 


-- 
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.


