Re: [PATCH] PCI: tegra: limit MSI target address to 32-bit

On Fri, Nov 10, 2017 at 12:04:21PM +0000, Robin Murphy wrote:
> On 10/11/17 11:22, Lorenzo Pieralisi wrote:
> > [+Robin]
> > 
> > On Thu, Nov 09, 2017 at 12:14:35PM -0600, Bjorn Helgaas wrote:
> > > [+cc Lorenzo]
> > > 
> > > On Thu, Nov 09, 2017 at 12:48:14PM +0530, Vidya Sagar wrote:
> > > > On Thursday 09 November 2017 02:55 AM, Bjorn Helgaas wrote:
> > > > > On Mon, Nov 06, 2017 at 11:33:07PM +0530, Vidya Sagar wrote:
> > > > > > Limit the MSI target address to the 32-bit region so that
> > > > > > PCIe endpoints that support only 32-bit MSI addresses work
> > > > > > properly. One example is the Marvell SATA controller.
> > > > > > 
> > > > > > Signed-off-by: Vidya Sagar <vidyas@xxxxxxxxxx>
> > > > > > ---
> > > > > >   drivers/pci/host/pci-tegra.c | 2 +-
> > > > > >   1 file changed, 1 insertion(+), 1 deletion(-)
> > > > > > 
> > > > > > diff --git a/drivers/pci/host/pci-tegra.c b/drivers/pci/host/pci-tegra.c
> > > > > > index 1987fec1f126..03d3dcdd06c2 100644
> > > > > > --- a/drivers/pci/host/pci-tegra.c
> > > > > > +++ b/drivers/pci/host/pci-tegra.c
> > > > > > @@ -1531,7 +1531,7 @@ static int tegra_pcie_enable_msi(struct tegra_pcie *pcie)
> > > > > >   	}
> > > > > >   	/* setup AFI/FPCI range */
> > > > > > -	msi->pages = __get_free_pages(GFP_KERNEL, 0);
> > > > > > +	msi->pages = __get_free_pages(GFP_DMA, 0);
> > > > > >   	msi->phys = virt_to_phys((void *)msi->pages);
> > > > 
> > > > > Should this be GFP_DMA32?  See the comment above the GFP_DMA
> > > > > definition.
> > > > 
> > > > Looking at the comments for both GFP_DMA32 and GFP_DMA, I thought
> > > > GFP_DMA32 was the correct one to use, but even with that I got
> > > > addresses above 32 bits. GFP_DMA always gives addresses below the
> > > > 4GB boundary (i.e. 32-bit). I didn't dig into it to find out why
> > > > this is the case.
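
(A minimal sketch of the check being discussed here, not the actual
patch: allocate a page and verify that its physical address really fits
in 32 bits before using it as the MSI target.)

	unsigned long pages = __get_free_pages(GFP_DMA32, 0);
	phys_addr_t phys = virt_to_phys((void *)pages);

	/* GFP_DMA32 is a no-op on most non-x86 architectures, so the
	 * result may still be above 4GB and must be checked. */
	if (!pages || upper_32_bits(phys))
		return -ENOMEM; /* not usable as a 32-bit MSI target */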
> > > 
> > > This sounds worth looking into (but maybe we don't need the
> > > __get_free_pages() at all; see below).  Maybe there's some underlying
> > > bug.  My laptop shows this, which looks like it might be related:
> > > 
> > >    Zone ranges:
> > >      DMA      [mem 0x0000000000001000-0x0000000000ffffff]
> > >      DMA32    [mem 0x0000000001000000-0x00000000ffffffff]
> > >      Normal   [mem 0x0000000100000000-0x00000004217fffff]
> > >      Device   empty
> > > 
> > > What does your machine show?
> 
> The great thing about ZONE_DMA is that it has completely different meanings
> per platform. ZONE_DMA32 is even worse as it's more or less just an x86
> thing, and isn't implemented (thus does nothing) on most other
> architectures, including ARM/arm64 where Tegra is relevant.
> 
> Fun.
> 
> > > > > Should we be using virt_to_phys() here?  Where exactly is the
> > > > > result ("msi->phys") used, i.e., what bus will that address appear
> > > > > on?  If it appears on the PCI side, this should probably use
> > > > > something like pcibios_resource_to_bus().
> > 
> > I had a quick chat with Robin (CC'ed, who has dealt/is dealing with most
> > of the work this thread relates to) and I think that as things stand,
> > for MSI physical addresses, an offset between PCI and host addresses is
> > not really contemplated (ie a 1:1 host<->pci mapping is assumed).
> 
> I think the most correct way to generate an address would actually be via
> the DMA mapping API (i.e. dma_map_page() with the host bridge's PCI device),
> which would take any relevant offsets into account. It's a bit funky, but
> there's some method in the madness.
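
A rough sketch of what that could look like, assuming the host bridge's
struct device is available as "dev" (the function name and error
handling here are illustrative, not taken from any driver):

	#include <linux/dma-mapping.h>

	/*
	 * Let the DMA API produce the bus address for the MSI target
	 * page, so that any PCI<->host offset is applied for us.
	 * DMA_FROM_DEVICE because the MSI write travels from the
	 * endpoint towards the host.
	 */
	static dma_addr_t msi_map_target(struct device *dev, struct page *page)
	{
		dma_addr_t bus = dma_map_page(dev, page, 0, PAGE_SIZE,
					      DMA_FROM_DEVICE);

		if (dma_mapping_error(dev, bus))
			return 0; /* caller must treat 0 as failure */

		return bus;
	}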
> 
> > > > This address is written to two places.  First, into the host's
> > > > internal register to let it know that when an incoming memory
> > > > write arrives at this address, it should raise an MSI interrupt
> > > > instead of forwarding the write to the memory subsystem.  Second,
> > > > into the 'Message Address' field of the 'Message Address Register
> > > > for MSI' register in the endpoint's configuration space (this is
> > > > done by the MSI framework) so the endpoint knows which address to
> > > > use when generating an MSI interrupt.
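
For reference, the two writes described above look roughly like this
(the AFI register names follow pci-tegra.c; the MSI-message part is a
simplified sketch of what the MSI core does on the endpoint side):

	/* 1) tell the host's AFI which address to intercept as an MSI */
	afi_writel(pcie, msi->phys >> soc->msi_base_shift, AFI_MSI_FPCI_BAR_ST);
	afi_writel(pcie, msi->phys, AFI_MSI_AXI_BAR_ST);

	/* 2) the MSI core programs the same address into the endpoint */
	msg.address_lo = lower_32_bits(msi->phys);
	msg.address_hi = upper_32_bits(msi->phys);
	pci_write_msi_msg(irq, &msg);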
> > > 
> > > Hmmm, ISTR some past discussion about this.  Here's a little: [1, 2].
> > > And this commit [3] sounds like it describes a similar hardware
> > > situation with Tegra where the host bridge intercepts the MSI target
> > > address, so writes to it never reach system memory.  That means that
> > > Tegra doesn't need to allocate system memory at all.
> > > 
> > > Is your system similar?  Can you just statically allocate a little bus
> > > address space, use that for the MSI target address, and skip the
> > > __get_free_pages()?
> > 
> > IIUC, all these host bridges need is a physical address that is routed
> > upstream to the host bridge by the PCIe tree (ie it is not part of the
> > host bridge windows). As long as the host bridge intercepts it (and
> > endpoints do _not_ reuse it for something else) that should be fine.
> > As has been pointed out in this thread, allocating a page is one
> > solution; there may be others (which are likely to be platform
> > specific).
> > 
> > > > > Do rcar_pcie_enable_msi() and xilinx_pcie_enable_msi() have a
> > > > > similar problem?  They both use GFP_KERNEL, then virt_to_phys(),
> > > > > then write the result of virt_to_phys() using a 32-bit register
> > > > > write.
> > > > 
> > > > Well, if those systems deal with 64-bit addresses and an endpoint
> > > > that supports only 32-bit MSI addresses is connected, this problem
> > > > will surface when __get_free_pages() returns an address that
> > > > translates to a >32-bit address after a virt_to_phys() call on it.
> > > 
> > > I'd like to hear from the R-Car and Xilinx folks about (1) whether
> > > there's a potential issue with truncating a 64-bit address, and
> > > (2) whether that hardware works like Tegra, where the MSI write never
> > > reaches memory so we don't actually need to allocate a page.
> > > 
> > > If all we need is to allocate a little bus address space for the MSI
> > > target, I'd like to figure out a better way to do that than
> > > __get_free_pages().  The current code seems a little buggy, and
> > > it's getting propagated through several drivers.
> 
> The really neat version is to take a known non-memory physical address like
> the host controller's own MMIO region, which has no legitimate reason to
> ever be used as a DMA address. pcie-mediatek almost gets this right, but by
> using virt_to_phys() on an ioremapped address they end up with nonsense
> rather than the correct address (although realistically you would have to be
> extremely unlucky for said nonsense to collide with a real DMA address given
> to a PCI endpoint later). Following on from above, dma_map_resource() would
> be the foolproof way to get that right.
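
As a sketch of that foolproof variant, assuming the controller's
register resource is at hand as "regs" (names illustrative):

	/*
	 * Map a slice of the controller's own MMIO region as the MSI
	 * target: it can never collide with a real DMA address handed
	 * out to an endpoint, and dma_map_resource() applies any
	 * PCI<->host offset for us.
	 */
	msi->phys = dma_map_resource(dev, regs->start, PAGE_SIZE,
				     DMA_FROM_DEVICE, 0);
	if (dma_mapping_error(dev, msi->phys))
		return -ENOMEM;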

Yes, that was our intention as well. Our initial plan was to use an
address from the PCI aperture within Tegra that wasn't being used for
any other purpose. However, we ran into some odd corner cases where this
wasn't working as expected. As a temporary solution we wanted to move to
GFP_DMA32 (or GFP_DMA) in order to support 32-bit only MSI endpoints.

Eventually we'll want to get rid of the allocation altogether; we just
need to find a set of values that work reliably. In the meantime, it
looks as though GFP_DMA would be the right solution as long as we have
to stick with __get_free_pages().

Alternatively, since we've already verified that the MSI writes are
never committed to memory, we could choose some random address pointing
to system memory as well, but I'm reluctant to do that because it could
end up being confusing for users (and developers) to see some random
address showing up somewhere. A physical address such as the beginning
of system memory should always work and might be unique enough to
indicate that it is special.
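
As a sketch, that last option could be as simple as (assuming memblock
is still available at this point):

	/*
	 * The MSI write is intercepted by the host bridge and never
	 * reaches memory, so the start of system memory serves as a
	 * recognizably "special" target address.
	 */
	msi->phys = memblock_start_of_DRAM();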

Thierry


