Re: [PATCH v7] PCI: Try best to allocate pref mmio 64bit above 4g

On Tue, 2014-04-15 at 18:09 -0600, Bjorn Helgaas wrote:
> 
> Thanks for the example.  Please open a bug report at
> http://bugzilla.kernel.org and attach the complete dmesg logs before
> and after Yinghai's patch.
> 
> Having the complete logs helps me answer questions myself without
> having to bother you, and it also helps me figure out whether we can
> improve our logging to make it easier to diagnose problems like this.

Unfortunately, for a *little while* longer (hint !) we can't publish
a complete log from a Power8 machine, but we should be able to include
everything remotely related to PCI.

> > | pci 0003:05:00.0: reg 0x10: [mem 0x3d05801000000-0x3d058010fffff 64bit]
> > | pci 0003:05:00.0: reg 0x18: [mem 0x3d05010000000-0x3d05017ffffff 64bit pref]
> > | pci 0003:05:00.0: reg 0x30: [mem 0x00000000-0x000fffff pref]
> > | pci 0003:05:00.0: reg 0x134: [mem 0x3d05018000000-0x3d0501fffffff 64bit pref]
> >
> > This is printed during enumeration. This device has an SR-IOV BAR with a
> > size of 0x8000000 (128M). That's the size of a single VF BAR. The device
> > supports 63 VFs, so we need nearly 8G of space in total. Apparently we need
> > to exploit the 64-bit space.
> 
> Yes.  Do we print a hint anywhere about how many VFs there are?  In
> other words, can you deduce the number "63" from the dmesg, or do you
> have to figure that out some other way?  It'd be nice if that
> information were somewhere in dmesg.
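
The "near 8G" figure can be checked from the reg 0x134 line alone. A minimal
sketch in Python, assuming the VF count of 63 quoted above (it is not visible
in this dmesg excerpt); range_size() is a made-up helper, not a kernel function:

```python
def range_size(start, end):
    """Size in bytes of an inclusive [start, end] range as printed in dmesg."""
    return end - start + 1

# reg 0x134: [mem 0x3d05018000000-0x3d0501fffffff 64bit pref]
vf_bar_size = range_size(0x3D05018000000, 0x3D0501FFFFFFF)

# Assumption: 63 VFs, as stated in the quoted text, not read from the log.
num_vfs = 63
total = vf_bar_size * num_vfs

print(hex(vf_bar_size))   # 0x8000000 (128M per VF)
print(hex(total))         # 0x1f8000000
print(total / 2**30)      # 7.875 (GiB)
```

So the SR-IOV BAR alone asks for just under 8G, which cannot fit below 4G.
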
>
> > | PCI host bridge to bus 0003:00
> > | pci_bus 0003:00: root bus resource [mem 0x3d05800000000-0x3d0587ffeffff] (bus address [0x80000000-0xfffeffff])
> > | pci_bus 0003:00: root bus resource [mem 0x3d05008000000-0x3d057ffffffff 64bit pref]
> >
> > And we do have a huge (32G) 64-bit prefetchable window available. We expected
> > everything to work fine, but:
> >
> > | pci 0003:00:00.0: BAR 15: can't assign mem pref (size 0x206000000)
> > | pci 0003:00:00.0: BAR 14: assigned [mem 0x3d05800000000-0x3d05802ffffff]
> > | pci 0003:00:00.0: BAR 13: can't assign io (size 0x4000)
> >
> > It went wrong from the beginning. Note that the error message does not say
> > whether the resource is 64-bit, but BAR 15 here has its MEM_64 flag cleared.
> 
> BAR 15 is a bridge window.  I think its resource flags should reflect
> the capability of the *window*, even if we disable the window or we
> happen to assign addresses that are under 4GB.  So I think it's wrong
> that we clear the MEM_64 flag in pbus_size_mem() and the IO flag in
> pbus_size_io().
> 
> > It first
> > tried to find a 32-bit prefetchable window, but we only supply a 64-bit one.
> > So it fell back to the (32-bit) non-prefetchable window, but there is not enough
> > room there. Finally it went through complicated steps (not shown here) of
> > allocating the requested resources first, then trying its best for the optional ones, etc.
> >
> > Why is BAR 15 (prefetchable) 32-bit instead of 64-bit? Because the PCI core favours
> > 32-bit prefetchable BARs and we have some. This is one of them:
> >
> > | pci 0003:05:00.0: reg 0x30: [mem 0x00000000-0x000fffff pref]
> >
> > The PCI core decides to let them enjoy the benefit of prefetching. They can't
> > bear the risk of getting an address above 4G, so their parent, their parent's parent,
> > their parent's parent's parent, and finally the root bridge (00:00.0) must have the
> > MEM_64 flag of their prefetchable resource (BAR 15) cleared.
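
The propagation described above can be modelled roughly as follows. This is a
toy sketch in Python, not the kernel's pbus_size_mem() logic: the Bar class,
both function names and the choice of three bridge levels are assumptions made
purely for illustration.

```python
from dataclasses import dataclass

@dataclass
class Bar:
    reg: str
    pref: bool
    mem64: bool

def pref_window_is_64bit(children):
    """Toy rule: the window stays 64-bit only if every prefetchable
    resource below it is 64-bit capable."""
    return all(c.mem64 for c in children if c.pref)

def window_flags_up_the_tree(bars, levels):
    """MEM_64 state of the prefetchable window (BAR 15) at each bridge,
    from the device's parent up towards the root bridge."""
    flags = []
    children = bars
    for _ in range(levels):
        window = Bar("BAR 15", pref=True, mem64=pref_window_is_64bit(children))
        flags.append(window.mem64)
        children = [window]  # the window is the only child seen one level up
    return flags

# BARs of 0003:05:00.0 as printed in the log above.
device_bars = [
    Bar("0x10", pref=False, mem64=True),
    Bar("0x18", pref=True, mem64=True),
    Bar("0x30", pref=True, mem64=False),   # the 32-bit prefetchable ROM BAR
    Bar("0x134", pref=True, mem64=True),
]

print(window_flags_up_the_tree(device_bars, 3))   # [False, False, False]
```

In this model the single 32-bit prefetchable BAR at reg 0x30 is enough to clear
MEM_64 at every level; dropping it from the list leaves all windows 64-bit.
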
> 
> It sounds like we're tracking the resource requirements
> (prefetchability and BAR width) by using the flags on bridge windows.
> If that's the case, I think it's wrong.  We should preserve the bridge
> window flags, because those express the bridge hardware capabilities,
> and we should explicitly keep track of what's required by devices
> below the bridge in some other way.
> 
> > In the end nobody
> > is eligible to use the 64-bit (prefetchable) space even though we have a huge
> > supply!
> >
> > Note that even if the resource is small and successfully falls back into the 32-bit
> > non-prefetchable window, that's still not OK for us because we need the
> > SR-IOV resources to be in 64-bit prefetchable space to do platform
> > configuration.
> >
> > With Yinghai's patch, when 64-bit prefetchable BARs are found, they're
> > favoured over the 32-bit prefetchable ones (if any), so all upstream bridges'
> > prefetchable windows have their MEM_64 flag preserved and the huge 64-bit
> > prefetchable space will be exploited:
> >
> > | pci 0003:00:00.0: BAR 15: assigned [mem 0x3d05008000000-0x3d0521fffffff 64bit pref]
> > | pci 0003:00:00.0: BAR 14: assigned [mem 0x3d05800000000-0x3d05802ffffff]
> > | pci 0003:00:00.0: BAR 13: can't assign io (size 0x4000)
> >
> > (The IO resource error here is because we do not provide an IO window.)
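
The preference described for the patched behaviour can be sketched as below.
This is a hedged Python model only: place_bars() is a hypothetical name, and
the assumption that 32-bit prefetchable BARs end up in the non-prefetchable
window once a 64-bit prefetchable BAR is present is not spelled out in this
message.

```python
def place_bars(bars):
    """bars: list of (reg, pref, mem64) tuples for everything below a bridge.
    Returns (window keeps MEM_64?, {reg: window name})."""
    # If any 64-bit prefetchable BAR exists, the pref window stays 64-bit.
    keep_mem64 = any(pref and mem64 for _, pref, mem64 in bars)
    placement = {}
    for reg, pref, mem64 in bars:
        if pref and (mem64 or not keep_mem64):
            placement[reg] = "pref"
        else:
            # Assumption: 32-bit pref BARs give up prefetch in this case.
            placement[reg] = "non-pref"
    return keep_mem64, placement

# BARs of 0003:05:00.0 from the enumeration log.
bars = [
    ("0x10", False, True),
    ("0x18", True, True),
    ("0x30", True, False),
    ("0x134", True, True),
]

keep_mem64, placement = place_bars(bars)
print(keep_mem64)           # True
print(placement["0x30"])    # non-pref
print(placement["0x134"])   # pref
```

With no 64-bit prefetchable BAR in the list, the same function keeps the
32-bit prefetchable BARs in the (then 32-bit) prefetchable window.
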
> 
> Yes.  The lack of I/O space is just a constraint of the platform.
> It'd be nice if we printed a more meaningful error message in this
> case.  One really has to be a PCI expert to distinguish this from a
> real problem that we need to fix.
> 
> Bjorn

