Re: [PATCH v7] PCI: Try best to allocate pref mmio 64bit above 4g

On Thu, Apr 10, 2014 at 11:26:14AM -0600, Bjorn Helgaas wrote:
> On Wed, Apr 9, 2014 at 1:52 AM, Guo Chao <yan@xxxxxxxxxxxxxxxxxx> wrote:
> > So 4G-above space can be properly used.
> >
> > Why does enabling 64-bit space matter?
> >
> > * 32-bit space is just too small. If larger-than-4G BAR sounds
> >   unrealistic (we do have such devices), there is still a chance that
> >   total MMIO size under a domain is too large, especially when SR-IOV
> >   enabled (we met this situation).
> 
> Please give more details about the problem you saw.  A complete dmesg
> log showing a boot failure or a device that doesn't work, and another
> log with this patch applied, showing things working as desired, would
> go a long ways toward clarifying this.
> 
> This sounds like a specific case that will be fixed by the patch, and
> if you give more details, maybe I can figure out what's going on.
> 

Let's see an example.

| pci 0003:05:00.0: reg 0x10: [mem 0x3d05801000000-0x3d058010fffff 64bit]
| pci 0003:05:00.0: reg 0x18: [mem 0x3d05010000000-0x3d05017ffffff 64bit pref]
| pci 0003:05:00.0: reg 0x30: [mem 0x00000000-0x000fffff pref]
| pci 0003:05:00.0: reg 0x134: [mem 0x3d05018000000-0x3d0501fffffff 64bit pref]

This is printed during the enumeration phase. The device has an SR-IOV
BAR (reg 0x134) of size 0x8000000 (128M). That is the size of a single
VF BAR. The device supports 63 VFs, so we need nearly 8G of space in
total. Apparently we need to exploit 64-bit space.
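
To make the arithmetic explicit, here is a small Python sketch (not
kernel code; the helper name range_size is made up for illustration).
It derives the per-VF BAR size from the inclusive range on the
"reg 0x134" line above and multiplies it by the VF count:

```python
def range_size(start: int, end: int) -> int:
    """Size in bytes of an inclusive [start, end] address range."""
    return end - start + 1

# VF BAR range taken from the "reg 0x134" line in the log above.
vf_bar_size = range_size(0x3d05018000000, 0x3d0501fffffff)
num_vfs = 63  # VF count stated in the text

total = vf_bar_size * num_vfs

print(hex(vf_bar_size), vf_bar_size >> 20, "MiB per VF")
print(hex(total), total / (1 << 30), "GiB for all VFs")
```

The total is a little under 8G, which cannot fit below 4G no matter how
the 32-bit space is laid out.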

| PCI host bridge to bus 0003:00
| pci_bus 0003:00: root bus resource [mem 0x3d05800000000-0x3d0587ffeffff] (bus address [0x80000000-0xfffeffff])
| pci_bus 0003:00: root bus resource [mem 0x3d05008000000-0x3d057ffffffff 64bit pref]

And we do have a huge (nearly 32G) 64-bit prefetchable window available.
We expect everything to work fine, but:

| pci 0003:00:00.0: BAR 15: can't assign mem pref (size 0x206000000)
| pci 0003:00:00.0: BAR 14: assigned [mem 0x3d05800000000-0x3d05802ffffff]
| pci 0003:00:00.0: BAR 13: can't assign io (size 0x4000)

It went wrong from the beginning. Note that the error message never says
whether the resource is 64-bit or not, but BAR 15 here has its MEM_64 flag
cleared. The core first tried to find a 32-bit prefetchable window, but we
only supply a 64-bit one. So it fell back to the (32-bit) non-prefetchable
window, but there is not enough room there. Finally it went through the
complicated steps (not shown here) of allocating the required resources
first, then doing its best for the optional ones, etc.
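
The fallback order described above can be modelled roughly as follows.
This is a simplified illustration, not the kernel's allocator: the
Window class and pick_window function are invented for the example, and
each window is treated as completely free. Only the request size and the
two window sizes come from the logs.

```python
from dataclasses import dataclass

@dataclass
class Window:
    name: str
    size: int           # free space, assuming nothing else is allocated
    prefetchable: bool
    is_64bit: bool      # window lies above 4G

def pick_window(size, pref, mem64, windows):
    """Return the first window the request may use, or None."""
    def usable(w):
        # A request without MEM_64 must not be placed above 4G.
        return (mem64 or not w.is_64bit) and w.size >= size

    # First pass: windows of the same type as the request.
    for w in windows:
        if w.prefetchable == pref and usable(w):
            return w
    # Second pass: a prefetchable request may fall back to
    # non-prefetchable space (never the other way round).
    if pref:
        for w in windows:
            if not w.prefetchable and usable(w):
                return w
    return None

# Sizes derived from the two root bus resources in the log.
windows = [
    Window("32-bit non-pref", 0x7fff0000, False, False),
    Window("64-bit pref", 0x7f8000000, True, True),
]

# BAR 15 as the core sees it: prefetchable, MEM_64 cleared.
print(pick_window(0x206000000, True, False, windows))
# The same request with MEM_64 kept.
print(pick_window(0x206000000, True, True, windows))
```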

Why is BAR 15 (prefetchable) 32-bit instead of 64-bit? Because the PCI
core favours 32-bit prefetchable BARs, and we have some. This is one of
them:

| pci 0003:05:00.0: reg 0x30: [mem 0x00000000-0x000fffff pref]

The PCI core decides to let them enjoy the benefit of prefetching. They
cannot risk being given an address above 4G, so the parent, the parent's
parent, the parent's parent's parent, and finally the root bridge (00:00.0)
must all have the MEM_64 flag of their prefetchable resource (BAR 15)
cleared. In the end nobody is eligible to use the 64-bit (prefetchable)
space, even though we have a huge supply!
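
The propagation can be illustrated with a toy model. This is only a
sketch of the behaviour as described above, not the actual PCI core
code; window_is_64bit, the nested-dict topology and the single
intermediate bridge are all invented for the example.

```python
def window_is_64bit(node):
    """True if the prefetchable window of `node` may keep MEM_64.

    Toy rule: one 32-bit prefetchable BAR anywhere below a bridge
    clears MEM_64 on that bridge and on every bridge above it.
    """
    own = all(node.get("pref_bars", []))
    return own and all(window_is_64bit(c) for c in node.get("children", []))

# Prefetchable BARs of 0003:05:00.0, True meaning "64bit pref":
# reg 0x18 (64-bit), reg 0x30 (32-bit), reg 0x134 (64-bit).
device = {"name": "0003:05:00.0", "pref_bars": [True, False, True]}
bridge = {"name": "intermediate bridge (hypothetical)", "children": [device]}
root = {"name": "0003:00:00.0", "children": [bridge]}

print(window_is_64bit(root))
```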

Note that even if the resource is small and successfully falls back into
the 32-bit non-prefetchable window, that is still not OK for us, because
we need the SR-IOV resources to be in 64-bit prefetchable space to do
platform configuration.

With Yinghai's patch, when 64-bit prefetchable BARs are found, they are
favoured over the 32-bit prefetchable ones (if any), so all upstream
bridges' prefetchable windows keep their MEM_64 flag and the huge 64-bit
prefetchable space is exploited:
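
As a rough sketch of that preference (my reading of the effect, not the
patch itself; the classify function is hypothetical, and sending 32-bit
prefetchable BARs to the non-prefetchable window is an assumption about
where they end up):

```python
def classify(pref_bars):
    """Split prefetchable BARs between a bridge's two memory windows.

    pref_bars: list of (name, is_64bit) tuples.
    Returns (window_keeps_mem64, pref_window_bars, nonpref_window_bars).
    """
    if any(is64 for _, is64 in pref_bars):
        # 64-bit prefetchable BARs win: the window keeps MEM_64 and the
        # 32-bit prefetchable BARs are assumed to go to non-pref space.
        pref = [name for name, is64 in pref_bars if is64]
        nonpref = [name for name, is64 in pref_bars if not is64]
        return True, pref, nonpref
    # Only 32-bit prefetchable BARs: the window stays 32-bit.
    return False, [name for name, _ in pref_bars], []

bars = [("reg 0x18", True), ("reg 0x30", False), ("reg 0x134", True)]
print(classify(bars))
```

The log with the patch applied: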

| pci 0003:00:00.0: BAR 15: assigned [mem 0x3d05008000000-0x3d0521fffffff 64bit pref]
| pci 0003:00:00.0: BAR 14: assigned [mem 0x3d05800000000-0x3d05802ffffff]
| pci 0003:00:00.0: BAR 13: can't assign io (size 0x4000)

(The IO resource error here is because we do not provide an IO window.)

Thanks,
Guo Chao


> Bjorn
> 
