On Tue, Mar 18, 2014 at 8:27 AM, Bjorn Helgaas <bhelgaas@xxxxxxxxxx> wrote: > On Fri, Mar 14, 2014 at 09:48:35AM +0800, Ming Lei wrote: >> On Fri, Mar 14, 2014 at 12:08 AM, Bjorn Helgaas <bhelgaas@xxxxxxxxxx> wrote: >> > On Thu, Mar 13, 2014 at 2:51 AM, Ming Lei <tom.leiming@xxxxxxxxx> wrote: >> >> Hi Bjorn, >> >> >> >> I found this patch broke virtio-pci devices. >> > >> > Thanks a lot for testing this. >> > >> >> On Thu, Feb 27, 2014 at 3:37 AM, Bjorn Helgaas <bhelgaas@xxxxxxxxxx> wrote: >> >>> Don't rely on BAR contents when the command register says the BAR is >> >>> disabled. >> >>> >> >>> If we receive a PCI device from firmware (or a hot-added device that was >> >>> just powered up) with the MEMORY or IO enable bits in the PCI command >> >>> register cleared, there's no reason to believe the BARs contain valid >> >>> addresses. >> >> >> >> From PCI LOCAL BUS SPECIFICATION, REV. 3.0, both >> >> PCI_COMMAND_IO and PCI_COMMAND_MEM should be >> >> cleared after reset, so looks the patch sets IORESOURCE_UNSET >> >> too early because PCI drivers may call pci_enable_device() >> >> (->pci_enable_resources()) to enable the two bits of >> >> PCI_COMMAND explicitly. >> > >> > The point is that it's not safe to enable those two bits unless we're >> > certain that the BARs they control contain valid values that don't >> > conflict with anything else in the system. >> > >> > Maybe we should only set IORESOURCE_UNSET when we find a conflict or a >> > BAR that's not contained by an upstream bridge window, and we should >> > try to reallocate then. I'm pretty sure we do that at least in some >> > cases, but it would probably simplify things if we did it more >> > consistently, and maybe we shouldn't set it at all here in >> > __pci_read_base(). >> >> I think so because __pci_read_base() is called in device emulation >> path. > > Which path is this? I don't know anything about virtio-pci, and I only see > calls to __pci_read_base() from: > > sriov_init() > pci_sriov_resource_alignment() > pci_read_bases() > >> > But I'd like to understand your situation better, so can you provide >> > more details, please? Complete before/after dmesg logs would go a >> > long way toward illustrating the problem you're seeing. >> >> Please see the two attachment log. The memory allocation failure >> is caused by mistaken value read from pci address after the device >> is failed to enable. > > Your logs are harder than necessary to compare because one has a lot more > debug turned on than the other. > > In the failing case, we ignore all the initial BAR values, but we do assign > values to all of them later: > > pci 0000:00:00.0: can't claim BAR 0 [mem size 0x00000400]: no address assigned > pci 0000:00:00.0: can't claim BAR 1 [io size 0x0400]: no address assigned > ... > pci 0000:00:00.0: BAR 0: assigned [mem 0x40000000-0x400003ff] > pci 0000:00:00.0: BAR 1: assigned [io 0x1000-0x13ff] > ... > > The newly-assigned values look valid, and as far as I can tell, they should > work. Do you know why they don't? Is there an assumption somewhere that > we never change BAR values? I don't know the cause, maybe it is related with the hypervisor implementation. > > Even if we don't need to ignore BAR values in as many cases as we do, it > should be legal to ignore them and reassign them, so I want to understand > what's going on here before reverting this. > > Is there an easy way I can reproduce the problem on my own box? It is not quite difficult, you can build a lkvm following the README in below link and test -next tree on the small kvm hypervisor: https://github.com/penberg/linux-kvm/blob/master/tools/kvm/README Thanks, -- Ming Lei -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html