Re: PCI hotplug problems: how to debug?

On Mon, Nov 16, 2009 at 11:03:17AM -0800, Yinghai Lu wrote:
> On Mon, Nov 16, 2009 at 10:51 AM, Ira W. Snyder <iws@xxxxxxxxxxxxxxxx> wrote:
> > On Mon, Nov 16, 2009 at 11:32:43AM -0600, Bjorn Helgaas wrote:
> >> On Monday 16 November 2009 09:48:04 am Ira W. Snyder wrote:
> >> > On Fri, Nov 13, 2009 at 06:26:52PM -0600, Bjorn Helgaas wrote:
> >>
> >> > > If you post what you've got so far and the dmesg log, I can try to
> >> > > help debug it.
> >> > >
> >> >
> >> > Here it is, both patch and dmesg inlined. The dmesg log just keeps going
> >> > forever after the "--- [ cut here ] ---" line.
> >> >
> >> > I do not have a board plugged in behind the bridge at the moment. The
> >> > only devices are the onboard ethernet, vga, etc. That's why the second
> >> > host bridge has no memory behind it.
> >>
> >> The preemption imbalance concerns me.  If this patch causes that,
> >> I'm worried that directly updating bus->resource[n] is corrupting
> >> something.
> >>
> >
> > Yes, I wonder if that is the case. The pointers are not NULL, but are
> > they valid? How would I know?
> >
> >> Why didn't we find an I/O port window?  The BIOS configured I/O ports
> >> for devices, so there must be a window.  You found earlier that the
> >> I/O port aperture seemed to be described at 0xd0, so I'm curious why
> >> we didn't find it this time.
> >>
> >
> > I'm not sure, I added some extra debugging code to the patch, and it
> > seems that resources don't work for IO ports. See the relevant output
> > below. Things still crashed in exactly the same way.
> >
> >> There are two host bridges here, but the second has no devices below it.
> >> I suppose it's possible there's some sort of "transparent" mode where
> >> the first bridge forwards anything it sees, so the window description
> >> at 0xd0 doesn't really matter.  That would mean we couldn't support hot-
> >> adding devices under the second bridge unless we know how to program
> >> a real aperture in the first bridge and turn off that transparent mode.
> >> But this is just speculation.
> >>
> >
> > I unplugged all devices from my crate, which is why there is nothing
> > behind the second bridge. I can get a test device if necessary, but until
> > the kernel stops crashing, that seems unnecessary.
> >
> >> The two host bridges should have different buses below them.  Both of
> >> your bridges want to update resources for the bus at 0xf7883800, so
> >> that's not going to work.  If you arrange to have x86_pci_root_bus_res_quirks()
> >> called, do you see it called twice, with a different pci_bus each time?
> >>
> >
> > Should I add this call at the end of my quirk function? I'm very
> > unfamiliar with the PCI subsystem internals.
> >
> > Here is the revised output from the revised patch below. Note that even
> > though I fill in the IO resource with the IO ports, it gets printed as
> > NULL. Why would that be?
> >
> > [    0.304015] pci 0000:00:00.0: calling cnb20le_res+0x0/0x371
> > [    0.308014] pci 0000:00:00.0: CNB20LE: busses: 0 to 0
> > [    0.312014] pci 0000:00:00.0: CNB20LE: noPF 0xfc20 0xfeaf
> > [    0.316013] pci 0000:00:00.0: CNB20LE: noPF [mem 0xfc200000-0xfeafffff]
> > [    0.320014] pci 0000:00:00.0: CNB20LE: PF 0xfc00 0xfc0f
> > [    0.324012] pci 0000:00:00.0: CNB20LE: PF [mem 0xfc000000-0xfc0fffff pref]
> > [    0.328014] pci 0000:00:00.0: CNB20LE: IO 0xd000 0xdffc
> > [    0.332010] pci 0000:00:00.0: CNB20LE: IO (null)
> > [    0.336011] pci 0000:00:00.0: CNB20LE: parent bus: f7885800 number 0 pri 0 sec 0
> > [    0.340011] pci 0000:00:00.0: CNB20LE: parent res0: [mem 0xfc200000-0xfeafffff]
> > [    0.344011] pci 0000:00:00.0: CNB20LE: parent res1: [mem 0xfc000000-0xfc0fffff pref]
> > [    0.348010] pci 0000:00:00.0: CNB20LE: parent res2: (null)
> > [    0.352010] pci 0000:00:00.0: CNB20LE: parent res3: (null)
> > [    0.356009] pci 0000:00:00.0: CNB20LE: parent res4: (null)
> > [    0.360009] pci 0000:00:00.0: CNB20LE: parent res5: (null)
> > [    0.364074] pci 0000:00:00.0: calling quirk_resource_alignment+0x0/0x164
> > [    0.368090] pci 0000:00:00.1: found [1166:0009] class 000600 header type 00
> > [    0.372014] pci 0000:00:00.1: calling quirk_no_ata_d3+0x0/0x24
> > [    0.376013] pci 0000:00:00.1: calling acpi_pm_check_graylist+0x0/0x2d
> > [    0.380009] * The chipset may have PM-Timer Bug. Due to workarounds for a bug,
> > [    0.380014] * this clock source is slow. If you are sure your timer does not have
> > [    0.380018] * this bug, please use "acpi_pm_good" to disable the workaround
> > [    0.384013] pci 0000:00:00.1: calling cnb20le_res+0x0/0x371
> > [    0.388014] pci 0000:00:00.1: CNB20LE: busses: 1 to 1
> > [    0.392014] pci 0000:00:00.1: CNB20LE: noPF 0xfeb0 0xfebf
> > [    0.396010] pci 0000:00:00.1: CNB20LE: noPF (null)
> > [    0.400014] pci 0000:00:00.1: CNB20LE: PF 0xfc10 0xfc1f
> > [    0.404009] pci 0000:00:00.1: CNB20LE: PF (null)
> > [    0.408014] pci 0000:00:00.1: CNB20LE: IO 0xe000 0xeffc
> > [    0.412009] pci 0000:00:00.1: CNB20LE: IO (null)
> > [    0.416011] pci 0000:00:00.1: CNB20LE: parent bus: f7885800 number 0 pri 0 sec 0
> > [    0.420011] pci 0000:00:00.1: CNB20LE: parent res0: [mem 0xfc200000-0xfeafffff]
> > [    0.424011] pci 0000:00:00.1: CNB20LE: parent res1: [mem 0xfc000000-0xfc0fffff pref]
> > [    0.428010] pci 0000:00:00.1: CNB20LE: parent res2: (null)
> > [    0.432010] pci 0000:00:00.1: CNB20LE: parent res3: (null)
> > [    0.436009] pci 0000:00:00.1: CNB20LE: parent res4: (null)
> > [    0.440009] pci 0000:00:00.1: CNB20LE: parent res5: (null)
> >
> > And the revised patch, with debugging output added:
> >
> > From 8fbbf7f99e58a1c95580783a1ba7dd6ebdd45187 Mon Sep 17 00:00:00 2001
> > From: Ira W. Snyder <iws@xxxxxxxxxxxxxxxx>
> > Date: Mon, 16 Nov 2009 08:42:39 -0800
> > Subject: [PATCH] PCI: read memory ranges out of Broadcom CNB20LE host bridge
> >
> > Read the memory ranges behind the Broadcom CNB20LE host bridge out of the
> > hardware. This allows PCI hotplugging to work, since we know which memory
> > range to allocate PCI BAR's from.
> >
> > Signed-off-by: Ira W. Snyder <iws@xxxxxxxxxxxxxxxx>
> > ---
> >  arch/x86/pci/Makefile       |    1 +
> >  arch/x86/pci/broadcom_bus.c |   75 +++++++++++++++++++++++++++++++++++++++++++
> >  2 files changed, 76 insertions(+), 0 deletions(-)
> >  create mode 100644 arch/x86/pci/broadcom_bus.c
> >
> > diff --git a/arch/x86/pci/Makefile b/arch/x86/pci/Makefile
> > index d8a0a62..f762c05 100644
> > --- a/arch/x86/pci/Makefile
> > +++ b/arch/x86/pci/Makefile
> > @@ -15,6 +15,7 @@ obj-$(CONFIG_X86_NUMAQ)               += numaq_32.o
> >
> >  obj-y                          += common.o early.o
> >  obj-y                          += amd_bus.o
> > +obj-y                          += broadcom_bus.o
> >  obj-$(CONFIG_X86_64)           += intel_bus.o
> >
> >  ifeq ($(CONFIG_PCI_DEBUG),y)
> > diff --git a/arch/x86/pci/broadcom_bus.c b/arch/x86/pci/broadcom_bus.c
> > new file mode 100644
> > index 0000000..5d56e23
> > --- /dev/null
> > +++ b/arch/x86/pci/broadcom_bus.c
> > @@ -0,0 +1,75 @@
> > +/*
> > + * Read address ranges from a Broadcom CNB20LE Host Bridge
> > + *
> > + * Copyright (c) 2009 Ira W. Snyder <iws@xxxxxxxxxxxxxxxx>
> > + *
> > + * This file is licensed under the terms of the GNU General Public License
> > + * version 2. This program is licensed "as is" without any warranty of any
> > + * kind, whether express or implied.
> > + */
> > +
> > +#define DEBUG 1
> > +
> > +#include <linux/delay.h>
> > +#include <linux/dmi.h>
> > +#include <linux/pci.h>
> > +#include <linux/init.h>
> > +#include <asm/pci_x86.h>
> > +
> > +#include "bus_numa.h"
> > +
> > +static int res_num = 0;
> 
> why?
> 

So I know how many resources I've consumed so far. The same dev->bus
structure is given to me for both host bridges, so I wanted to keep track
of which dev->bus->resource[i] slots I'd already filled in.

Should I try to figure out which ones are still unused instead?
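
Something like the following is what I have in mind for "figure out which
ones are not used". This is only a rough userspace sketch: struct
mock_resource, NUM_SLOTS and first_free_slot() are made-up names for
illustration, not anything from the PCI core, and it assumes an unused slot
is simply a NULL pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-bus resource pointer array size. */
#define NUM_SLOTS 6

/* Hypothetical stand-in for struct resource; only the fields used here. */
struct mock_resource {
	unsigned long start;
	unsigned long end;
	unsigned long flags;
};

/*
 * Return the index of the first slot that is still NULL, or -1 if every
 * slot is already in use. This replaces a file-scope counter, so the
 * result does not depend on how many times the caller ran before.
 */
static int first_free_slot(struct mock_resource *slots[], int n)
{
	int i;

	assert(slots != NULL);

	for (i = 0; i < n; i++) {
		if (slots[i] == NULL)
			return i;
	}

	return -1;
}
```

With two slots already populated, first_free_slot() would hand back index 2,
and a full array would give -1 so the caller can bail out instead of
writing past the end.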

> > +
> > +static void __devinit cnb20le_res(struct pci_dev *dev)
> > +{
> > +       struct pci_bus *bus = dev->bus;
> > +       u16 word1, word2;
> > +       u8 fbus, lbus;
> > +
> > +       pci_read_config_byte(dev, 0x44, &fbus);
> > +       pci_read_config_byte(dev, 0x45, &lbus);
> > +       dev_dbg(&dev->dev, "CNB20LE: busses: %d to %d\n", fbus, lbus);
> > +
> > +       pci_read_config_word(dev, 0xc0, &word1);
> > +       pci_read_config_word(dev, 0xc2, &word2);
> > +       dev_dbg(&dev->dev, "CNB20LE: noPF 0x%.4x 0x%.4x\n", word1, word2);
> > +       if (word1 != word2) {
> > +               bus->resource[res_num]->start = (word1 << 16) | 0x0000;
> > +               bus->resource[res_num]->end = (word2 << 16) | 0xffff;
> > +               bus->resource[res_num]->flags = IORESOURCE_MEM;
> > +               dev_dbg(&dev->dev, "CNB20LE: noPF %pR\n", bus->resource[res_num]);
> > +               res_num++;
> > +       }
> > +
> > +       pci_read_config_word(dev, 0xc4, &word1);
> > +       pci_read_config_word(dev, 0xc6, &word2);
> > +       dev_dbg(&dev->dev, "CNB20LE: PF 0x%.4x 0x%.4x\n", word1, word2);
> > +       if (word1 != word2) {
> > +               bus->resource[res_num]->start = (word1 << 16) | 0x0000;
> > +               bus->resource[res_num]->end = (word2 << 16) | 0xffff;
> > +               bus->resource[res_num]->flags = IORESOURCE_MEM | IORESOURCE_PREFETCH;
> > +               dev_dbg(&dev->dev, "CNB20LE: PF %pR\n", bus->resource[res_num]);
> > +               res_num++;
> > +       }
> > +
> > +       pci_read_config_word(dev, 0xd0, &word1);
> > +       pci_read_config_word(dev, 0xd2, &word2);
> > +       dev_dbg(&dev->dev, "CNB20LE: IO 0x%.4x 0x%.4x\n", word1, word2);
> > +       if (word1 != word2) {
> > +               bus->resource[res_num]->start = word1;
> > +               bus->resource[res_num]->end = word2;
> > +               bus->resource[res_num]->flags = IORESOURCE_IO;
> > +               dev_dbg(&dev->dev, "CNB20LE: IO %pR\n", bus->resource[res_num]);
> > +               res_num++;
> > +       }
> > +
> > +       dev_dbg(&dev->dev, "CNB20LE: parent bus: %p number %d pri %d sec %d\n", bus, bus->number, bus->primary, bus->secondary);
> > +       dev_dbg(&dev->dev, "CNB20LE: parent res0: %pR\n", dev->bus->resource[0]);
> > +       dev_dbg(&dev->dev, "CNB20LE: parent res1: %pR\n", dev->bus->resource[1]);
> > +       dev_dbg(&dev->dev, "CNB20LE: parent res2: %pR\n", dev->bus->resource[2]);
> > +       dev_dbg(&dev->dev, "CNB20LE: parent res3: %pR\n", dev->bus->resource[3]);
> > +       dev_dbg(&dev->dev, "CNB20LE: parent res4: %pR\n", dev->bus->resource[4]);
> > +       dev_dbg(&dev->dev, "CNB20LE: parent res5: %pR\n", dev->bus->resource[5]);
> > +}
> > +
> > +DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_SERVERWORKS, PCI_DEVICE_ID_SERVERWORKS_LE, cnb20le_res);
> > --
> > 1.5.4.3
> >
> > Thanks, Ira
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-pci" in
> > the body of a message to majordomo@xxxxxxxxxxxxxxx
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> >
> 
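
For reference, the base/limit decoding used for the memory windows in the
patch above can be checked in userspace against the dmesg output. A minimal
sketch, assuming the two 16-bit registers hold the upper 16 bits of the
first and last address of the window, as the patch does; struct window and
decode_mem_window() are hypothetical names. Unlike the patch, it casts to a
32-bit unsigned type before shifting, so the shift never happens on a
promoted signed int.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for a decoded window; inclusive start and end. */
struct window {
	uint32_t start;
	uint32_t end;
};

/*
 * Build a memory window from a 16-bit base and limit register pair.
 * The registers supply address bits 31:16; the low 16 bits are zero for
 * the start and all ones for the end. A limit below the base is treated
 * as a caller error in this sketch.
 */
static struct window decode_mem_window(uint16_t base, uint16_t limit)
{
	struct window w;

	assert(base <= limit);

	w.start = (uint32_t)base << 16;
	w.end = ((uint32_t)limit << 16) | 0xffffu;

	return w;
}
```

Feeding it the values from the first bridge (0xfc20, 0xfeaf) gives
0xfc200000-0xfeafffff, which matches the noPF range printed in the log.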
