Re: I/O conflict on Versalogic Tiger (VL-EPM-24)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, May 6, 2015 at 5:14 PM, Rafael J. Wysocki <rjw@xxxxxxxxxxxxx> wrote:
> On Wednesday, May 06, 2015 02:15:15 PM George McCollister wrote:
>> On Wed, May 6, 2015 at 10:51 AM, Bjorn Helgaas <bhelgaas@xxxxxxxxxx> wrote:
>> > [+cc Rafael]
>> >
>> > On Wed, May 6, 2015 at 9:47 AM, George McCollister
>> > <george.mccollister@xxxxxxxxx> wrote:
>> >> We're using Versalogic Tiger (VL-EPM-24) SBCs in embedded systems
>> >> running linux 3.2.x without any problems. Recently, when testing the
>> >> latest mainline kernel I found the system hard locked during boot.
>> >>
>> >> After some investigation I noticed that the kernel print time stamps
>> >> were bogus after one of the pcieports was enabled:
>> >> [    1.658879] io scheduler cfq registered (default)
>> >> [    1.663905] pcieport 0000:00:1c.0: enabling device (0004 -> 0007)
>> >> [    6.254134] Serial: 8250/16550 driver, 21 ports, IRQ sharing enabled
>> >> [    6.254134] 00:03: ttyS0 at I/O 0x3f8 (irq = 4, base_baud = 115200)
>> >> is a 16550A
>> >> [    6.254134] 00:04: ttyS1 at I/O 0x2f8 (irq = 0, base_baud = 115200)
>> >> is a 16550A
>> >> [    6.254134] 00:05: ttyS2 at I/O 0x3e8 (irq = 0, base_baud = 115200)
>> >> is a 16550A
>> >> [    6.254134] 00:06: ttyS4 at I/O 0x238 (irq = 0, base_baud = 115200)
>> >> is a 16550A
>> >>
>> >> I was surprised to find that the problem existed as far back as 3.11.
>> >> I checked to make sure we were using the latest BIOS and contacted the
>> >> vendor to see if they were aware of anyone else using recent versions
>> >> of the linux kernel. They stated that they were unaware of anyone
>> >> using recent kernel versions on this board and tired to convince me to
>> >> stick with an old version.
>> >>
>> >> I then git bisected to this commit:
>> >> ac212b6980d8d5eda705864fc5a8ecddc6d6eacc ACPI / processor: Use common
>> >> hotplug infrastructure
>> >>
>> >> After diffing the kernel output before and after this commit I noticed
>> >> that the I/O BAR assigned to the pcieport (same one as above) changed
>> >> from 0x1000 to 0x2000.
>> >>
>> >> @@ -191,13 +191,13 @@
>> >> Switching to clocksource acpi_pm
>> >> pci 0000:00:1c.0: BAR 9: assigned [mem 0x80000000-0x801fffff pref]
>> >> pci 0000:00:1c.1: BAR 9: assigned [mem 0x80200000-0x803fffff pref]
>> >> -pci 0000:00:1c.0: BAR 7: assigned [io  0x1000-0x1fff]
>> >> +pci 0000:00:1c.0: BAR 7: assigned [io  0x2000-0x2fff]
>> >> pci 0000:02:01.0: PCI bridge to [bus 03]
>> >> pci 0000:02:01.0:   bridge window [mem 0xdff00000-0xdfffffff]
>> >> pci 0000:01:00.0: PCI bridge to [bus 02-03]
>> >> pci 0000:01:00.0:   bridge window [mem 0xdff00000-0xdfffffff]
>> >> pci 0000:00:1c.0: PCI bridge to [bus 01-03]
>> >> -pci 0000:00:1c.0:   bridge window [io  0x1000-0x1fff]
>> >> +pci 0000:00:1c.0:   bridge window [io  0x2000-0x2fff]
>> >> pci 0000:00:1c.0:   bridge window [mem 0xdff00000-0xdfffffff]
>> >> pci 0000:00:1c.0:   bridge window [mem 0x80000000-0x801fffff pref]
>> >> pci 0000:00:1c.1: PCI bridge to [bus 04]
>> >>
>> >> I also noticed the kernel output 'ACPI: PM-Timer IO Port: 0x2008' and
>> >> made the connection that since acpi_pm was being used as the
>> >> clocksource and since the problems started when the BAR switched from
>> >> 0x1000 to 0x2000 an I/O conflict must be the source of the problems.
>> >>
>> >> I did some reading into ACPI (since my understanding of it was novice
>> >> at the time) and dumped the DSDT. I found no reference to anything in
>> >> the 0x2xxx I/O range though I did find the following in the FADT:
>> >> PM1a_EVT_BLK at 0x2000-0x2003
>> >> PM1a_CNT_BLK at 0x2004-0x2005
>> >> PM_TMR at 0x2008-0x200b
>> >>
>> >> I dumped the DSDT on other systems and found that some used PNP0C02 to
>> >> reserve I/O ranges used by the ACPI PM registers.
>> >>
>> >> I added the following to the Versalogic Tiger dsdt.dsl under the PCI
>> >> bus, compiled it and and compiled into the linux kernel:
>> >>             Device (PMIO)
>> >>             {
>> >>                 Name (_HID, EisaId ("PNP0C02") /* PNP Motherboard
>> >> Resources */)  // _HID: Hardware ID
>> >>                 Name (_UID, 0x09)  // _UID: Unique ID
>> >>                 Method (_CRS, 0, NotSerialized)  // _CRS: Current
>> >> Resource Settings
>> >>                 {
>> >>                     Name (BUF0, ResourceTemplate ()
>> >>                     {
>> >>                         IO (Decode16,
>> >>                             0x2000,            // Range Minimum
>> >>                             0x2000,            // Range Maximum
>> >>                             0x01,              // Alignment
>> >>                             0xC,               // Length
>> >>                         )
>> >>                         IO (Decode16,
>> >>                             0x20C0,            // Range Minimum
>> >>                             0x20C0,            // Range Maximum
>> >>                             0x01,              // Alignment
>> >>                             0x8,               // Length
>> >>                         )
>> >>                     })
>> >>                     Return (BUF0)
>> >>                 }
>> >>             }
>> >>
>> >> It booted just fine! (comment welcome on whether or not this looks
>> >> like the correct fix)
>> >>
>> >> Unfortunately even if I get the vendor to release a new BIOS with the
>> >> DSDT modifications, rolling out BIOS updates to thousands of systems
>> >> in the field will be nearly impossible. When we roll out a new kernel
>> >> to the production systems we'll need it to work with the existing
>> >> BIOS.
>> >>
>> >> I've been searching around the linux kernel for a way to apply a quirk
>> >> specific to this board.
>> >> I've found I can do something like the following and match the Poulsbo
>> >> Host bridge and that it'll fix the problem but I don't see a decent
>> >> way of restricting it to this board.
>> >>
>> >> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
>> >> index 85f247e..1f16dbf 100644
>> >> --- a/drivers/pci/quirks.c
>> >> +++ b/drivers/pci/quirks.c
>> >> @@ -413,6 +413,17 @@ static void quirk_ati_exploding_mce(struct pci_dev *dev)
>> >> DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI,
>> >> PCI_DEVICE_ID_ATI_RS100,   quirk_ati_exploding_mce);
>> >>
>> >> /*
>> >> + * Versa Logic Tiger
>> >> + */
>> >> +static void quirk_versa_logic_tiger(struct pci_dev *dev)
>> >> +{
>> >> +       dev_info(&dev->dev, "Versalogic Tiger, reserving I/O ports\n");
>> >> +       request_region(0x2000, 0x0C, "Versalogic Tiger");
>> >> +       request_region(0x20C0, 0x08, "Versalogic Tiger");
>> >> +}
>> >> +DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x8100, quirk_versa_logic_tiger);
>> >> +
>> >> +/*
>> >>
>> >> Any suggestions on what could be done to get a fix for this board
>> >> mainlined into the kernel? Should I give up hope and just apply a cute
>> >> embedded non-sense hack?
>> >
>> > I think your DSDT tweak is on the right track.  We have some similar
>> > things in drivers/pnp/quirks.c.  Possibly a new region could be added
>> > to an existing PNP0C02 device, maybe via dmi_check_system() to limit
>> > it to this platform.
>> >
>> > But I notice that board claims Windows compatibility, so I wonder if
>> > there's a smarter way.  I doubt that Windows would have a quirk for
>> > this specific board, so we should be able to make Linux work without a
>> > quirk, too.
>>
>> Maybe it works by accident, linux worked too until 0x2000 started
>> getting used for the I/O window. Though I'm not sure why it changed
>> just by looking at the commit.
>>
>
> The commit change the initialization ordering, especially if the ACPI processor
> driver is modular in your kernel.  Before it would register the processors on
> the module load and after the commit in question is registers them before
> enumerating the PCI bus.
>
>> > Complete dmesg logs from pre- and post-ac212b6980d8 might have a clue
>> > about what changed.  It looks like your FADT should be enough to
>> > reserve those regions via acpi_reserve_resources().  But maybe there's
>> > something wrong there, or maybe we incorrectly use that for PCI space.
>> > Does  your post-ac212b6980d8 /proc/ioports show those regions?
>> I've attached the the last good and first bad dmesg logs.
>> acpi_reserve_resources() doesn't do any good (at least in 4.1rc2)
>> because it's called after pcieport has already enabled the device and
>> put the i/o window into use.
>> I wasn't able to get ac212b6980d8 to boot without making DSDT or
>> quirks.c changes but I was able to get 4.1rc2 to boot by disabling
>> PCIEPORTBUS which keeps 0000:00:1c.0 from being enabled. I've attached
>> the ioports information. It looks like ACPI PM1a_EVT_BLK, ACPI
>> PM1a_CNT_BLK and ACPI PM_TMR show up at 2000-2003, 2004-2005 and
>> 2008-200b respectively but instead of conflicting with the PCI-bridge
>> window they show up underneath it.
>>
>> I think the key here is that the region needs to be requested before
>> bridge window is assigned.
>
> That's correct, so things don't happen in the right order now.
>
> Well, acpi_reserve_resources() is a device_initcall(), so its ordering with
> respect to other things is somewhat random.
>
> Does the patch below make any difference by any chance?
>

Yes, with this patch it boots with no problems.

>
> ---
>  drivers/acpi/osl.c |    6 ++----
>  1 file changed, 2 insertions(+), 4 deletions(-)
>
> Index: linux-pm/drivers/acpi/osl.c
> ===================================================================
> --- linux-pm.orig/drivers/acpi/osl.c
> +++ linux-pm/drivers/acpi/osl.c
> @@ -182,7 +182,7 @@ static void __init acpi_request_region (
>                 request_mem_region(addr, length, desc);
>  }
>
> -static int __init acpi_reserve_resources(void)
> +static void __init acpi_reserve_resources(void)
>  {
>         acpi_request_region(&acpi_gbl_FADT.xpm1a_event_block, acpi_gbl_FADT.pm1_event_length,
>                 "ACPI PM1a_EVT_BLK");
> @@ -211,10 +211,7 @@ static int __init acpi_reserve_resources
>         if (!(acpi_gbl_FADT.gpe1_block_length & 0x1))
>                 acpi_request_region(&acpi_gbl_FADT.xgpe1_block,
>                                acpi_gbl_FADT.gpe1_block_length, "ACPI GPE1_BLK");
> -
> -       return 0;
>  }
> -device_initcall(acpi_reserve_resources);
>
>  void acpi_os_printf(const char *fmt, ...)
>  {
> @@ -1845,6 +1842,7 @@ acpi_status __init acpi_os_initialize(vo
>
>  acpi_status __init acpi_os_initialize1(void)
>  {
> +       acpi_reserve_resources();
>         kacpid_wq = alloc_workqueue("kacpid", 0, 1);
>         kacpi_notify_wq = alloc_workqueue("kacpi_notify", 0, 1);
>         kacpi_hotplug_wq = alloc_ordered_workqueue("kacpi_hotplug", 0);
>
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [DMA Engine]     [Linux Coverity]     [Linux USB]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Greybus]

  Powered by Linux