On Fri, 2006-11-24 at 23:34 -0800, Yong Lee wrote:
> Hi all,
>
> I’m hoping that someone out there can lend me a hand with a problem that we
> were seeing. I’m not very familiar with the acpi tool so please bear with
> me.
>
> We had an outage where we could not ssh into our web server and we had to do
> a reboot from our console to get things running again. It looks like an
> acpi problem and I’m trying to figure out what was going on. Was ACPI going
> crazy or was it trying to report a problem condition that we were not aware
> of.
>
> What we saw in the dmesg log was this :
>
> shpchp: Address64 -------- Resource unparsed
> shpchp: acpi_pciehprm:\_SB_.PCI0.PBLO OSHP fails=0x5
> shpchp: acpi_shpchprm: Slot sun(0) at s:b:d:f=0x00:04:1f:00
> shpchp: acpi_pciehprm:\_SB_.PCI0.PBLO OSHP fails=0x5
> shpchp: acpi_pciehprm:\_SB_.PCI0.PBLO OSHP fails=0x5
> shpchp: acpi_pciehprm:\_SB_.PCI0.PBLO OSHP fails=0x5
> shpchp: acpi_pciehprm:\_SB_.PCI0.PBLO OSHP fails=0x5
> shpchp: acpi_pciehprm:\_SB_.PCI0.PBLO OSHP fails=0x5
> shpchp: acpi_pciehprm:\_SB_.PCI0.PBLO OSHP fails=0x5
> shpchp: acpi_pciehprm:\_SB_.PCI0.PBLO OSHP fails=0x5
> shpchp: acpi_pciehprm:\_SB_.PCI0.VPR0 OSHP fails=0x5
> shpchp: acpi_pciehprm:\_SB_.PCI0.VPR0 OSHP fails=0x5
> shpchp: acpi_pciehprm:\_SB_.PCI0.VPR0 OSHP fails=0x5
> shpchp: acpi_pciehprm:\_SB_.PCI0.VPR0 OSHP fails=0x5
> shpchp: acpi_pciehprm:\_SB_.PCI0.VPR0 OSHP fails=0x5
> shpchp: acpi_pciehprm:\_SB_.PCI0.VPR0 OSHP fails=0x5
> shpchp: acpi_pciehprm:\_SB_.PCI0.VPR0 OSHP fails=0x5
> shpchp: acpi_pciehprm:\_SB_.PCI0.VPR0 OSHP fails=0x5
> shpchp: shpc_init : shpc_cap_offset == 0
> shpchp: shpc_init : shpc_cap_offset == 0
> shpchp: shpc_init : shpc_cap_offset == 0
> shpchp: shpc_init : shpc_cap_offset == 0
> shpchp: shpc_init : shpc_cap_offset == 0
> shpchp: shpc_init : shpc_cap_offset == 0
> shpchp: shpc_init : shpc_cap_offset == 0
> shpchp: shpc_init : shpc_cap_offset == 0
> shpchp: shpc_init : shpc_cap_offset == 0
> shpchp: Standard Hot Plug PCI Controller Driver version: 0.4
>
> During the time of the outage we saw from our router logs that the
> connection to the server was going up and down.
>
> There was a lot of other messages on the console but our sysadmin guy didn’t
> capture this.

Hmm, so that may not be the root cause of your problems?

>
> We’re running redhat linux 2.6.9-34.0.2.ELsmp on intel xeon processors.
> We have 2 intel nic cards : Intel Corporation 82541GI/PI Gigabit Ethernet
> Controller (rev 05)
>
> Any light you can shed on this problem would be great. Note that while the
> kacpid kernel thread is running the acpid daemon was shut off during this
> incident.

If the PCI hotplug module (shpchp, difficult to spell...) really causes
this, it might be a kernel or a BIOS bug.
If this is a production machine that has already been running for a while,
I would not risk a BIOS update or waste time on kernel compilations.
The best and simplest fix would be to move the module file shpchp.ko out of
the /lib/modules/xy/kernel/drivers/pci/hotplug/ directory, if you do not
urgently need PCI hotplug.

Hope that works...

    Thomas
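
To judge whether the shpchp messages are a one-time burst at module load or
something that keeps recurring, it can help to count them per distinct
message in a saved dmesg dump. The following is only a rough sketch, not part
of any ACPI or hotplug tooling: the function name summarize_shpchp and the
sample input are made up for illustration, and the last sample line is a
hypothetical unrelated message.

```python
from collections import Counter


def summarize_shpchp(dmesg_text):
    """Count how often each distinct shpchp message appears in dmesg text."""
    counts = Counter()
    for line in dmesg_text.splitlines():
        line = line.strip()
        if line.startswith("shpchp:"):
            # Drop the driver prefix so identical messages group together.
            counts[line[len("shpchp:"):].strip()] += 1
    return counts


# Small sample modelled on the log quoted above; the final line is a
# hypothetical non-shpchp message and should be ignored by the counter.
sample = "\n".join([
    r"shpchp: acpi_pciehprm:\_SB_.PCI0.PBLO OSHP fails=0x5",
    r"shpchp: acpi_pciehprm:\_SB_.PCI0.PBLO OSHP fails=0x5",
    r"shpchp: acpi_pciehprm:\_SB_.PCI0.VPR0 OSHP fails=0x5",
    "shpchp: shpc_init : shpc_cap_offset == 0",
    "other: unrelated message (hypothetical)",
])

if __name__ == "__main__":
    for message, count in summarize_shpchp(sample).most_common():
        print(count, message)
```

Running this over dumps taken at different times and comparing the counts
would show whether the messages keep accumulating; if they do not, they were
probably all emitted once when the module loaded.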