Jean Delvare wrote:
> On Wednesday 21 October 2009, Alexander Huemer wrote:
>
>> Jean Delvare wrote:
>>
>>> OK, here I am, sorry for the delay. I've read the discussion thread.
>>> Here are the few data points I can offer, in the hope it will help:
>>>
>>> * While the i2c-i801 driver received some changes in kernel 2.6.30,
>>>   none of these are related to PCI nor interrupts. So as the problem
>>>   is new in kernel 2.6.30, the i2c-i801 driver alone is unlikely to
>>>   cause it. This may, however, be a combination of something i2c-i801
>>>   does and something the pci subsystem does since kernel 2.6.30. For
>>>   this reason, I would still recommend a bisection if the problem can
>>>   be reliably reproduced. I know it takes time, but it is always
>>>   easier to fix a bug when we know which commit introduced it.
>>>
>>> * The i2c-i801 driver does _not_ make use of interrupts. It is
>>>   poll-based (I am not exactly proud of that, but that's the way it
>>>   is.)
>>>
>>>   #define ENABLE_INT9 0 /* set to 0x01 to enable - untested */
>>>
>>>   So I am very surprised to read that this driver would cause an IRQ
>>>   storm.
>>>
>>> * One thing the i2c-i801 driver does on the PCI device is:
>>>
>>>   err = pci_enable_device(dev);
>>>
>>>   I presume this is what causes the following message in dmesg:
>>>
>>>   i801_smbus 0000:00:1f.3: PCI INT B -> GSI 23 (level, low) -> IRQ 23
>>>
>>>   Basically, even though the driver doesn't make use of interrupts,
>>>   the IRQ is still registered because this is how the hardware is
>>>   set up.
>>>
>>> As a conclusion, I suspect that 2 things may be happening: either
>>> the SMBus is triggering interrupts when told not to. The ICH6 is a
>>> bit different from all the other supported chips, I'll double check
>
> My bad, it's an 63xxESB-based board, not ICH6. I must have been
> mixing data from a different bug.
>
>>> if we may have missed something. Or, something else is triggering
>>> SMBus transactions. SMI and ACPI come to mind.
>>> If this is the case
>>> then you do not want to use i2c-i801 on this motherboard.
>>>
>>> Questions to Alexander:
>>>
>>> * Can I please see the output of "sensors" on your system?
>>> * What are the brand and model of your motherboard?
>>> * Can we get an acpidump for your system?
>>>
>> many thanks for your response. i appreciate that.
>> first, the data you requested:
>>
>> sensors: http://xx.vu/~ahuemer/sensors-ahuemer-20091021.txt
>> acpidump: http://xx.vu/~ahuemer/acpidump-ahuemer-20091021.txt
>
> The good news is that I can't see any access to the SMBus in the
> ACPI tables. Nothing can be said about the SMIs though, without an
> intimate knowledge of the BIOS.
>
>> motherboard: tyan tempest i5400pw/s5397 with one intel xeon e5420.
>>
>> the output of sensors was made _without_ i801_smbus in the kernel.
>
> Then please once again with it. My whole point was to know whether
> there was any hardware monitoring chip connected to the SMBus. Your
> initial kernel configuration suggests that you have a W83793G chip
> there.
>
>> i noticed that the data of w83627hf-isa-0290 is quite weird. i do not
>> have an explanation for that.
>
> I do. This happens when the manufacturer decides that the hardware
> monitoring features of the Super-I/O are insufficient for their
> needs. They add a dedicated chip for the hardware monitoring. This
> is particularly frequent on server boards from Tyan and SuperMicro.
> Ideally they would _also_ disable the feature on the Super-I/O side,
> but often they do not, so the driver still loads, but outputs
> garbage.
>
> You can see the following messages in your log:
> [ 3.878703] w83627hf w83627hf.656: Enabling temp2, readings might not make sense
> [ 3.881708] w83627hf w83627hf.656: Enabling temp3, readings might not make sense
> This is a good hint that this is the case (if the nonsensical data
> displayed by "sensors" wasn't enough to convince you.)
>
> So you should stop loading/including kernel module w83627hf.
>
>> if a bisection is what will bring light into this, i am willing to take
>> the time.
>> so that would be a bisection between 2.6.29 and 2.6.30 ?
>> a quicker test case would be good for that, but i don't have one yet,
>> just the compilation of gcc, which takes time, even on this machine with
>> tmpfs and ccache.
>

here is the output you requested:
http://xx.vu/~ahuemer/sensors_ahuemer_with_i801_20091026.txt

i am currently in the middle of a bisection between 2.6.29 and 2.6.30,
8 steps left.
many thanks for the info on hardware monitoring.
i'll report back when bisection is finished.

regards
-alex