On Mon, 9 Apr 2018 09:41:49 -0400, Jason Andryuk wrote: > On Sat, Apr 7, 2018 at 4:32 AM, Jean Delvare <jdelvare@xxxxxxx> wrote: > > Hi Jason, > > > > On Fri, 6 Apr 2018 11:18:02 -0400, Jason Andryuk wrote: > >> I dumped the data before load, during (while loaded) and after unload. > >> The ioports are all uninitialized 0xff before loading the module and > >> do not revert at unload. PCI config space also does not totally > >> revert. > >> > >> [root@localhost ~]# diff isadump.before isadump.during > >> 2,3c2,3 > >> < efa0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff > >> < efb0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff > > > > I guess the PCI device is not enabled at that point, so I/O resources > > are not mapped. > > Should the PCI device be disabled when the module is removed? It was the case at some point, but was found to cause problems on some systems, so we stopped doing that: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d6fcb3b9cf776e3f748709cd3091a72cb3855c29 When the driver was converted to managed resources, the behavior was preserved by calling pcim_pin_device() right after pcim_enable_device(): https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=fef220da43d8537ed7d11da4db17b958cd779887 You could try removing the call to pcim_pin_device() to see if it solves your problem, however my hopes are low because I can't see how this would be the problem when Host Notify is enabled and not be a problem when Host Notify is disabled. But well, if you have time, still worth a try I suppose. > >> --- > >> > efa0: 00 00 08 75 a7 00 00 00 00 44 00 00 1c 00 07 07 > >> > efb0: 00 01 00 00 58 00 00 00 00 00 00 00 00 00 00 00 > >> [root@localhost ~]# diff isadump.during isadump.after > >> 2c2 > >> < efa0: 00 00 08 75 a7 00 00 00 00 44 00 00 1c 00 07 07 > >> --- > >> > efa0: 40 00 08 75 a7 00 00 00 00 44 00 00 1c 00 07 07 > > > > Only difference is bit 6 of register 0x00, INUSE_STS. That's expected > > and not specifically related to the Host Notify feature. > > > >> [root@localhost ~]# diff lspci.before lspci.during > >> 3c3 > >> < 00: 86 80 23 9d 00 00 88 02 21 00 05 0c 00 00 00 00 > >> --- > >> > 00: 86 80 23 9d 03 00 80 02 21 00 05 0c 00 00 00 00 > > > > Hmm. I expected differences in the device specific configuration > > registers (0x40 and above.) > > > > Bits 0 and 1 of PCI Command Register get set: I/O space enabled, memory > > space enabled. This must be the result of pcim_enable_device(). > > Dmesg says "i801_smbus 0000:00:1f.4: enabling device (0000 -> 0003)" > which is pci_enable_resources writing to the command register. Yes, makes sense. > > Bit 11 of PCI Status Register gets cleared: Signaled Target Abort > > (STA). No idea what this is all about, my datasheet says it should > > always be 0. > > Should this be bit 3 of the PCI Status Register (Little Endian)? Doh, you're right, sorry. I must have swapped twice on my way to checking the datasheet... Bit 3 is Interrupt Status, that's a lot more interesting. > Which datasheet are you using, and where can I find it? Intel 100 Series Chipset Family PCH Datasheet, Vol. 2, as downloaded from: https://www.intel.fr/content/www/fr/fr/chipsets/100-series-chipset-datasheet-vol-2.html > >> [root@localhost ~]# diff lspci.during lspci.after > >> [root@localhost ~]# #No difference > >> > >> The full session of commands is below. I haven't analyzed any of > >> these values since I am unfamiliar with them. > >> > >> I hope this helps. > > > > Not really, I'm afraid. I hopped for something more obviously > > related to what you observe. > > > > For clarity, were the tests above done with or without > > disable_features=0x20? Does it make any difference if you change? > > The previous data was from running without disable_features, so Host > Notify was active. Below is the diff of data from a run with > disable_features=0x20. > > The isadump changed part of 0xefa2-0xefa4 and 0xefb0-0xefbf to 0s. > > [root@localhost ~]# diff isadump.before isadump.during > 2,3c2,3 > < efa0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff > < efb0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff > --- > > efa0: 00 00 00 00 00 00 00 00 00 44 00 00 1c 00 07 07 > > efb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0xefa2 is Host Control, 0xefa3 is Host Command, 0xefa4 is Transmit Slave Address. In the previous dump we could see a read from offset 0x75 of the SPD EEPROM at I2C address 0x53, which is the kind of thing the BIOS would do to configure the memory controller timings. It's not related to Host Notify. I wonder if this difference is reproducible. It could depend on something else, like cold boot vs. reboot, or entering the BIOS setup screen. 0xefb1 is Slave Command, 0xefb4 is Notify Device Address. Both are directly related to the Host Notify feature, so it is expected that they are no longer set with disable_features=0x20. Note that the Notify Device Address was set to I2C device at address 0x2C in the previous dump. So most likely, you DO have a device triggering Host Notify interrupts on your laptop. What is pretty interesting in your previous post, and I failed to notice before, is that bit 0 of register 0xefb1 was still set after the module was unloaded. The original register value is supposed to be restored in i801_disable_host_notify(), and the dump you just provided indicates that Host Notify is not enabled by the BIOS on your machine, so the value of this bit should have been saved, and then restored, to 0. This is not happening. I think it would be worth instrumenting the i2c-i801 driver to check what exactly is going on. Something like: --- drivers/i2c/busses/i2c-i801.c | 9 +++++++++ 1 file changed, 9 insertions(+) --- linux-4.16.orig/drivers/i2c/busses/i2c-i801.c 2018-04-09 14:54:05.179509697 +0200 +++ linux-4.16/drivers/i2c/busses/i2c-i801.c 2018-04-09 18:09:28.843867439 +0200 @@ -967,6 +967,8 @@ static void i801_enable_host_notify(stru return; priv->original_slvcmd = inb_p(SMBSLVCMD(priv)); + dev_info(&adapter->dev, "Original SMBSLVCMD value: %02X\n", + (unsigned int)priv->original_slvcmd); if (!(SMBSLVCMD_HST_NTFY_INTREN & priv->original_slvcmd)) outb_p(SMBSLVCMD_HST_NTFY_INTREN | priv->original_slvcmd, @@ -974,6 +976,9 @@ static void i801_enable_host_notify(stru /* clear Host Notify bit to allow a new notification */ outb_p(SMBSLVSTS_HST_NTFY_STS, SMBSLVSTS(priv)); + + dev_info(&adapter->dev, "SMBSLVCMD value after initialization: %02X\n", + (unsigned int)inb_p(SMBSLVCMD(priv))); } static void i801_disable_host_notify(struct i801_priv *priv) @@ -981,7 +986,11 @@ static void i801_disable_host_notify(str if (!(priv->features & FEATURE_HOST_NOTIFY)) return; + dev_info(&priv->adapter.dev, "SMBSLVCMD value at removal time: %02X\n", + (unsigned int)inb_p(SMBSLVCMD(priv))); outb_p(priv->original_slvcmd, SMBSLVCMD(priv)); + dev_info(&priv->adapter.dev, "SMBSLVCMD reset to: %02X\n", + (unsigned int)inb_p(SMBSLVCMD(priv))); } static const struct i2c_algorithm smbus_algorithm = { Could you try this with the Host Notify feature enabled, and report the values that are logged? > [root@localhost ~]# diff isadump.during isadump.after > 2c2 > < efa0: 00 00 00 00 00 00 00 00 00 44 00 00 1c 00 07 07 > --- > > efa0: 40 00 00 00 00 00 00 00 00 44 00 00 1c 00 07 07 > > The PCI config space not displayed in the diff output is identical to > the Host Notify run. > > [root@localhost ~]# diff lspci.before lspci.during > 2c2 > < 00: 86 80 23 9d 00 00 80 02 21 00 05 0c 00 00 00 00 > --- > > 00: 86 80 23 9d 03 00 80 02 21 00 05 0c 00 00 00 00 Interrupt Status bit isn't set here. It was set in the Host Notify enabled case. That's interesting. > [root@localhost ~]# diff lspci.during lspci.after > [root@localhost ~]# # No Difference -- Jean Delvare SUSE L3 Support