Eric W. Biederman wrote: > "Kok, Auke" <auke-jan.h.kok at intel.com> writes: > >> Ingo Molnar wrote: >>> * Kok, Auke <auke-jan.h.kok at intel.com> wrote: >>> >>>>>> BUG: at drivers/pci/msi.c:611 pci_enable_msi() >>>>> I would poke Eric Biederman(sp?) about this one. Maybe its even solved by >>>>> the MSI-enable-related patch he posted in the past 24-48 hours. >>>> I tried the 3-patch series "[PATCH 0/3] Basic msi bug fixes.." and they fix >>>> this problem for me. Were you expecting the OOPS in the first place? [...] >>> the bug was the warning message (a WARN_ON()) above - not an oops. So that >>> warning message is gone in your testing? >> yes. > > Sorry for the slow delay. I was out of town for my brothers wedding the last few > days. > > I wasn't exactly expecting the WARN_ON to trigger. What I fixed was > an inconsistency in handling our state bits. Fixing that > inconsistency appears to have fixed the e1000 usage scenario mostly by > accident. > > The basic issue is that pci_save_state saves the current msi state > along with other registers, and then the e1000 driver goes and > disables the msi irq after we have saved the irq state as on. > > My code notices that the msi irq was disabled before restore time, so > it skips the restore. However we now have a leak of the msi saved cap > because we are not freeing it. > > This leaves with some basic questions. > - Does it make sense for suspend/resume methods to request/free irqs? > - Does it make sense for suspend/resume methods to allocate/free msi irqs? > - Do we want pci_save/restore_cap to save/restore msi state? > > The path of least resistance is to just free the extra state and we > are good. I'm just not quite certain that is sane and it has been a > long day. we used to have a lengthy e1000_pci_save|restore_state in our code, which is now gone, so I'm all for that. A separate pci_save_pxie|msi(x)_state for every driver seems completely unnecessary. I can't think of a use case where saving+restoring everything hurts. That's what you want I presume. We currently free all irq's and msi before going into suspend in e1000, and I think that is probably a good thing, somehow I can think of bad things happening if we dont, but I admit that I haven't tried it without alloc/free. We do this in e100 as well and it works. Another motivation would be to leave this up to the driver: if the driver chooses to free/alloc interrupts because it makes sense, you probably would want to keep that choice available. Devices that don't need this can skip the alloc/free, but leave the choice open for others. hth Auke