On Tue, May 14, 2013 at 5:01 PM, Eric W. Biederman <ebiederm at xmission.com> wrote:
>
> Yes this does seem to be all over the place, and memory corruption
> probably caused by ongoing-dma seems like a reasonable hypothesis.

Thank goodness it's not just me! :-)

>
> The easy first thing to try is to remove all of your kernel modules
> before you reboot with kexec. Not infrequently the module remove path
> is better tested than the device shutdown path.

I'm trying this now. In one panic, the pte referenced was
0x100010000000000, which sure looks a whole lot like someone wrote his
registers in there. It certainly doesn't look like a valid pte.

So far, unloading pata_acpi and pata_amd seems to have eliminated the
ACPI exception messages. I believe that this resets the device properly.

Unfortunately, it looks like lots of drivers don't implement the
pci_driver->shutdown call, so it would make sense that this is a
relatively widespread problem.

--dlloyd
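
P.S. For anyone who wants to check the pte value quoted above, here is a rough sketch of how it could be decoded. It assumes an x86-64 machine and the usual 4 KiB page-table entry layout (bit 0 = present, bits 12-51 = frame address, bits 52-62 ignored by hardware or used for protection keys, bit 63 = no-execute); the architecture is not stated in this thread, so treat that as an assumption. The helper names (set_bits, describe_pte) are made up for illustration.

```python
# Sketch only: decode a 64-bit page-table entry.
# ASSUMPTION: x86-64, 4 KiB page PTE layout. Not confirmed by the thread.

# Low flag bits and the no-execute bit of an x86-64 PTE.
PTE_FLAGS = {
    0: "present",
    1: "writable",
    2: "user",
    3: "write-through",
    4: "cache-disable",
    5: "accessed",
    6: "dirty",
    7: "PAT",
    8: "global",
    63: "no-execute",
}


def set_bits(value):
    """Return the positions of all set bits in a 64-bit value, lowest first."""
    return [bit for bit in range(64) if (value >> bit) & 1]


def describe_pte(pte):
    """Split a raw PTE value into the fields a reader would check first."""
    bits = set_bits(pte)
    return {
        "set_bits": bits,
        "present": bool(pte & 1),
        "flags": [PTE_FLAGS[b] for b in bits if b in PTE_FLAGS],
        # Bits 12-51 hold the physical frame address (up to the CPU's limit).
        "frame_bits": [b for b in bits if 12 <= b <= 51],
        # Bits 52-62 are ignored by hardware or used for protection keys.
        "high_bits": [b for b in bits if 52 <= b <= 62],
    }


if __name__ == "__main__":
    for key, val in describe_pte(0x100010000000000).items():
        print(f"{key}: {val}")
```

Under those assumptions, the value has only bits 40 and 56 set, with the present bit clear and no flag bits at all, which fits the impression that it is stray data and not something the kernel wrote as a pte.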