On 05/14/2015 10:48 AM, Tejun Heo wrote:
> Hello, Brian.
>
> On Thu, May 14, 2015 at 10:44:18AM -0500, Brian King wrote:
>> So, on the Power platform, the pci_error_handlers map to our EEH recovery.
>
> What's EEH?

It stands for "Extended Error Handling".

http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/Documentation/PCI/pci-error-recovery.txt
http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/Documentation/powerpc/eeh-pci-error-recovery.txt

>
>> In that case, without this patch, if we hit any sort of PCIe error, we
>> won't be able to recover and we'll lose all access to the ahci disks.
>> This could be the adapter trying to access an invalid DMA address due
>> to a transient hardware issue, or it could be due to a driver bug giving
>> the adapter an invalid address. It could also be other various PCIe
>> errors that cause our PCIe bridge chip to isolate the device and
>> place it into the EEH "frozen" state. When this occurs, if the driver
>> associated with the hardware does not have these handlers registered,
>> powerpc arch kernel code will hotplug remove the adapter, recover the
>> adapter, then hotplug add it back. This works OK for some devices,
>> but generally not so well for storage devices with mounted filesystems,
>> which would tend to go readonly in this case.
>
> I think the above, with more details on how the error handling
> actually works (IOW what it does), should be in the patch description
> and comments. Wen, can you please update the patch with more
> information?

Agreed.

Thanks,

Brian

--
Brian King
Power Linux I/O
IBM Linux Technology Center
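
For anyone following along, the two recovery paths described above can be sketched as a small user-space model. This is only an illustration, not kernel code and not the patch under discussion: every toy_* and demo_* name is hypothetical, and the three callbacks only loosely mirror the error_detected / slot_reset / resume hooks described in pci-error-recovery.txt. The assumption is simply that a driver with handlers gets a chance to recover in place, while a driver without them falls back to the hotplug remove/add path.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Toy stand-ins, NOT the kernel definitions. All names are hypothetical. */
enum toy_state   { TOY_IO_NORMAL, TOY_IO_FROZEN, TOY_IO_PERM_FAILURE };
enum toy_result  { TOY_RECOVERED, TOY_NEED_RESET, TOY_DISCONNECT };
enum toy_outcome { TOY_OUTCOME_IN_PLACE, TOY_OUTCOME_HOTPLUG, TOY_OUTCOME_LOST };

struct toy_dev {
    const char *name;
    int io_quiesced;            /* 1 while the driver has stopped issuing I/O */
};

/* Loosely modeled on the driver error callbacks; simplified to three hooks. */
struct toy_err_handlers {
    enum toy_result (*error_detected)(struct toy_dev *dev, enum toy_state state);
    enum toy_result (*slot_reset)(struct toy_dev *dev);
    void (*resume)(struct toy_dev *dev);
};

/* Example driver hooks: stop I/O, ask for a reset unless the device is dead. */
enum toy_result demo_error_detected(struct toy_dev *dev, enum toy_state state)
{
    dev->io_quiesced = 1;
    return state == TOY_IO_PERM_FAILURE ? TOY_DISCONNECT : TOY_NEED_RESET;
}

enum toy_result demo_slot_reset(struct toy_dev *dev)
{
    (void)dev;                  /* a real driver would re-initialize hardware here */
    return TOY_RECOVERED;
}

void demo_resume(struct toy_dev *dev)
{
    dev->io_quiesced = 0;       /* restart I/O */
}

const struct toy_err_handlers demo_handlers = {
    .error_detected = demo_error_detected,
    .slot_reset     = demo_slot_reset,
    .resume         = demo_resume,
};

/* Model of the platform side: choose a recovery path for a failed device. */
enum toy_outcome toy_recover(struct toy_dev *dev,
                             const struct toy_err_handlers *h,
                             enum toy_state state)
{
    assert(dev != NULL);

    /* No handlers registered: fall back to hotplug remove, recover, re-add. */
    if (h == NULL || h->error_detected == NULL) {
        printf("%s: no error handlers, hotplug remove/add\n", dev->name);
        return TOY_OUTCOME_HOTPLUG;
    }

    enum toy_result r = h->error_detected(dev, state);
    if (r == TOY_NEED_RESET && h->slot_reset != NULL)
        r = h->slot_reset(dev);

    if (r != TOY_RECOVERED) {
        printf("%s: recovery failed, device lost\n", dev->name);
        return TOY_OUTCOME_LOST;
    }

    if (h->resume != NULL)
        h->resume(dev);
    printf("%s: recovered in place\n", dev->name);
    return TOY_OUTCOME_IN_PLACE;
}
```

Calling toy_recover(&dev, NULL, TOY_IO_FROZEN) takes the hotplug path, which is the case that tends to leave mounted filesystems readonly. Passing &demo_handlers instead lets the device come back without being removed.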