Hi All,

I am trying kexec -e with the latest kernel, i.e. Linux 5.5.0-rc4. The
second kernel is not able to detect/mount the hard disk that holds the
root file system (INTEL SSDSC2BB240G7).

[ 279.690575] ata1: softreset failed (1st FIS failed)
[ 279.695446] ata1: limiting SATA link speed to 3.0 Gbps
[ 280.910020] ata1: SATA link down (SStatus 0 SControl 320)
[ 282.626018] ata1: SATA link down (SStatus 0 SControl 300)
[ 282.631409] ata1: link online but 1 devices misclassified, retrying
[ 282.637665] ata1: reset failed (errno=-11), retrying in 9 secs
[ 298.294546] ata1: failed to reset engine (errno=-5)
[ 302.042967] ata1: softreset failed (1st FIS failed)
[ 308.798609] ata1: failed to reset engine (errno=-5)
[ 337.546605] ata1: softreset failed (1st FIS failed)
[ 337.551475] ata1: limiting SATA link speed to 3.0 Gbps
[ 338.766022] ata1: SATA link down (SStatus 0 SControl 320)
[ 339.270943] ata1: EH pending after 5 tries, giving up

I found the following two workarounds for this issue.

A) Define ".shutdown" in drivers/ata/ahci.c.

reboot --> kernel_kexec() --> kernel_restart_prepare() -->
device_shutdown() --> pci_device_shutdown() --> ahci_shutdown_one()
(ahci_shutdown_one() is the new function)

diff --git a/drivers/ata/ahci.c b/drivers/ata/ahci.c
index 4bfd1b14b390..50a101002885 100644
--- a/drivers/ata/ahci.c
+++ b/drivers/ata/ahci.c
@@ -81,6 +81,7 @@ enum board_ids {
 static int ahci_init_one(struct pci_dev *pdev, const struct pci_device_id *ent);
 static void ahci_remove_one(struct pci_dev *dev);
+static void ahci_shutdown_one(struct pci_dev *dev);
 static int ahci_vt8251_hardreset(struct ata_link *link, unsigned int *class,
                                  unsigned long deadline);
 static int ahci_avn_hardreset(struct ata_link *link, unsigned int *class,
@@ -606,6 +607,7 @@ static struct pci_driver ahci_pci_driver = {
         .id_table = ahci_pci_tbl,
         .probe = ahci_init_one,
         .remove = ahci_remove_one,
+        .shutdown = ahci_shutdown_one,
         .driver = {
                 .pm = &ahci_pci_pm_ops,
         },
+static void ahci_shutdown_one(struct pci_dev *pdev)
+{
+        pm_runtime_get_noresume(&pdev->dev);
+        ata_pci_remove_one(pdev);
+}
+

Note: After defining .shutdown, errors related to file-system writes are
seen. It looks like file-system transactions still go to the disk even
after device_shutdown().

B) Comment out pci_clear_master() in pci_device_shutdown().

reboot --> kernel_kexec() --> kernel_restart_prepare() -->
device_shutdown() --> pci_device_shutdown()

diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index 0454ca0e4e3f..ddffaa9321bb 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -481,8 +481,10 @@ static void pci_device_shutdown(struct device *dev)
         /*
          * If this is a kexec reboot, turn off Bus Master bit on the
@@ -491,8 +493,16 @@ static void pci_device_shutdown(struct device *dev)
          * If it is not a kexec reboot, firmware will hit the PCI
          * devices with big hammer and stop their DMA any way.
          */
-        if (kexec_in_progress && (pci_dev->current_state <= PCI_D3hot))
-                pci_clear_master(pci_dev);

Here pci_dev->current_state is 0, i.e. D0.

From A and B:
It looks like even after pci_clear_master(), some DMA transactions are
still in progress on the PCIe device, which leaves the device in an
unstable state. I am not sure whether this is the actual reason, or how
to solve the problem.

Any help or guidance would help me move forward.

Thanks!!
--prabhakar (pk)
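
P.S. For anyone trying to reproduce this, below is a minimal user-space sketch for checking whether Bus Master is still enabled on the controller. pci_clear_master() clears the Bus Master Enable bit (bit 2) of the PCI Command register at config offset 0x04. The helper names, the SKETCH_ constants and the sysfs path mentioned afterwards are assumptions made for illustration; none of this is part of the patches above.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Local copies of two values from the PCI spec (the kernel calls them
 * PCI_COMMAND and PCI_COMMAND_MASTER). The SKETCH_ names are hypothetical.
 */
#define SKETCH_PCI_COMMAND        0x04  /* offset of the Command register */
#define SKETCH_PCI_COMMAND_MASTER 0x4   /* Bus Master Enable, bit 2 */

/* Return 1 if Bus Master Enable is set in a Command register value. */
int bus_master_enabled(uint16_t cmd)
{
    return (cmd & SKETCH_PCI_COMMAND_MASTER) != 0;
}

/*
 * Read the 16-bit little-endian Command register from a PCI config-space
 * file. Returns 0 on success and -1 if the file cannot be opened or read.
 */
int read_pci_command(const char *config_path, uint16_t *cmd)
{
    unsigned char buf[2];
    FILE *f = fopen(config_path, "rb");

    if (!f)
        return -1;
    if (fseek(f, SKETCH_PCI_COMMAND, SEEK_SET) != 0 ||
        fread(buf, 1, sizeof(buf), f) != sizeof(buf)) {
        fclose(f);
        return -1;
    }
    fclose(f);

    *cmd = (uint16_t)(buf[0] | (buf[1] << 8));
    return 0;
}
```

The idea would be to call read_pci_command() on the "config" file of the AHCI controller under /sys/bus/pci/devices/ (the device address is system-specific) and pass the result to bus_master_enabled(), once in the first kernel and once in the kexec'ed kernel, to compare the two states.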