Hi,

A gentle ping...

Thanks,
--Nilay

On 2/16/24 18:07, Srimannarayana Murthy Maram wrote:
> Hi all,
>
> Tested the patch with upstream kernel "6.8.0-rc4".
>
> The issue was verified on IBM Power systems with a manual test case
> consisting of the following steps:
> 1. Note the nvme controllers for an nvme subsystem using
>    "nvme list-subsys".
> 2. Perform an nvme subsystem reset on each listed controller under the
>    nvme subsystem, one after the other, proceeding only after the
>    previous controller recovered successfully.
>
> Verified on a Power system with an NVMe device in both normal and
> multipath (2 paths) configurations.
>
> With the patch applied, the single controller (normal) and both
> controllers (multipath) listed under the nvme subsystem recovered
> successfully.
>
> Tested-by: Maram Srimannarayana Murthy <msmurthy@xxxxxxxxxxxxxxxxxx>
>
> Thank you,
> Maram Srimannarayana Murthy
> Sr. Test Engineer | IBM
>
> On 2/9/24 10:32, Nilay Shroff wrote:
>> If an nvme subsystem reset causes loss of communication to the nvme
>> adapter, then EEH could potentially recover the adapter. The loss of
>> communication to the adapter is detected only when the nvme driver
>> attempts to read an MMIO register.
>>
>> The nvme subsystem reset command writes 0x4E564D65 ("NVMe") to the
>> NSSR register and schedules an adapter reset. If the subsystem reset
>> causes loss of communication to the nvme adapter, then either an IO
>> timeout event or the adapter reset handler could detect it. If the IO
>> timeout event detects the loss of communication, then the EEH handler
>> is able to recover communication to the adapter. This was implemented
>> in commit 651438bb0af5 ("nvme-pci: Fix EEH failure on ppc"). However,
>> if the loss of adapter communication is detected in the nvme reset
>> work handler, then EEH is unable to successfully finish the adapter
>> recovery.
>>
>> This patch ensures that:
>>
>> - the nvme driver reset handler observes that the pci channel is
>>   offline after a failed MMIO read and avoids marking the controller
>>   state DEAD, thus giving the EEH handler a fair chance to recover
>>   the nvme adapter;
>>
>> - if the nvme controller is already in the RESETTING state and a pci
>>   channel frozen error is detected, then the nvme driver
>>   pci-error-handler code sends the correct error code
>>   (PCI_ERS_RESULT_NEED_RESET) back to the EEH handler so that the EEH
>>   handler can proceed with the pci slot reset.
>>
>> Signed-off-by: Nilay Shroff <nilay@xxxxxxxxxxxxx>
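
For anyone following along, the subsystem reset boils down to a single
MMIO write of the ASCII value "NVMe" to the NSSR register, followed by
scheduling the usual controller reset. Below is a rough sketch of that
path (paraphrased from nvme_reset_subsystem() in
drivers/nvme/host/nvme.h; details may differ by kernel version), which
nvme-cli's "nvme subsystem-reset /dev/nvmeX" reaches via the
NVME_IOCTL_SUBSYS_RESET ioctl, if I read the code right:

	/* Sketch only, not the verbatim kernel code. */
	static int nvme_subsys_reset_sketch(struct nvme_ctrl *ctrl)
	{
		int ret;

		if (!ctrl->subsystem)	/* controller must support NSSR */
			return -ENOTTY;

		/*
		 * Write "NVMe" (0x4E564D65) to NSSR; this resets every
		 * controller in the subsystem. The MMIO write itself is
		 * fire-and-forget: a dead link is noticed only on the
		 * next MMIO *read*, which is why detection can happen as
		 * late as nvme_wait_ready() inside the reset work, as the
		 * trace below shows.
		 */
		ret = ctrl->ops->reg_write32(ctrl, NVME_REG_NSSR, 0x4E564D65);
		if (ret)
			return ret;

		/* Schedule the adapter reset mentioned in the changelog. */
		return nvme_try_sched_reset(ctrl);
	}
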
>> [ 131.415601] EEH: Recovering PHB#40-PE#10000
>> [ 131.415619] EEH: PE location: N/A, PHB location: N/A
>> [ 131.415623] EEH: Frozen PHB#40-PE#10000 detected
>> [ 131.415627] EEH: Call Trace:
>> [ 131.415629] EEH: [c000000000051078] __eeh_send_failure_event+0x7c/0x15c
>> [ 131.415782] EEH: [c000000000049bdc] eeh_dev_check_failure.part.0+0x27c/0x6b0
>> [ 131.415789] EEH: [c000000000cb665c] nvme_pci_reg_read32+0x78/0x9c
>> [ 131.415802] EEH: [c000000000ca07f8] nvme_wait_ready+0xa8/0x18c
>> [ 131.415814] EEH: [c000000000cb7070] nvme_dev_disable+0x368/0x40c
>> [ 131.415823] EEH: [c000000000cb9970] nvme_reset_work+0x198/0x348
>> [ 131.415830] EEH: [c00000000017b76c] process_one_work+0x1f0/0x4f4
>> [ 131.415841] EEH: [c00000000017be2c] worker_thread+0x3bc/0x590
>> [ 131.415846] EEH: [c00000000018a46c] kthread+0x138/0x140
>> [ 131.415854] EEH: [c00000000000dd58] start_kernel_thread+0x14/0x18
>> [ 131.415864] EEH: This PCI device has failed 1 times in the last hour and will be permanently disabled after 5 failures.
>> [ 131.415874] EEH: Notify device drivers to shutdown
>> [ 131.415882] EEH: Beginning: 'error_detected(IO frozen)'
>> [ 131.415888] PCI 0040:01:00.0#10000: EEH: Invoking nvme->error_detected(IO frozen)
>> [ 131.415891] nvme nvme1: frozen state error detected, reset controller
>> [ 131.515358] nvme 0040:01:00.0: enabling device (0000 -> 0002)
>> [ 131.515778] nvme nvme1: Disabling device after reset failure: -19
>> [ 131.555336] PCI 0040:01:00.0#10000: EEH: nvme driver reports: 'disconnect'
>> [ 131.555343] EEH: Finished:'error_detected(IO frozen)' with aggregate recovery state:'disconnect'
>> [ 131.555371] EEH: Unable to recover from failure from PHB#40-PE#10000.
>> [ 131.555371] Please try reseating or replacing it
>> [ 131.556296] EEH: of node=0040:01:00.0
>> [ 131.556351] EEH: PCI device/vendor: 00251e0f
>> [ 131.556421] EEH: PCI cmd/status register: 00100142
>> [ 131.556428] EEH: PCI-E capabilities and status follow:
>> [ 131.556678] EEH: PCI-E 00: 0002b010 10008fe3 00002910 00436044
>> [ 131.556859] EEH: PCI-E 10: 10440000 00000000 00000000 00000000
>> [ 131.556869] EEH: PCI-E 20: 00000000
>> [ 131.556875] EEH: PCI-E AER capability register set follows:
>> [ 131.557115] EEH: PCI-E AER 00: 14820001 00000000 00400000 00462030
>> [ 131.557294] EEH: PCI-E AER 10: 00000000 0000e000 000002a0 00000000
>> [ 131.557469] EEH: PCI-E AER 20: 00000000 00000000 00000000 00000000
>> [ 131.557523] EEH: PCI-E AER 30: 00000000 00000000
>> [ 131.558807] EEH: Beginning: 'error_detected(permanent failure)'
>> [ 131.558815] PCI 0040:01:00.0#10000: EEH: Invoking nvme->error_detected(permanent failure)
>> [ 131.558818] nvme nvme1: failure state error detected, request disconnect
>> [ 131.558839] PCI 0040:01:00.0#10000: EEH: nvme driver reports: 'disconnect'
>> ---
>>  drivers/nvme/host/pci.c | 16 +++++++++++++---
>>  1 file changed, 13 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
>> index c1d6357ec98a..a6ba46e727ba 100644
>> --- a/drivers/nvme/host/pci.c
>> +++ b/drivers/nvme/host/pci.c
>> @@ -2776,6 +2776,14 @@ static void nvme_reset_work(struct work_struct *work)
>>   out_unlock:
>>  	mutex_unlock(&dev->shutdown_lock);
>>   out:
>> +	/*
>> +	 * If PCI recovery is ongoing then let it finish first
>> +	 */
>> +	if (pci_channel_offline(to_pci_dev(dev->dev))) {
>> +		dev_warn(dev->ctrl.device, "PCI recovery is ongoing so let it finish\n");
>> +		return;
>> +	}
>> +
>>  	/*
>>  	 * Set state to deleting now to avoid blocking nvme_wait_reset(), which
>>  	 * may be holding this pci_dev's device lock.
>> @@ -3295,9 +3303,11 @@ static pci_ers_result_t nvme_error_detected(struct pci_dev *pdev,
>>  	case pci_channel_io_frozen:
>>  		dev_warn(dev->ctrl.device,
>>  			"frozen state error detected, reset controller\n");
>> -		if (!nvme_change_ctrl_state(&dev->ctrl, NVME_CTRL_RESETTING)) {
>> -			nvme_dev_disable(dev, true);
>> -			return PCI_ERS_RESULT_DISCONNECT;
>> +		if (nvme_ctrl_state(&dev->ctrl) != NVME_CTRL_RESETTING) {
>> +			if (!nvme_change_ctrl_state(&dev->ctrl, NVME_CTRL_RESETTING)) {
>> +				nvme_dev_disable(dev, true);
>> +				return PCI_ERS_RESULT_DISCONNECT;
>> +			}
>>  		}
>>  		nvme_dev_disable(dev, false);
>>  		return PCI_ERS_RESULT_NEED_RESET;
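
One small note on the nvme_reset_work() hunk: pci_channel_offline() is
just a cheap test of the error state that the EEH/AER core records on
the pci_dev before invoking the driver's error handlers, so the early
return simply defers to the recovery that is already in flight. From
include/linux/pci.h:

	static inline int pci_channel_offline(struct pci_dev *pdev)
	{
		return (pdev->error_state != pci_channel_io_normal);
	}

So by the time nvme_reset_work() observes the failed MMIO read, this
check distinguishes "channel is being recovered" from "controller is
genuinely dead", and only the latter path goes on to mark the
controller DEAD.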