Re: [PATCH] nvme/pci: Sync controller reset for AER slot_reset

On 05/10/2018 02:14 PM, Keith Busch wrote:
> On Thu, May 10, 2018 at 01:56:56PM -0500, Alex G. wrote:
>>> @@ -2681,8 +2681,15 @@ static pci_ers_result_t nvme_slot_reset(struct pci_dev *pdev)
>>>  
>>>  	dev_info(dev->ctrl.device, "restart after slot reset\n");
>>>  	pci_restore_state(pdev);
>>> -	nvme_reset_ctrl(&dev->ctrl);
>>> -	return PCI_ERS_RESULT_RECOVERED;
>>> +	nvme_reset_ctrl_sync(&dev->ctrl);
>>
>> This does wonders when nvme_reset_ctrl_sync() returns in a timely
>> manner. I was also able to get the nvme drive in a state where
>> nvme_reset_ctrl_sync() does not return. Then we end up with the device
>> lock in report_slot_reset, which, as you may imagine, is not a great thing.
> 
> It never returns? That shouldn't happen. There are cases where it may take
> a very long time, depending on what the controller reports in CAP.TO. The
> only other case it may stall is if the controller never responds to the
> initialization admin commands, but that should delay by 60 seconds under
> default parameters.

It took 28 minutes before I gave up and rebooted the machine. Maybe I
should have waited 30.
Even 60 seconds seems like a terribly long time to wait in AER. Simple
things like block I/O and 'nvme list' hang in kernel space for that
entire time. I can raise a separate issue once I find a reliable way to
reproduce this.
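
For reference, here is a rough sketch of the upper bound that CAP.TO
places on a single controller-ready wait. It assumes the NVMe register
layout, in which CAP.TO is bits 31:24 of the 64-bit CAP register and
counts in 500 ms units. The helper names are hypothetical and are not
the driver's own functions.

```c
#include <stdint.h>

/* Hypothetical helper: extract CAP.TO (bits 31:24 of the CAP register). */
static inline uint32_t cap_timeout_field(uint64_t cap)
{
	return (uint32_t)((cap >> 24) & 0xff);
}

/*
 * Hypothetical helper: longest time, in milliseconds, the controller
 * says it may need for CSTS.RDY to change state. CAP.TO is in 500 ms
 * units, so the largest encodable value (0xff) is 127.5 seconds.
 */
static inline uint32_t cap_ready_timeout_ms(uint64_t cap)
{
	return cap_timeout_field(cap) * 500u;
}
```

By this arithmetic a single ready wait tops out at about 127.5 seconds
even for the largest CAP.TO value, so a stall measured in tens of
minutes would seem to need some other explanation.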

Alex


