Re: [PATCHv2 2/2] nvme: Complete all stuck requests

On 02/28/2017 05:57 PM, Keith Busch wrote:
> On Tue, Feb 28, 2017 at 08:42:19AM +0100, Artur Paszkiewicz wrote:
>>
>> I'm observing the same thing when hibernating during mdraid resync on
>> nvme - it hangs in blk_mq_freeze_queue_wait() after "Disabling non-boot
>> CPUs ...".
> 
> The patch guarantees forward progress for blk-mq's hot-cpu notifier on
> nvme request queues by failing all entered requests. It sounds like some
> part of your setup needs those requests to succeed in order to hibernate.
> 
> If your mdraid uses a stacking request_queue that submits retries while
> its request queue is entered, that may explain how you remain stuck
> at blk_mq_freeze_queue_wait.
> 
>> This patch did not help but when I put nvme_wait_freeze()
>> right after nvme_start_freeze() it appeared to be working. Maybe the
>> difference here is that requests are submitted from a non-freezable
>> kernel thread (md sync_thread)?
> 
> Wait freeze prior to quiescing the queue is ok when the controller is
> functioning, but it'd be impossible to complete a reset if the controller
> is in a failed or degraded state.
> 
> We probably want to give those requests a chance to succeed, and I think
> we'd need to be able to time out the freeze wait. Below are two patches
> I tested. Prior to these, the fio test would report IO errors from some
> of its jobs; no errors with these.

With these patches it works fine. I tested multiple iterations on 2
platforms and they were able to hibernate and resume without issues.

Thanks,
Artur


