Re: [PATCHv3 2/2] nvme: cancel requests for real

Ming Lei <tom.leiming@xxxxxxxxx> · Sat, 30 May 2020 06:46:25 +0800

On Sat, May 30, 2020 at 6:32 AM Keith Busch <kbusch@xxxxxxxxxx> wrote:
>
> On Sat, May 30, 2020 at 06:23:08AM +0800, Ming Lei wrote:
> > On Fri, May 29, 2020 at 9:22 PM Keith Busch <kbusch@xxxxxxxxxx> wrote:
> > > seconds. Your series will reset that broken controller indefinitely.
> > > Which of those options is better?
> >
> > Removing the controller is very horrible, because it basically becomes
> > a brick, together with data loss. And we should retry enough before
> > killing the controller.
> >
> > My series doesn't reset indefinitely, since every request is retried
> > only a limited number of times (default is 5). At least a chance should
> > be given to retry IO requests the claimed number of times.
>
> Once the 5th retry is abandoned for all IO in the scheduled scan_work,
> the reset will succeed and schedule scan_work, which will revalidate
> disks, which will send new IO, which will timeout, then reset and
> repeat...

First, we can easily recognize this situation during reset and give up
after retrying the claimed number of times. I will do that in V2.
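
Something along these lines, as a rough user-space sketch only. The
struct, the function name and the limit of 5 failed resets are all made up
for illustration; this is not the actual nvme driver code or the V2 patch:

```c
#include <stdbool.h>

/* Hypothetical limit; chosen here to mirror the default retry count of 5. */
#define MAX_FAILED_RESETS 5

/* Hypothetical per-controller state, not a real nvme structure. */
struct ctrl_state {
	/* consecutive resets after which IO timed out again */
	unsigned int failed_resets;
};

/*
 * Call once after each reset attempt. io_ok tells whether IO completed
 * after that reset. Returns true if another reset may be tried, false
 * if the limit has been reached and we should give up.
 */
static bool reset_should_retry(struct ctrl_state *s, bool io_ok)
{
	if (io_ok) {
		/* Controller recovered: forget earlier failures. */
		s->failed_resets = 0;
		return true;
	}

	if (++s->failed_resets >= MAX_FAILED_RESETS)
		return false;

	return true;
}
```

The counter is cleared on any successful IO, so only back-to-back failed
resets count toward the limit.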

Second, I am not sure revalidation will send new IO, since all previous IOs
have failed.

Thanks,
Ming Lei