On Mon, Oct 17, 2022 at 10:04:26AM +0000, Chaitanya Kulkarni wrote:
> On 10/17/22 02:50, Ming Lei wrote:
> > On Mon, Oct 17, 2022 at 09:30:47AM +0000, Chaitanya Kulkarni wrote:
> >>
> >>>> +	/*
> >>>> +	 * Unblock any pending dispatch I/Os before we destroy the device.
> >>>> +	 * From null_destroy_dev()->del_gendisk() will set GD_DEAD flag
> >>>> +	 * causing any new I/O from __bio_queue_enter() to fail with -ENODEV.
> >>>> +	 */
> >>>> +	blk_mq_unquiesce_queue(nullb->q);
> >>>> +
> >>>> +	null_destroy_dev(nullb);
> >>>
> >>> destroying device is never good cleanup for handling timeout/abort, and it
> >>> should have been the last straw any time.
> >>>
> >>
> >> That is exactly why I've added the rq_abort_limit, so until the limit
> >> is not reached null_abort_work() will not get scheduled and device is
> >> not destroyed.
> >
> > I meant destroying device should only be done iff the normal abort handler
> > can't recover the device, however, your patch simply destroys device
> > without running any abort handling.
> >
>
> I did not understand your comment, can you please elaborate on exactly
> where and which abort handlers needs to be called in this patch before
> null_destroy_nullb() ?

In case of a request timeout, something may have gone wrong that needs to
be recovered.

>
> the objective of this patch it to simulate the teardown scenario
> from timeout handler so it can get tested on regular basis with
> null_blk ...

Why does the teardown scenario have to be triggered on timeout?

It looks like you think tearing down and destroying the device on timeout
is a normal and common way to handle it, but I don't think it is: the
device shouldn't be removed if it can still work.

I have received complaints of this kind, about a disk disappearing just
because of a request timeout, for example with nvme-pci.

thanks,
Ming