Re: [PATCH] block: Bail out iteration functions upon SIGKILL.

Ming Lei <ming.lei@xxxxxxxxxx> · Wed, 13 Nov 2019 15:49:14 +0800

On Wed, Nov 13, 2019 at 07:11:36AM +0000, Damien Le Moal wrote:
> On 2019/11/13 15:55, Ming Lei wrote:
> > On Wed, Nov 13, 2019 at 01:54:14AM +0000, Damien Le Moal wrote:
> >> On 2019/11/12 23:48, Tetsuo Handa wrote:
> >> [...]
> >>>>> +static int blk_should_abort(struct bio *bio)
> >>>>> +{
> >>>>> +	int ret;
> >>>>> +
> >>>>> +	cond_resched();
> >>>>> +	if (!fatal_signal_pending(current))
> >>>>> +		return 0;
> >>>>> +	ret = submit_bio_wait(bio);
> >>>>
> >>>> This will change the behavior of __blkdev_issue_discard() to a sync IO
> >>>> execution instead of the current async execution since submit_bio_wait()
> >>>> call is the responsibility of the caller (e.g. blkdev_issue_discard()).
> >>>> Have you checked if users of __blkdev_issue_discard() are OK with that ?
> >>>> f2fs, ext4, xfs, dm and nvme use this function.
> >>>
> >>> I'm not sure...
> >>>
> >>>>
> >>>> Looking at f2fs, this does not look like it is going to work as expected
> >>>> since the bio setup, including end_io callback, is done after this
> >>>> function is called and a regular submit_bio() execution is being used.
> >>>
> >>> Then, just breaking the iteration like below?
> >>> nvmet_bdev_execute_write_zeroes() ignores -EINTR if "*biop = bio;" is done. Is that no problem?
> >>>
> >>> --- a/block/blk-lib.c
> >>> +++ b/block/blk-lib.c
> >>> @@ -7,6 +7,7 @@
> >>>  #include <linux/bio.h>
> >>>  #include <linux/blkdev.h>
> >>>  #include <linux/scatterlist.h>
> >>> +#include <linux/sched/signal.h>
> >>>  
> >>>  #include "blk.h"
> >>>  
> >>> @@ -30,6 +31,7 @@ int __blkdev_issue_discard(struct block_device *bdev, sector_t sector,
> >>>  	struct bio *bio = *biop;
> >>>  	unsigned int op;
> >>>  	sector_t bs_mask;
> >>> +	int ret = 0;
> >>>  
> >>>  	if (!q)
> >>>  		return -ENXIO;
> >>> @@ -76,10 +78,14 @@ int __blkdev_issue_discard(struct block_device *bdev, sector_t sector,
> >>>  		 * is disabled.
> >>>  		 */
> >>>  		cond_resched();
> >>> +		if (fatal_signal_pending(current)) {
> >>> +			ret = -EINTR;
> >>> +			break;
> >>> +		}
> >>>  	}
> >>>  
> >>>  	*biop = bio;
> >>> -	return 0;
> >>> +	return ret;
> >>
> >> This will leak a bio as blkdev_issue_discard() executes the bio only in
> >> the case "if (!ret && bio)". So that does not work as is, unless all
> >> callers of __blkdev_issue_discard() are also changed. Same problem for
> >> the other __blkdev_issue_xxx() functions.
> >>
> >> Looking more into this, if an error is returned here, no bio should be
> >> returned and we need to make sure that all started bios are also
> >> completed. So your helper blk_should_abort() did the right thing calling
> >> submit_bio_wait(). However, I Think it would be better to fail
> >> immediately the current loop bio instead of executing it and then
> >> reporting the -EINTR error, unconditionally, regardless of what the
> >> started bios completion status is.
> >>
> >> This could be done with the help of a function like this, very similar
> >> to submit_bio_wait().
> >>
> >> void bio_chain_end_wait(struct bio *bio)
> >> {
> >> 	DECLARE_COMPLETION_ONSTACK_MAP(done, bio->bi_disk->lockdep_map);
> >>
> >> 	bio->bi_private = &done;
> >> 	bio->bi_end_io = submit_bio_wait_endio;
> >> 	bio->bi_opf |= REQ_SYNC;
> >> 	bio_endio(bio);
> >> 	wait_for_completion_io(&done);
> >> }
> >>
> >> And then your helper function becomes something like this:
> >>
> >> static int blk_should_abort(struct bio *bio)
> >> {
> >> 	int ret;
> >>
> >> 	cond_resched();
> >> 	if (!fatal_signal_pending(current))
> >> 		return 0;
> >>
> >> 	if (bio_flagged(bio, BIO_CHAIN))
> >> 		bio_chain_end_wait(bio);
> >> 	bio_put(bio);
> >>
> >> 	return -EINTR;
> >> }
> >>
> >> Thoughts ?
> > 
> > DISCARD request can be quite big, and any sync bio submission may cause
> > serious performance regression.
> 
> Yes indeed. But if the bio issuing loop is interrupted with discard BIOs
> already issued, I do not think there is any other choice but to wait for
> their completion before returning.

Looks I miss the check on fatal_signal_pending(), then this approach
seems fine.

> 
> > Not mention blkdev_issue_discard() may be called in non-block context.
> 
> This loop is calling cond_resched(), which checks might_sleep(). So
> certainly this function can block, no ?

Indeed, looks I misunderstood it.

Thanks,
Ming