Re: [PATCH 1/4] block: rework bio splitting

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 8/27/24 02:37, Christoph Hellwig wrote:
> The current setup with bio_may_exceed_limit and __bio_split_to_limits
> is a bit of a mess.
> 
> Change it so that __bio_split_to_limits does all the work and is just
> a variant of bio_split_to_limits that returns nr_segs.  This is done
> by inlining it and instead have the various bio_split_* helpers directly
> submit the potentially split bios.
> 
> To support btrfs, the rw version has a lower level helper split out
> that just returns the offset to split.  This turns out to nicely clean
> up the btrfs flow as well.
> 
> Signed-off-by: Christoph Hellwig <hch@xxxxxx>
> ---
>  block/blk-merge.c   | 146 +++++++++++++++++---------------------------
>  block/blk-mq.c      |  11 ++--
>  block/blk.h         |  63 +++++++++++++------
>  fs/btrfs/bio.c      |  30 +++++----
>  include/linux/bio.h |   4 +-
>  5 files changed, 125 insertions(+), 129 deletions(-)
> 
> diff --git a/block/blk-merge.c b/block/blk-merge.c
> index de5281bcadc538..c7222c4685e060 100644
> --- a/block/blk-merge.c
> +++ b/block/blk-merge.c
> @@ -105,9 +105,33 @@ static unsigned int bio_allowed_max_sectors(const struct queue_limits *lim)
>  	return round_down(UINT_MAX, lim->logical_block_size) >> SECTOR_SHIFT;
>  }
>  
> -static struct bio *bio_split_discard(struct bio *bio,
> -				     const struct queue_limits *lim,
> -				     unsigned *nsegs, struct bio_set *bs)
> +static struct bio *bio_submit_split(struct bio *bio, int split_sectors)

Why not "unsigned int" for split_sectors ? That would avoid the need for the
first "if" of the function. Note that bio_split() also takes an int for the
sector count and also checks for < 0 count with a BUG_ON(). We can clean that up
too. BIOs sector count is unsigned int...

> +{
> +	if (unlikely(split_sectors < 0)) {
> +		bio->bi_status = errno_to_blk_status(split_sectors);
> +		bio_endio(bio);
> +		return NULL;
> +	}
> +
> +	if (split_sectors) {

May be the simple case should come first ? E.g.:

	if (!split_sectors)
		return bio;

But shouldn't this check be:

	if (split_sectors >= bio_sectors(bio))
		return bio;

?

[...]

> +/**
> + * __bio_split_to_limits - split a bio to fit the queue limits
> + * @bio:     bio to be split
> + * @lim:     queue limits to split based on
> + * @nr_segs: returns the number of segments in the returned bio
> + *
> + * Check if @bio needs splitting based on the queue limits, and if so split off
> + * a bio fitting the limits from the beginning of @bio and return it.  @bio is
> + * shortened to the remainder and re-submitted.
> + *
> + * The split bio is allocated from @q->bio_split, which is provided by the
> + * block layer.
> + */
> +static inline struct bio *__bio_split_to_limits(struct bio *bio,
> +		const struct queue_limits *lim, unsigned int *nr_segs)
>  {
>  	switch (bio_op(bio)) {
> +	default:
> +		if (bio_may_need_split(bio, lim))
> +			return bio_split_rw(bio, lim, nr_segs);
> +		*nr_segs = 1;
> +		return bio;

Wouldn't it be safer to move the if check inside bio_split_rw(), so that here we
can have:

		return bio_split_rw(bio, lim, nr_segs);

And at the top of bio_split_rw() add:

	if (!bio_may_need_split(bio, lim)) {
		*nr_segs = 1;
		return bio;
	}

so that bio_split_rw() always does the right thing ?

>  	case REQ_OP_DISCARD:
>  	case REQ_OP_SECURE_ERASE:
> +		return bio_split_discard(bio, lim, nr_segs);
>  	case REQ_OP_WRITE_ZEROES:
> -		return true; /* non-trivial splitting decisions */
> -	default:
> -		break;
> +		return bio_split_write_zeroes(bio, lim, nr_segs);
>  	}
> -
> -	/*
> -	 * All drivers must accept single-segments bios that are <= PAGE_SIZE.
> -	 * This is a quick and dirty check that relies on the fact that
> -	 * bi_io_vec[0] is always valid if a bio has data.  The check might
> -	 * lead to occasional false negatives when bios are cloned, but compared
> -	 * to the performance impact of cloned bios themselves the loop below
> -	 * doesn't matter anyway.
> -	 */
> -	return lim->chunk_sectors || bio->bi_vcnt != 1 ||
> -		bio->bi_io_vec->bv_len + bio->bi_io_vec->bv_offset > PAGE_SIZE;
>  }

-- 
Damien Le Moal
Western Digital Research





[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux