Re: [PATCH] block: advance by bvec's length for bio_for_each_bvec

On Thu, Feb 28, 2019 at 05:58:32AM -0800, Christoph Hellwig wrote:
> On Thu, Feb 28, 2019 at 11:24:21AM +0800, Ming Lei wrote:
> > bio_for_each_bvec is used in the fast path of bio splitting and sg mapping,
> > and what we want to do is to iterate over multi-page bvecs instead of pages.
> > However, bvec_iter_advance() is unaware of this requirement and
> > always advances by page size.
> > 
> > This isn't efficient for the multi-page bvec iterator; also, bvec_iter_len()
> > isn't as fast as mp_bvec_iter_len().
> > 
> > So advance by multi-page bvec's length instead of page size for bio_for_each_bvec().
> > 
> > More than 1% IOPS improvement can be observed in io_uring test on null_blk.
> 
> We've been there before, and I still insist that there is never a good
> reason to clamp the iteration to page size in bvec_iter_advance.
> Callers that need the clamping already do it themselves.
> 
> So here is a resurrection and rebase of my patch from back then to
> just do the right thing:
> 
> diff --git a/include/linux/bvec.h b/include/linux/bvec.h
> index 2c32e3e151a0..cf06c0647c4f 100644
> --- a/include/linux/bvec.h
> +++ b/include/linux/bvec.h
> @@ -112,14 +112,15 @@ static inline bool bvec_iter_advance(const struct bio_vec *bv,
>  	}
>  
>  	while (bytes) {
> -		unsigned iter_len = bvec_iter_len(bv, *iter);
> -		unsigned len = min(bytes, iter_len);
> +		const struct bio_vec *cur = bv + iter->bi_idx;
> +		unsigned len = min3(bytes, iter->bi_size,
> +				    cur->bv_len - iter->bi_bvec_done);
>  
>  		bytes -= len;
>  		iter->bi_size -= len;
>  		iter->bi_bvec_done += len;
>  
> -		if (iter->bi_bvec_done == __bvec_iter_bvec(bv, *iter)->bv_len) {
> +		if (iter->bi_bvec_done == cur->bv_len) {
>  			iter->bi_bvec_done = 0;
>  			iter->bi_idx++;
>  		}

Yeah, this change is the correct thing to do, and I guess there shouldn't
be a performance drop with this patch for Jens' test case.
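
To make the effect of the hunk above concrete, here is a minimal user-space
sketch of the patched loop. It is not the kernel code: struct bio_vec_model,
struct bvec_iter_model, min3u() and advance_model() are hypothetical stand-ins
that keep only the fields the loop touches, the steps counter is added purely
for illustration, and the bounds check before the loop is an assumption about
context that the quoted hunk does not show.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct bio_vec: only the length is modelled. */
struct bio_vec_model {
	unsigned int bv_len;		/* segment length in bytes, may span pages */
};

/* Hypothetical stand-in for struct bvec_iter. */
struct bvec_iter_model {
	unsigned int bi_size;		/* bytes left in the whole iteration */
	unsigned int bi_idx;		/* index of the current segment */
	unsigned int bi_bvec_done;	/* bytes already consumed in that segment */
};

/* Plain-C replacement for the kernel's min3() macro, unsigned only. */
static unsigned int min3u(unsigned int a, unsigned int b, unsigned int c)
{
	unsigned int m = a < b ? a : b;

	return m < c ? m : c;
}

/*
 * Advance the iterator by 'bytes', clamping each step to what is left of
 * the current segment rather than to a page. If 'steps' is non-NULL it is
 * incremented once per loop iteration (illustration only).
 */
static bool advance_model(const struct bio_vec_model *bv,
			  struct bvec_iter_model *iter,
			  unsigned int bytes, unsigned int *steps)
{
	/* Assumed context: refuse to advance past the end of the iterator. */
	if (bytes > iter->bi_size) {
		iter->bi_size = 0;
		return false;
	}

	while (bytes) {
		const struct bio_vec_model *cur = bv + iter->bi_idx;
		unsigned int len = min3u(bytes, iter->bi_size,
					 cur->bv_len - iter->bi_bvec_done);

		bytes -= len;
		iter->bi_size -= len;
		iter->bi_bvec_done += len;

		/* Segment fully consumed: move on to the next one. */
		if (iter->bi_bvec_done == cur->bv_len) {
			iter->bi_bvec_done = 0;
			iter->bi_idx++;
		}

		if (steps)
			(*steps)++;
	}

	return true;
}
```

With a 16 KiB segment followed by a 4 KiB one, advancing by 12 KiB takes a
single pass through the loop in this model. A page-clamped step would need
three passes for the same advance, assuming 4 KiB pages.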

Thanks,
Ming


