Re: [PATCH 11/19] block: implement bio helper to add iter bvec pages to bio

Jens Axboe <axboe@xxxxxxxxx> · Tue, 26 Feb 2019 19:03:39 -0700

On 2/26/19 6:57 PM, Jens Axboe wrote:
> Speaking of this, I took a quick look at why we've now regressed a lot
> on IOPS perf with the multipage work. It looks like it's all related to
> the (much) fatter setup around iteration, which is related to this very
> topic too.
> 
> Basically setup of things like bio_for_each_bvec() and indexing through
> nth_page() is MUCH slower than before.
> 
> We need to do something about this, it's like tossing out months of
> optimizations.

See attached quick profile. This is doing 4k IOS, btw, so just single
segment requests. Yet we spend most of the time in blk_rq_map_sg(), and
blk_queue_split() is not far behind. Between just the two of them,
that's more than 8% of the total CPU utilization. As this test case is
running on just one CPU for all of it, that's a LOT of wasted time.

-- 
Jens Axboe

Attachment:
perf.jpg

Description: JPEG image