Re: [PATCH V4] block: optimize for small block size IO

On Mon, Nov 04, 2019 at 09:30:02PM -0500, Kent Overstreet wrote:
> On Tue, Nov 05, 2019 at 10:20:46AM +0800, Ming Lei wrote:
> > On Mon, Nov 04, 2019 at 09:11:30PM -0500, Kent Overstreet wrote:
> > > On Tue, Nov 05, 2019 at 09:11:35AM +0800, Ming Lei wrote:
> > > > On Mon, Nov 04, 2019 at 01:42:17PM -0500, Kent Overstreet wrote:
> > > > > On Mon, Nov 04, 2019 at 11:23:42AM -0700, Jens Axboe wrote:
> > > > > > On 11/4/19 11:17 AM, Kent Overstreet wrote:
> > > > > > > On Mon, Nov 04, 2019 at 10:15:41AM -0800, Christoph Hellwig wrote:
> > > > > > >> On Mon, Nov 04, 2019 at 01:14:03PM -0500, Kent Overstreet wrote:
> > > > > > >>> On Sat, Nov 02, 2019 at 03:29:11PM +0800, Ming Lei wrote:
> > > > > > > >>>> __blk_queue_split() may be a bit heavy for small block size (such as
> > > > > > > >>>> 512B or 4KB) IO, so introduce one flag to indicate whether the bio
> > > > > > > >>>> includes multiple pages, and only try splitting the bio when that
> > > > > > > >>>> multiple-page flag is set.
> > > > > > >>>
> > > > > > >>> So, back in the day I had an alternative approach in mind: get rid of
> > > > > > >>> blk_queue_split entirely, by pushing splitting down to the request layer - when
> > > > > > >>> we map the bio/request to sgl, just have it map as much as will fit in the sgl
> > > > > > >>> and if it doesn't entirely fit bump bi_remaining and leave it on the request
> > > > > > >>> queue.
> > > > > > >>>
> > > > > > >>> This would mean there'd be no need for counting segments at all, and would cut a
> > > > > > >>> fair amount of code out of the io path.
> > > > > > >>
> > > > > > > >> I thought about that too, but it will take a lot more effort.  Mostly
> > > > > > >> because md/dm heavily rely on splitting as well.  I still think it is
> > > > > > >> worthwhile, it will just take a significant amount of time and we
> > > > > > >> should have the quick improvement now.
> > > > > > > 
> > > > > > > We can do it one driver at a time - driver sets a flag to disable
> > > > > > > blk_queue_split(). Obvious one to do first would be nvme since that's where it
> > > > > > > shows up the most.
> > > > > > > 
> > > > > > > > And md/dm do splitting internally, but I'm not so sure they need
> > > > > > > blk_queue_split().
> > > > > > 
> > > > > > I'm a big proponent of doing something like that instead, but it is a
> > > > > > lot of work. I absolutely hate the splitting we're doing now, even
> > > > > > > though the original "let's work as hard as we can at add page time to get
> > > > > > things right" was pretty abysmal as well.
> > > > > 
> > > > > Last I looked I don't think it was going to be that bad, just needed a bit of
> > > > > finesse. We just need to be able to partially process a request in e.g.
> > > > > nvme_map_data(), and blk_rq_map_sg() needs to be modified to only map as much as
> > > > > will fit instead of popping an assertion.
> > > > 
> > > > I think it may not be doable.
> > > > 
> > > > > blk_rq_map_sg() is called by drivers and has to work on a single request;
> > > > > however, more requests have to be involved if we delay the splitting to
> > > > > blk_rq_map_sg(), because splitting means that the two bios can't be
> > > > > submitted in a single IO request.
> > > 
> > > Of course it's doable, do I have to show you how?
> > 
> > > No, you don't have to. Could you just point out where my words above are wrong?
> 
> > blk_rq_map_sg() _currently_ works on a single request, but as I said from the
> > start, this would involve changing it to only process as much of a request
> > as would fit on an sglist.

> Drivers will have to be modified, but the changes to driver code should be
> pretty easy. What will be slightly trickier will be changing blk-mq to handle
> requests that are only partially completed; that will be harder than it would
> have been before blk-mq, since the old request queue code used to handle
> partially completed requests - not much work would have to be done to that code.

It looks like you are suggesting partial request completion.

Then the biggest effect could be on performance: with this change the whole
FS bio is handled part by part, serially, instead of all the split bios
(parts) being submitted concurrently.

So it sounds like you are suggesting fixing one performance issue by
introducing a new one. Is that doable?
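
To make the concern concrete, here is a rough userspace toy model, not kernel
code. Every name in it (nr_chunks, latency_split_upfront,
latency_partial_completion) is hypothetical, and it assumes each chunk takes
the same time and that the device can run up to queue_depth chunks fully in
parallel, which real hardware may not do:

```c
#include <assert.h>

/* Toy model only: number of pieces a bio of total_segs segments breaks
 * into when at most max_segs segments fit in one sglist. */
unsigned int nr_chunks(unsigned int total_segs, unsigned int max_segs)
{
	assert(max_segs > 0);
	return (total_segs + max_segs - 1) / max_segs;
}

/* Split up front: all pieces are submitted at once, so they complete in
 * "rounds" bounded by how many the device can run in parallel
 * (queue_depth is an assumed, idealized parameter). */
unsigned int latency_split_upfront(unsigned int total_segs,
				   unsigned int max_segs,
				   unsigned int chunk_latency,
				   unsigned int queue_depth)
{
	unsigned int chunks = nr_chunks(total_segs, max_segs);
	unsigned int rounds;

	assert(queue_depth > 0);
	rounds = (chunks + queue_depth - 1) / queue_depth;
	return rounds * chunk_latency;
}

/* Partial completion: the next piece is only mapped and issued after the
 * previous one finishes, so the pieces are strictly serialized. */
unsigned int latency_partial_completion(unsigned int total_segs,
					unsigned int max_segs,
					unsigned int chunk_latency)
{
	return nr_chunks(total_segs, max_segs) * chunk_latency;
}
```

Under these assumptions, a 256-segment bio with a 128-segment limit becomes
two chunks: with a queue depth of 2 or more, splitting up front costs one
chunk latency, while partial completion costs two. Whether the gap looks like
that in practice depends on the device and the workload.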


Thanks,
Ming




