Re: [PATCH v6] block: disable iopoll for split bio

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On 11/25/20 4:19 PM, Ming Lei wrote:
> On Wed, Nov 25, 2020 at 04:05:10PM +0800, JeffleXu wrote:
>>
>>
>> On 11/25/20 3:19 PM, Ming Lei wrote:
>>> On Wed, Nov 25, 2020 at 02:41:47PM +0800, Jeffle Xu wrote:
>>>> iopoll is initially for small size, latency sensitive IO. It doesn't
>>>> work well for big IO, especially when it needs to be split to multiple
>>>> bios. In this case, the returned cookie of __submit_bio_noacct_mq() is
>>>> indeed the cookie of the last split bio. The completion of *this* last
>>>> split bio done by iopoll doesn't mean the whole original bio has
>>>> completed. Callers of iopoll still need to wait for completion of other
>>>> split bios.
>>>>
>>>> Besides bio splitting may cause more trouble for iopoll which isn't
>>>> supposed to be used in case of big IO.
>>>>
>>>> iopoll for split bio may cause potential race if CPU migration happens
>>>> during bio submission. Since the returned cookie is that of the last
>>>> split bio, polling on the corresponding hardware queue doesn't help
>>>> complete other split bios, if these split bios are enqueued into
>>>> different hardware queues. Since interrupts are disabled for polling
>>>> queues, the completion of these other split bios depends on timeout
>>>> mechanism, thus causing a potential hang.
>>>>
>>>> iopoll for split bio may also cause hang for sync polling. Currently
>>>> both the blkdev and iomap-based fs (ext4/xfs, etc) support sync polling
>>>> in direct IO routine. These routines will submit bio without REQ_NOWAIT
>>>> flag set, and then start sync polling in current process context. The
>>>> process may hang in blk_mq_get_tag() if the submitted bio has to be
>>>> split into multiple bios and can rapidly exhaust the queue depth. The
>>>> process are waiting for the completion of the previously allocated
>>>> requests, which should be reaped by the following polling, and thus
>>>> causing a deadlock.
>>>>
>>>> To avoid these subtle trouble described above, just disable iopoll for
>>>> split bio.
>>>>
>>>> Suggested-by: Ming Lei <ming.lei@xxxxxxxxxx>
>>>> Signed-off-by: Jeffle Xu <jefflexu@xxxxxxxxxxxxxxxxx>
>>>> Reviewed-by: Christoph Hellwig <hch@xxxxxx>
>>>> ---
>>>>  block/bio.c               |  2 ++
>>>>  block/blk-merge.c         | 12 ++++++++++++
>>>>  block/blk-mq.c            |  3 +++
>>>>  include/linux/blk_types.h |  1 +
>>>>  4 files changed, 18 insertions(+)
>>>>
>>>> diff --git a/block/bio.c b/block/bio.c
>>>> index fa01bef35bb1..7f7ddc22a30d 100644
>>>> --- a/block/bio.c
>>>> +++ b/block/bio.c
>>>> @@ -684,6 +684,8 @@ void __bio_clone_fast(struct bio *bio, struct bio *bio_src)
>>>>  	bio_set_flag(bio, BIO_CLONED);
>>>>  	if (bio_flagged(bio_src, BIO_THROTTLED))
>>>>  		bio_set_flag(bio, BIO_THROTTLED);
>>>> +	if (bio_flagged(bio_src, BIO_SPLIT))
>>>> +		bio_set_flag(bio, BIO_SPLIT);
>>>>  	bio->bi_opf = bio_src->bi_opf;
>>>>  	bio->bi_ioprio = bio_src->bi_ioprio;
>>>>  	bio->bi_write_hint = bio_src->bi_write_hint;
>>>> diff --git a/block/blk-merge.c b/block/blk-merge.c
>>>> index bcf5e4580603..a2890cebf99f 100644
>>>> --- a/block/blk-merge.c
>>>> +++ b/block/blk-merge.c
>>>> @@ -279,6 +279,18 @@ static struct bio *blk_bio_segment_split(struct request_queue *q,
>>>>  	return NULL;
>>>>  split:
>>>>  	*segs = nsegs;
>>>> +
>>>> +	/*
>>>> +	 * Bio splitting may cause subtle trouble such as hang when doing sync
>>>> +	 * iopoll in direct IO routine. Given performance gain of iopoll for
>>>> +	 * big IO can be trival, disable iopoll when split needed. We need
>>>> +	 * BIO_SPLIT to identify bios need this workaround. Since currently
>>>> +	 * only normal IO under mq routine may suffer this issue, BIO_SPLIT is
>>>> +	 * only marked here.
>>>> +	 */
>>>> +	bio->bi_opf &= ~REQ_HIPRI;
>>>> +	bio_set_flag(bio, BIO_SPLIT);
>>>> +
>>>>  	return bio_split(bio, sectors, GFP_NOIO, bs);
>>>>  }
>>>>  
>>>> diff --git a/block/blk-mq.c b/block/blk-mq.c
>>>> index 55bcee5dc032..ce1f3628e4c2 100644
>>>> --- a/block/blk-mq.c
>>>> +++ b/block/blk-mq.c
>>>> @@ -2265,6 +2265,9 @@ blk_qc_t blk_mq_submit_bio(struct bio *bio)
>>>>  		blk_mq_sched_insert_request(rq, false, true, true);
>>>>  	}
>>>>  
>>>> +	if (bio_flagged(bio, BIO_SPLIT))
>>>> +		return BLK_QC_T_NONE;
>>>> +
>>>
>>> Not sure the new bio flag is really required for this case, just wondering
>>> why not take the following simple way? BTW we are really going to run
>>> out of bio flag.
>>>
>>
>> Please consider the following case:
>>
>> One big bio got split into two split bios. At the first call of
>> blk_mq_submit_bio(), the input @bio (actually the original big bio)
>> indeed gets split. The split bio gets enqueued to hw queue and the
>> returned cookie is BLK_QC_T_NONE, while the remained bio gets buffered
>> in bio_list. So far so good.
> 
> When this bio gets splitted, REQ_HIPRI is cleared for this bio, and
> all splitted bios won't set this flag too.
> 
That's true, at least in terms of my patch.

>>
>> Then when calling blk_mq_submit_bio() the second time, the input @bio is
>> indeed the remained bio. At this time, it will not get split and you
>> will get a *valid* cookie. And since the cookie of last split bio will
>> actually overrides the previous cookie, you will get a *valid* cookie as
>> a result.
> 
> Then valid cookie can be returned only for bio with REQ_HIPRI.
Sounds reasonable. The side effect is that original IO without REQ_HIPRI
now also returns BLK_QC_T_NONE. But it may not be a big problem, since
the returned cookie is not used at all for IO without REQ_HIPRI.

-- 
Thanks,
Jeffle



[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux