Re: [PATCH v7] block: disable iopoll for split bio

On 11/26/20 12:04 AM, Christoph Hellwig wrote:
> 
> On Wed, Nov 25, 2020 at 09:43:35PM +0800, Jeffle Xu wrote:
>> iopoll is initially intended for small, latency-sensitive IO. It doesn't
>> work well for big IO, especially when the IO needs to be split into
>> multiple bios. In that case, the cookie returned by
>> __submit_bio_noacct_mq() is actually the cookie of the last split bio.
>> The completion of *this* last split bio by iopoll doesn't mean the whole
>> original bio has completed. Callers of iopoll still need to wait for the
>> completion of the other split bios.
>>
>> Besides, bio splitting may cause more trouble for iopoll, which isn't
>> supposed to be used for big IO in the first place.
>>
>> iopoll for split bios may cause a race if CPU migration happens during
>> bio submission. Since the returned cookie is that of the last split bio,
>> polling on the corresponding hardware queue doesn't help complete the
>> other split bios if they were enqueued into different hardware queues.
>> Since interrupts are disabled for polling queues, the completion of
>> these other split bios then depends on the timeout mechanism, thus
>> causing a potential hang.
>>
>> iopoll for split bios may also cause a hang for sync polling. Currently
>> both the blkdev and the iomap-based filesystems (ext4/xfs, etc.) support
>> sync polling in their direct IO routines. These routines submit the bio
>> without the REQ_NOWAIT flag set and then start sync polling in the
>> current process context. The process may hang in blk_mq_get_tag() if the
>> submitted bio has to be split into multiple bios that rapidly exhaust
>> the queue depth. The process is then waiting for the completion of the
>> previously allocated requests, which should be reaped by the subsequent
>> polling that never gets to run, thus causing a deadlock.
>>
>> To avoid the subtle troubles described above, just disable iopoll for
>> split bios and return BLK_QC_T_NONE in this case. The side effect is
>> that non-HIPRI IO now also returns BLK_QC_T_NONE. That should be
>> acceptable, since the returned cookie is never used for non-HIPRI IO.
>>
>> Suggested-by: Ming Lei <ming.lei@xxxxxxxxxx>
>> Signed-off-by: Jeffle Xu <jefflexu@xxxxxxxxxxxxxxxxx>
>> ---
>>  block/blk-merge.c | 8 ++++++++
>>  block/blk-mq.c    | 6 +++++-
>>  2 files changed, 13 insertions(+), 1 deletion(-)
>>
>> diff --git a/block/blk-merge.c b/block/blk-merge.c
>> index bcf5e4580603..8a2f1fb7bb16 100644
>> --- a/block/blk-merge.c
>> +++ b/block/blk-merge.c
>> @@ -279,6 +279,14 @@ static struct bio *blk_bio_segment_split(struct request_queue *q,
>>  	return NULL;
>>  split:
>>  	*segs = nsegs;
>> +
>> +	/*
>> +	 * Bio splitting may cause subtle trouble such as a hang when doing sync
>> +	 * iopoll in the direct IO routine. Since the performance gain of iopoll
>> +	 * for big IO can be trivial, disable iopoll when a split is needed.
>> +	 */
>> +	bio->bi_opf &= ~REQ_HIPRI;
>> +
>>  	return bio_split(bio, sectors, GFP_NOIO, bs);
>>  }
>>  
>> diff --git a/block/blk-mq.c b/block/blk-mq.c
>> index 55bcee5dc032..4acd702575cd 100644
>> --- a/block/blk-mq.c
>> +++ b/block/blk-mq.c
>> @@ -2265,7 +2265,11 @@ blk_qc_t blk_mq_submit_bio(struct bio *bio)
>>  		blk_mq_sched_insert_request(rq, false, true, true);
>>  	}
>>  
>> -	return cookie;
>> +	if (bio->bi_opf & REQ_HIPRI)
>> +		return cookie;
>> +
>> +	return BLK_QC_T_NONE;
> 
> This can cause a use-after-free if the I/O completes very quickly.  I
> think we need to check the flag after the split, but before submitting
> the request to the hardware.
> 
Oops... Thanks, I'll send a new version later.
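
Something along these lines, I suppose (just an untested sketch of your
suggestion, with most of the function elided; the idea is to sample
REQ_HIPRI right after __blk_queue_split(), before the request is issued,
so the bio is never dereferenced again once it may have completed):

blk_qc_t blk_mq_submit_bio(struct bio *bio)
{
	unsigned int nr_segs;
	bool hipri;
	blk_qc_t cookie;
	/* ... other locals as before ... */

	blk_queue_bounce(q, &bio);
	__blk_queue_split(&bio, &nr_segs);

	/*
	 * blk_bio_segment_split() clears REQ_HIPRI when the bio gets
	 * split, so sample the flag here, before the request is sent to
	 * the hardware. Once issued, the bio may complete and be freed
	 * at any time and must not be touched again.
	 */
	hipri = bio->bi_opf & REQ_HIPRI;

	/* ... allocate and issue the request, computing 'cookie' ... */

	if (!hipri)
		return BLK_QC_T_NONE;
	return cookie;
}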

-- 
Thanks,
Jeffle


