On Tue, Nov 17, 2020 at 03:56:24PM +0800, Jeffle Xu wrote:
> iopoll is initially for small size, latency sensitive IO. It doesn't
> work well for big IO, especially when it needs to be split to multiple
> bios. In this case, the returned cookie of __submit_bio_noacct_mq() is
> indeed the cookie of the last split bio. The completion of *this* last
> split bio done by iopoll doesn't mean the whole original bio has
> completed. Callers of iopoll still need to wait for completion of other
> split bios.
>
> Besides bio splitting may cause more trouble for iopoll which isn't
> supposed to be used in case of big IO.
>
> iopoll for split bio may cause potential race if CPU migration happens
> during bio submission. Since the returned cookie is that of the last
> split bio, polling on the corresponding hardware queue doesn't help
> complete other split bios, if these split bios are enqueued into
> different hardware queues. Since interrupts are disabled for polling
> queues, the completion of these other split bios depends on timeout
> mechanism, thus causing a potential hang.
>
> iopoll for split bio may also cause hang for sync polling. Currently
> both the blkdev and iomap-based fs (ext4/xfs, etc) support sync polling
> in direct IO routine. These routines will submit bio without REQ_NOWAIT
> flag set, and then start sync polling in current process context. The
> process may hang in blk_mq_get_tag() if the submitted bio has to be
> split into multiple bios and can rapidly exhaust the queue depth. The
> process are waiting for the completion of the previously allocated
> requests, which should be reaped by the following polling, and thus
> causing a deadlock.
>
> To avoid these subtle trouble described above, just disable iopoll for
> split bio.
>
> Suggested-by: Ming Lei <ming.lei@xxxxxxxxxx>
> Signed-off-by: Jeffle Xu <jefflexu@xxxxxxxxxxxxxxxxx>
> ---
>  block/blk-merge.c | 7 +++++++
>  block/blk-mq.c    | 6 ++++--
>  2 files changed, 11 insertions(+), 2 deletions(-)
>
> diff --git a/block/blk-merge.c b/block/blk-merge.c
> index bcf5e4580603..53ad781917a2 100644
> --- a/block/blk-merge.c
> +++ b/block/blk-merge.c
> @@ -279,6 +279,13 @@ static struct bio *blk_bio_segment_split(struct request_queue *q,
>  	return NULL;
>  split:
>  	*segs = nsegs;
> +
> +	/*
> +	 * bio splitting may cause subtle trouble such as hang when doing iopoll,

Please capitalize the first character of multi-line comments.  Also,
this adds an overly long line.

> +	hctx = q->queue_hw_ctx[blk_qc_t_to_queue_num(cookie)];
> +	if (hctx->type != HCTX_TYPE_POLL)
> +		return 0;

I think this is good as a sanity check, but shouldn't we be able to
avoid even hitting this path if we ensure that BLK_QC_T_NONE is
returned after a bio is split?
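
To make that question concrete, here is a rough userspace model of the
idea, not kernel code and not a proposed patch.  The toy_* names,
struct toy_bio, the hipri field (standing in for REQ_HIPRI) and the
max_segs split rule are all made up for illustration; only the
BLK_QC_T_NONE name and the "queue number in the upper bits" cookie
layout are meant to resemble the real thing.  If submission hands back
BLK_QC_T_NONE whenever the bio had to be split, the poll side bails out
on the cookie check alone, and the hctx type test is left as a pure
sanity check:

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only: names resemble the kernel's, nothing here is kernel code. */
typedef unsigned int blk_qc_t;
#define BLK_QC_T_NONE	(-1U)
#define BLK_QC_T_SHIFT	16

enum toy_hctx_type { TOY_HCTX_DEFAULT, TOY_HCTX_POLL };

/* Hypothetical stand-in for a bio: a segment count plus a "polled IO" flag. */
struct toy_bio {
	unsigned int nr_segs;
	bool hipri;		/* stands in for REQ_HIPRI */
};

/* Assumed split rule for this sketch: split when the bio exceeds max_segs. */
static bool toy_needs_split(const struct toy_bio *bio, unsigned int max_segs)
{
	return bio->nr_segs > max_segs;
}

/*
 * Model of submission.  A bio that must be split loses its polled flag
 * and the caller gets BLK_QC_T_NONE, so it never tries to poll for it.
 * Otherwise the cookie encodes the hardware queue number and the tag.
 */
static blk_qc_t toy_submit(struct toy_bio *bio, unsigned int max_segs,
			   unsigned int hw_queue, unsigned int tag)
{
	assert(hw_queue < (1U << 15));
	assert(tag < (1U << BLK_QC_T_SHIFT));

	if (toy_needs_split(bio, max_segs)) {
		bio->hipri = false;
		return BLK_QC_T_NONE;
	}
	return (hw_queue << BLK_QC_T_SHIFT) | tag;
}

/*
 * Model of the poll entry point.  Returns 1 if polling would proceed and
 * 0 if it is skipped.  With BLK_QC_T_NONE returned for split bios, the
 * first check already rejects them; the type check is only a backstop.
 */
static int toy_poll(blk_qc_t cookie, const enum toy_hctx_type *types,
		    unsigned int nr_queues)
{
	unsigned int queue_num;

	if (cookie == BLK_QC_T_NONE)
		return 0;

	queue_num = cookie >> BLK_QC_T_SHIFT;
	if (queue_num >= nr_queues || types[queue_num] != TOY_HCTX_POLL)
		return 0;

	return 1;	/* the real code would spin on the poll queue here */
}
```

In this model a 300-segment bio submitted with max_segs of 128 comes
back with BLK_QC_T_NONE and toy_poll() returns 0 before it ever looks
at a hardware queue, while a small bio on a poll queue is still polled.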