On 10/13/20 6:09 AM, Ming Lei wrote:
> On Tue, Oct 13, 2020 at 04:40:51PM +0800, Jeffle Xu wrote:
>> Sync polling also needs the REQ_NOWAIT flag. One sync read/write may be
>> split into several bios (and thus several requests), and can sometimes
>> use up the queue depth. Thus the following bio in the same sync
>> read/write will wait for a usable request if the REQ_NOWAIT flag is not
>> set, in which case the following sync polling will cause a deadlock.
>>
>> One case (maybe the only case) for the above situation is
>> preadv2/pwritev2 + direct + highpri. Two conditions need to be satisfied
>> to trigger the deadlock.
>>
>> 1. HIPRI IO in sync routine. Normal read(2)/pread(2)/readv(2)/preadv(2)
>> and the corresponding write family syscalls don't support high-priority
>> IO and thus won't trigger the polling routine. Only
>> preadv2(2)/pwritev2(2) support high-priority IO, via the RWF_HIPRI flag
>> of the @flags parameter.
>>
>> 2. Polling support in sync routine. Currently both the blkdev and
>> iomap-based fs (ext4/xfs, etc) support polling in the direct IO routine.
>> The general routine is described as follows.
>>
>> submit_bio
>> wait for blk_mq_get_tag(), waiting for requests completion, which
>> should be done by the following polling, thus causing a deadlock.
>
> Another blocking point is rq_qos_throttle(), so I guess falling back to
> REQ_NOWAIT may not fix the issue completely.
>
> Given iopoll isn't supposed to be used in case of big IO, another
> solution may be to disable iopoll when bio splitting is needed,
> something like the following change:

I kind of like that better, especially since polling for split bios is
somewhat of a weird thing. Needs a better comment though, not just on
size, but also why multiple bio polling isn't really something that
works.

-- 
Jens Axboe
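
For readers following the thread, the two ideas discussed above (a split
sync IO exhausting the queue depth before polling starts, and disabling
iopoll whenever a split is needed) can be sketched as a small userspace
model. This is only an illustration under stated assumptions: it is not
kernel code and not the change proposed in the thread, and every name in
it (struct model_queue, model_bio_count(), model_sync_poll_deadlocks(),
model_should_poll()) is hypothetical. It also assumes the queue is
otherwise idle and ignores rq_qos_throttle() entirely.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_queue {
	size_t max_bio_bytes;	/* largest IO a single bio/request may carry */
	size_t depth;		/* number of tags, i.e. in-flight requests */
};

/* Number of bios an IO of io_bytes would be split into (0 if invalid). */
static size_t model_bio_count(const struct model_queue *q, size_t io_bytes)
{
	if (q->max_bio_bytes == 0 || io_bytes == 0)
		return 0;
	/* Round up: a partial trailing chunk still needs its own bio. */
	return (io_bytes + q->max_bio_bytes - 1) / q->max_bio_bytes;
}

/*
 * Models the reported deadlock: without REQ_NOWAIT the submitter sleeps
 * waiting for a tag once the depth is used up, but completions are only
 * reaped by the polling that starts after submission has finished.
 */
static bool model_sync_poll_deadlocks(const struct model_queue *q,
				      size_t io_bytes)
{
	return model_bio_count(q, io_bytes) > q->depth;
}

/*
 * Models the suggestion in the thread: only poll when the IO was
 * requested as high priority and fits in one bio, so no split occurs.
 */
static bool model_should_poll(const struct model_queue *q, size_t io_bytes,
			      bool hipri)
{
	return hipri && model_bio_count(q, io_bytes) == 1;
}
```

With max_bio_bytes = 4096 and depth = 2 in this model, a 12288-byte
HIPRI IO needs three bios, so the unguarded path would block on the
third tag, while model_should_poll() simply declines to poll for it.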