Hi all, I've found that changing some block queue settings leads to generic_make_request() latency spike. It is reproducable in linux-4.10 - linux-4.20.6. ~# /usr/share/bcc/tools/funclatency generic_make_request -i 1 -m Tracing 1 functions for "generic_make_request"... Hit Ctrl-C to end. msecs : count distribution 0 -> 1 : 156 |****************************************| msecs : count distribution 0 -> 1 : 153 |****************************************| 2 -> 3 : 0 | | 4 -> 7 : 1 | | msecs : count distribution 0 -> 1 : 162 |****************************************| 2 -> 3 : 0 | | 4 -> 7 : 0 | | 8 -> 15 : 0 | | 16 -> 31 : 1 | | There're at least two settings with the same problem: queue/nr_requests queue/scheduler How to reproduce: 1. Run fio: fio --name=test --rw=randwrite --direct=1 --filename=/dev/sdt --iodepth=1 --fsync=1 --ioengine=libaio --write_lat_log loglat 2. While fio is writing to the device, change nr_requests setting: echo 256 > /sys/block/sdt/queue/nr_requests or echo none > /sys/block/sdt/queue/scheduler 3. Stop fio and check slat log: ... 3980, 16, 1, 4096 3999, 17, 1, 4096 4010, 19, 1, 4096 4020, 18, 1, 4096 4045, 10821, 1, 4096 4054, 25, 1, 4096 4070, 12, 1, 4096 4083, 17, 1, 4096 4094, 16, 1, 4096 4111, 17, 1, 4096 ... I've tried to find the reason, using bcc-tools/funclatency. I used fio as in previous example and echo 256 > /sys/block/sdt/queue/nr_requests. It looks like, the problem is caused by waiting in blk_mq_freeze_queue_wait() which is called from blk_freeze_queue(): ~# /usr/share/bcc/tools/funclatency blk_mq_freeze_queue_wait -i 1 -m Tracing 1 functions for "blk_mq_freeze_queue_wait"... Hit Ctrl-C to end. msecs : count distribution 0 -> 1 : 0 | | 2 -> 3 : 0 | | 4 -> 7 : 0 | | 8 -> 15 : 1 |****************************************| At the same time generic_make_request() latency spikes, because it's waiting for freezed request_queue in blk_queue_enter(): wait_event(q->mq_freeze_wq, (atomic_read(&q->mq_freeze_depth) == 0 && (pm || (blk_pm_request_resume(q), !blk_queue_pm_only(q)))) || blk_queue_dying(q)); It could be found whith funclatency too: ~# /usr/share/bcc/tools/funclatency blk_queue_enter -i 1 -m Tracing 1 functions for "blk_queue_enter"... Hit Ctrl-C to end. msecs : count distribution 0 -> 1 : 341 |****************************************| msecs : count distribution 0 -> 1 : 316 |****************************************| msecs : count distribution 0 -> 1 : 255 |****************************************| 2 -> 3 : 0 | | 4 -> 7 : 0 | | 8 -> 15 : 1 | | Behavior of blk_mq_update_nr_requests() was chaned due to commit 70f36b6001bf596eb411c4b302e84c4824ae8730 Reverting this commit on 4.20.6 fixed described problem with queue/nr_requests. Changing queue/scheduler causes same blk_mq_freeze_queue_wait() latency spike. This affects production environments, beacuse some utilitis (arcconf for example) trigger uevents generating. It leads to settings change due to udev rules for device, so latency spike occurs. Does blk_mq_freeze_queue_wait() waits for disk I/O to be completed? Is it expected behaviour and won't be fixed? If it is expected, could current settings be checked, before freezing queue, or at least before blk_mq_freeze_queue_wait call? It could probably help at least when no changes are needed. -- Regards, Aleksei Zakharov