On Fri, Jan 12 2018 at 8:00pm -0500,
Bart Van Assche <Bart.VanAssche@xxxxxxx> wrote:

> On Fri, 2018-01-12 at 19:52 -0500, Mike Snitzer wrote:
> > It was 50 ms before it was 100 ms. No real explanation for these
> > values other than they seem to make Bart's IB SRP testbed happy?
>
> But that constant was not introduced by me in the dm code.

No, actually it was (not that there's anything wrong with that):

commit 06eb061f48594aa369f6e852b352410298b317a8
Author: Bart Van Assche <bart.vanassche@xxxxxxxxxxx>
Date:   Fri Apr 7 16:50:44 2017 -0700

    dm mpath: requeue after a small delay if blk_get_request() fails

    If blk_get_request() returns ENODEV then multipath_clone_and_map()
    causes a request to be requeued immediately. This can cause a
    kworker thread to spend 100% of the CPU time of a single core in
    __blk_mq_run_hw_queue() and also can cause device removal to never
    finish.

    Avoid this by only requeuing after a delay if blk_get_request()
    fails. Additionally, reduce the requeue delay.

    Cc: stable@xxxxxxxxxxxxxxx # 4.9+
    Signed-off-by: Bart Van Assche <bart.vanassche@xxxxxxxxxxx>
    Signed-off-by: Mike Snitzer <snitzer@xxxxxxxxxx>

Note that this commit actually details a different case: blk_get_request()
(in existing code) returning -ENODEV, which is a very compelling case for
using DM_MAPIO_DELAY_REQUEUE.

So I'll revisit what is appropriate in multipath_clone_and_map() on
Monday.

I need a break... have a good weekend Bart.