This version tests out solid, and we're still seeing the same
improvements.

Changes:

- Lock the map index for the move. This eliminates the race completely,
  since it's no longer possible to find ->cleared == 0 while the swap of
  bits is in progress. The previous version was fine for users that
  re-check after having added themselves to the waitqueue, but we have
  users that don't re-check after getting a failure. This works for
  both.

- Add the new states to the blk-mq debugfs output.

- Wrap the waitqueue in an sbitmap waitqueue, so we can ensure that we
  account it properly. This means that any kind of prep+finish on the
  waitqueue will work fine, just like before.

--
Jens Axboe