Block devices are refcounted so that once the final user goes away
they can be cleaned up properly by the lower layers. The block
device's request_queue structure is also refcounted; however, if the
last blk_put_queue() is called from atomic context, the block layer
has to defer removal.

By refcounting the block device for the duration of
blkcg_schedule_throttle(), we ensure two things:

1) the block device remains available during the call

2) we avoid having to deal with the fact that we are using the
   request_queue structure from atomic context, since the last
   blk_put_queue() will be called upon disk_release(), *after* our
   own bdput()

This means this code path is *not* going to remove the request_queue
structure: some later, upper-layer disk_release() is guaranteed to be
the one to release the request_queue structure for us.

Cc: Bart Van Assche <bvanassche@xxxxxxx>
Cc: Omar Sandoval <osandov@xxxxxx>
Cc: Hannes Reinecke <hare@xxxxxxxx>
Cc: Nicolai Stange <nstange@xxxxxxx>
Cc: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
Cc: Michal Hocko <mhocko@xxxxxxxxxx>
Cc: yu kuai <yukuai3@xxxxxxxxxx>
Signed-off-by: Luis Chamberlain <mcgrof@xxxxxxxxxx>
---
 mm/swapfile.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/mm/swapfile.c b/mm/swapfile.c
index 6659ab563448..9285ff6030ca 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -3753,6 +3753,7 @@ static void free_swap_count_continuations(struct swap_info_struct *si)
 void mem_cgroup_throttle_swaprate(struct mem_cgroup *memcg, int node,
 				  gfp_t gfp_mask)
 {
+	struct block_device *bdev;
 	struct swap_info_struct *si, *next;
 	if (!(gfp_mask & __GFP_IO) || !memcg)
 		return;
@@ -3771,8 +3772,17 @@ void mem_cgroup_throttle_swaprate(struct mem_cgroup *memcg, int node,
 	plist_for_each_entry_safe(si, next, &swap_avail_heads[node],
 				  avail_lists[node]) {
 		if (si->bdev) {
-			blkcg_schedule_throttle(bdev_get_queue(si->bdev),
-						true);
+			bdev = bdgrab(si->bdev);
+			if (!bdev)
+				continue;
+			/*
+			 * By adding our own bdgrab() we ensure the queue
+			 * sticks around until disk_release(), and so our
+			 * release of the request_queue does not happen in
+			 * atomic context.
+			 */
+			blkcg_schedule_throttle(bdev_get_queue(bdev), true);
+			bdput(bdev);
 			break;
 		}
 	}
-- 
2.25.1
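
A note for reviewers less familiar with the bdev/queue lifetime rules:
the pattern the patch uses can be sketched in isolation as below. This
is illustrative only and not part of the patch;
use_queue_in_atomic_context() is a hypothetical stand-in for
blkcg_schedule_throttle().

	struct block_device *bdev;

	bdev = bdgrab(si->bdev);	/* take our own bdev reference */
	if (bdev) {
		/*
		 * The gendisk keeps its own request_queue reference
		 * until disk_release(), which cannot run before the
		 * bdput() below, so the queue returned by
		 * bdev_get_queue() stays valid here even from atomic
		 * context.
		 */
		use_queue_in_atomic_context(bdev_get_queue(bdev));
		bdput(bdev);	/* queue release deferred to disk_release() */
	}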