Re: [PATCH v4 1/2] blk-mq: Avoid memory reclaim when allocating request map

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 2019/09/29 22:50, Xiubo Li wrote:
> On 2019/9/30 13:28, Damien Le Moal wrote:
>> On 2019/09/29 18:52, xiubli@xxxxxxxxxx wrote:
>>> From: Xiubo Li <xiubli@xxxxxxxxxx>
>>>
>>> For some storage drivers, such as the nbd, when there has new socket
>>> connections added, it will update the hardware queue number by calling
>>> blk_mq_update_nr_hw_queues(), in which it will freeze all the queues
>>> first. And then tries to do the hardware queue updating stuff.
>>>
>>> But int blk_mq_alloc_rq_map()-->blk_mq_init_tags(), when allocating
>>> memory for tags, it may cause the mm do the memories direct reclaiming,
>>> since the queues has been freezed, so if needs to flush the page cache
>>> to disk, it will stuck in generic_make_request()-->blk_queue_enter() by
>>> waiting the queues to be unfreezed and then cause deadlock here.
>>>
>>> Since the memory size requested here is a small one, which will make
>>> it not that easy to happen with a large size, but in theory this could
>>> happen when the OS is running in pressure and out of memory.
>>>
>>> Gabriel Krisman Bertazi has hit the similar issue by fixing it in
>>> commit 36e1f3d10786 ("blk-mq: Avoid memory reclaim when remapping
>>> queues"), but might forget this part.
>>>
>>> Signed-off-by: Xiubo Li <xiubli@xxxxxxxxxx>
>>> CC: Gabriel Krisman Bertazi <krisman@xxxxxxxxxxxxxxxxxx>
>>> Reviewed-by: Ming Lei <ming.lei@xxxxxxxxxx>
>>> ---
>>>   block/blk-mq-tag.c | 5 +++--
>>>   block/blk-mq-tag.h | 5 ++++-
>>>   block/blk-mq.c     | 3 ++-
>>>   3 files changed, 9 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
>>> index 008388e82b5c..04ee0e4c3fa1 100644
>>> --- a/block/blk-mq-tag.c
>>> +++ b/block/blk-mq-tag.c
>>> @@ -462,7 +462,8 @@ static struct blk_mq_tags *blk_mq_init_bitmap_tags(struct blk_mq_tags *tags,
>>>   
>>>   struct blk_mq_tags *blk_mq_init_tags(unsigned int total_tags,
>>>   				     unsigned int reserved_tags,
>>> -				     int node, int alloc_policy)
>>> +				     int node, int alloc_policy,
>>> +				     gfp_t flags)
>>>   {
>>>   	struct blk_mq_tags *tags;
>>>   
>>> @@ -471,7 +472,7 @@ struct blk_mq_tags *blk_mq_init_tags(unsigned int total_tags,
>>>   		return NULL;
>>>   	}
>>>   
>>> -	tags = kzalloc_node(sizeof(*tags), GFP_KERNEL, node);
>>> +	tags = kzalloc_node(sizeof(*tags), flags, node);
>>>   	if (!tags)
>>>   		return NULL;
>>>   
>>> diff --git a/block/blk-mq-tag.h b/block/blk-mq-tag.h
>>> index 61deab0b5a5a..296e0bc97126 100644
>>> --- a/block/blk-mq-tag.h
>>> +++ b/block/blk-mq-tag.h
>>> @@ -22,7 +22,10 @@ struct blk_mq_tags {
>>>   };
>>>   
>>>   
>>> -extern struct blk_mq_tags *blk_mq_init_tags(unsigned int nr_tags, unsigned int reserved_tags, int node, int alloc_policy);
>>> +extern struct blk_mq_tags *blk_mq_init_tags(unsigned int nr_tags,
>>> +					    unsigned int reserved_tags,
>>> +					    int node, int alloc_policy,
>>> +					    gfp_t flags);
>>>   extern void blk_mq_free_tags(struct blk_mq_tags *tags);
>>>   
>>>   extern unsigned int blk_mq_get_tag(struct blk_mq_alloc_data *data);
>>> diff --git a/block/blk-mq.c b/block/blk-mq.c
>>> index 240416057f28..9c52e4dfe132 100644
>>> --- a/block/blk-mq.c
>>> +++ b/block/blk-mq.c
>>> @@ -2090,7 +2090,8 @@ struct blk_mq_tags *blk_mq_alloc_rq_map(struct blk_mq_tag_set *set,
>>>   		node = set->numa_node;
>>>   
>>>   	tags = blk_mq_init_tags(nr_tags, reserved_tags, node,
>>> -				BLK_MQ_FLAG_TO_ALLOC_POLICY(set->flags));
>>> +				BLK_MQ_FLAG_TO_ALLOC_POLICY(set->flags),
>>> +				GFP_NOIO | __GFP_NOWARN | __GFP_NORETRY);
>> You added the gfp_t argument to blk_mq_init_tags() but you are only using that
>> argument with a hardcoded value here. So why not simply call kzalloc_node() in
>> that function with the flags GFP_NOIO | __GFP_NOWARN | __GFP_NORETRY ? That
>> would avoid needing to add the "gfp_t flags" argument and still fit with your
>> patch 2 definition of BLK_MQ_GFP_FLAGS.
> 
> The blk_mq_init_tags() is defined in another separate file, which I think it means to provide a common way of initializing the tags stuff, and currently in this path it needs GFP_NOIO while in others in future it may not.

blk_mq_alloc_rq_map() is currently the only user of blk_mq_init_tags(), so I do
not see the point in doing this change now. If it is needed "in the future" then
do it then.

I do not mind the patch going in as is, but I really think that everything can
be folded into your patch 2 without the addition of blk_mq_init_tags()
additional argument.

> 
> Thanks,
> BRs
> 
> 
>>>   	if (!tags)
>>>   		return NULL;
>>>   
>>>
>>
> 
> 


-- 
Damien Le Moal
Western Digital Research




[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux