Re: [PATCH v3 1/4] md: use memalloc scope APIs in mddev_suspend()/mddev_resume()

On 2020/4/9 11:05 PM, Michal Hocko wrote:
> On Thu 09-04-20 22:17:20, colyli@xxxxxxx wrote:
>> From: Coly Li <colyli@xxxxxxx>
>>
>> In raid5.c:resize_chunk(), scribble_alloc() is called with the GFP_NOIO
>> flag, which is then passed to kvmalloc_array() inside scribble_alloc().
>>
>> The problem is that kvmalloc_array() eventually calls kvmalloc_node(),
>> which does not accept flags that are not GFP_KERNEL compatible, such as
>> GFP_NOIO. In that case kmalloc_node() is called instead, which allocates
>> physically contiguous pages. When system memory is under heavy pressure
>> and the requested size is large, there is a high probability that
>> allocating contiguous pages will fail.
>>
>> But simply calling kvmalloc_array() with the GFP_KERNEL flag is also
>> problematic. In the code path where scribble_alloc() is called, the
>> raid array is suspended. If kvmalloc_node() triggers memory reclaim I/Os
>> and such I/Os go back to the suspended raid array, a deadlock will happen.
>>
>> What is desired here is to allocate non-physically (a.k.a. virtually)
>> contiguous pages and avoid memory reclaim I/Os. Michal Hocko suggests
>> using the memalloc scope APIs to restrict memory reclaim I/O in the
>> allocating context, specifically calling memalloc_noio_save() when
>> suspending the raid array and memalloc_noio_restore() when resuming
>> the raid array.
>>
>> This patch adds the memalloc scope APIs to mddev_suspend() and
>> mddev_resume(), to restrict memory reclaim I/Os while the raid array
>> is suspended. The benefit of adding the memalloc scope API at the
>> unified entry point mddev_suspend()/mddev_resume() is that, no matter
>> which md raid array type (personality) is in use, we are sure the
>> deadlock caused by recursive memory reclaim I/O won't happen in the
>> suspending context.
> 
> I am not familiar with the mdraid code so I cannot really judge the
> correctness here but if mddev_suspend really acts as a potential reclaim
> recursion deadlock entry then this is the right way to use the API.
> Essentially all the allocations in that scope will have an implicit NOIO
> semantic.
> 
> The thing to be careful about is to make sure that mddev_suspend cannot
> be nested, and also that there are no callers of scribble_alloc outside
> of the mddev_suspend scope which would be reclaim deadlock prone. If there
> are, their scope should be handled in a similar way.
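
To make the scope semantics and the nesting concern above concrete, here is
a minimal userspace sketch. It is only a model, not the kernel code: the
model_* names, the flag values and the noio_flag field are assumptions made
up for illustration. It mirrors the save/restore pattern of
memalloc_noio_save()/memalloc_noio_restore() on a fake task flags word and
shows why a single saved value per device requires that suspend is not
nested.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only. In the kernel the scope bit lives in
 * current->flags; the values below are illustrative, not the kernel's. */
#define MODEL_PF_MEMALLOC_NOIO 0x1u
#define MODEL_GFP_IO           0x1u
#define MODEL_GFP_FS           0x2u
#define MODEL_GFP_KERNEL       (MODEL_GFP_IO | MODEL_GFP_FS)

static unsigned int model_task_flags; /* stands in for current->flags */

/* Enter a NOIO scope and return the previous state of the scope bit. */
static unsigned int model_noio_save(void)
{
    unsigned int old = model_task_flags & MODEL_PF_MEMALLOC_NOIO;

    model_task_flags |= MODEL_PF_MEMALLOC_NOIO;
    return old;
}

/* Leave the scope by putting back the state returned by the save call. */
static void model_noio_restore(unsigned int old)
{
    model_task_flags = (model_task_flags & ~MODEL_PF_MEMALLOC_NOIO) | old;
}

/* Inside a NOIO scope every allocation implicitly loses IO and FS. */
static unsigned int model_effective_gfp(unsigned int gfp)
{
    if (model_task_flags & MODEL_PF_MEMALLOC_NOIO)
        gfp &= ~(MODEL_GFP_IO | MODEL_GFP_FS);
    return gfp;
}

/* Hypothetical stand-in for the md device; not the real struct mddev. */
struct model_mddev {
    bool suspended;
    unsigned int noio_flag; /* saved scope state, one slot only */
};

static void model_suspend(struct model_mddev *m)
{
    /* A nested suspend would overwrite noio_flag with "already NOIO",
     * and the scope would then never be left on resume. */
    assert(!m->suspended);
    m->suspended = true;
    m->noio_flag = model_noio_save();
}

static void model_resume(struct model_mddev *m)
{
    assert(m->suspended);
    model_noio_restore(m->noio_flag);
    m->suspended = false;
}
```

In this model, a GFP_KERNEL style request made between model_suspend() and
model_resume() is reduced to a mask without the IO and FS bits, and the
full mask is back after resume. The asserts stand in for whatever
mechanism the real code uses to rule out nested suspend.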

Thank you for the confirmation, and the always constructive discussion :-)

-- 

Coly Li


