On Thu, Oct 22, 2020 at 09:54:53AM +0800, Xiaqing (A) wrote:
> 
> 
> On 2020/10/17 6:52, Roman Gushchin wrote:
> 
> > This small patchset makes cma_release() non-blocking and simplifies
> > the code in hugetlbfs, where previously we had to temporarily drop
> > hugetlb_lock around the cma_release() call.
> > 
> > It should help Zi Yan on his work on 1 GB THPs: splitting a gigantic
> > THP under a memory pressure requires a cma_release() call. If it's
> > a blocking function, it complicates the already complicated code.
> > Because there are at least two use cases like this (hugetlbfs is
> > another example), I believe it's just better to make cma_release()
> > non-blocking.
> > 
> > It also makes it more consistent with other memory releasing functions
> > in the kernel: most of them are non-blocking.
> > 
> > 
> > Roman Gushchin (2):
> >    mm: cma: make cma_release() non-blocking
> >    mm: hugetlb: don't drop hugetlb_lock around cma_release() call
> > 
> >   mm/cma.c     | 51 +++++++++++++++++++++++++++++++++++++++++++++++++--
> >   mm/hugetlb.c |  6 ------
> >   2 files changed, 49 insertions(+), 8 deletions(-)
> > 
> 
> I don't think this patch is a good idea.It transfers part or even all of the time of
> cma_release to cma_alloc, which is more concerned by performance indicators.

I'm not quite sure: if cma_alloc() is racing with cma_release(),
cma_alloc() will wait for the cma_lock mutex anyway. So we don't really
transfer anything to cma_alloc().

> On Android phones, CPU resource competition is intense in many scenarios,
> As a result, kernel threads and workers can be scheduled only after some ticks or more.
> In this case, the performance of cma_alloc will deteriorate significantly,
> which is not good news for many services on Android.

Ok, I agree, if the cpu is heavily loaded, it might affect the total
execution time.

If we aren't going in the mutex->spinlock conversion direction (as Mike
suggested), we can address the performance concerns by introducing
a cma_release_nowait() function, so that the default cma_release() would
work in the old way. cma_release_nowait() can set an atomic flag on a cma
area, which will cause subsequent cma_alloc() calls to flush the release
queue. In this case there will be no performance penalty unless somebody
is using cma_release_nowait().

Will it work for you?

Thank you!
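
To make the cma_release_nowait() idea a bit more concrete, here is a rough
userspace model of it. This is not the kernel implementation and not a
patch: struct model_cma, the model_cma_*() functions, NPAGES and QUEUE_LEN
are all made up for illustration, a pthread mutex stands in for the cma
mutex, and a tiny atomic_flag spinlock stands in for whatever would protect
the release queue. It only shows the intended flow: the nowait variant
queues the range and sets a per-area flag, and the next allocation flushes
the queue only if that flag is set.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define NPAGES    64   /* hypothetical size of the modelled area */
#define QUEUE_LEN 16   /* hypothetical capacity of the release queue */

struct pending { size_t start, count; };

struct model_cma {
	pthread_mutex_t lock;            /* stands in for the cma mutex */
	bool used[NPAGES];               /* stands in for the bitmap */
	atomic_flag qlock;               /* spinlock guarding the queue */
	struct pending queue[QUEUE_LEN];
	size_t queued;
	atomic_bool has_pending;         /* the per-area atomic flag */
};

void model_cma_init(struct model_cma *a)
{
	pthread_mutex_init(&a->lock, NULL);
	for (size_t i = 0; i < NPAGES; i++)
		a->used[i] = false;
	atomic_flag_clear(&a->qlock);
	a->queued = 0;
	atomic_init(&a->has_pending, false);
}

static void set_range(struct model_cma *a, size_t start, size_t count, bool v)
{
	for (size_t i = start; i < start + count; i++)
		a->used[i] = v;
}

/* The "old way": may sleep on the mutex, clears the range immediately. */
void model_cma_release(struct model_cma *a, size_t start, size_t count)
{
	pthread_mutex_lock(&a->lock);
	set_range(a, start, count, false);
	pthread_mutex_unlock(&a->lock);
}

/* Never takes the mutex: queue the range and raise the flag. */
bool model_cma_release_nowait(struct model_cma *a, size_t start, size_t count)
{
	bool ok = false;

	if (start > NPAGES || count > NPAGES - start)
		return false;
	while (atomic_flag_test_and_set_explicit(&a->qlock, memory_order_acquire))
		;
	if (a->queued < QUEUE_LEN) {
		a->queue[a->queued++] = (struct pending){ start, count };
		atomic_store(&a->has_pending, true);
		ok = true;
	}
	atomic_flag_clear_explicit(&a->qlock, memory_order_release);
	return ok;
}

/* Called with a->lock held: apply every queued release to the bitmap. */
static void flush_pending(struct model_cma *a)
{
	while (atomic_flag_test_and_set_explicit(&a->qlock, memory_order_acquire))
		;
	for (size_t i = 0; i < a->queued; i++)
		set_range(a, a->queue[i].start, a->queue[i].count, false);
	a->queued = 0;
	atomic_flag_clear_explicit(&a->qlock, memory_order_release);
}

/* First-fit allocation; returns the first page index or -1 on failure. */
long model_cma_alloc(struct model_cma *a, size_t count)
{
	long ret = -1;
	size_t run = 0;

	pthread_mutex_lock(&a->lock);
	/* Only pay for the flush if somebody used the nowait variant. */
	if (atomic_exchange(&a->has_pending, false))
		flush_pending(a);
	for (size_t i = 0; i < NPAGES && count > 0; i++) {
		run = a->used[i] ? 0 : run + 1;
		if (run == count) {
			set_range(a, i + 1 - count, count, true);
			ret = (long)(i + 1 - count);
			break;
		}
	}
	pthread_mutex_unlock(&a->lock);
	return ret;
}
```

In this model, an area that only ever sees model_cma_release() never has
the flag set, so model_cma_alloc() pays one extra atomic exchange and
nothing else. The flag is cleared before the flush and set after the push,
so a release that races with an allocation is either flushed by it or
leaves the flag raised for the next one.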