On 1/11/19 5:55 AM, Gaël PORTAY wrote: > On Mon, Jan 07, 2019 at 06:06:21PM -0800, Mike Kravetz wrote: >> On 1/7/19 10:13 AM, Gaël PORTAY wrote: >>> (...) >>> I have also removed the mutex (start_isolate_page_range retunrs -EBUSY), >>> and it worked (in my case). >>> >>> But I did not do the proper magic because I am not sure of what should >>> be done and how: -EBUSY is not handled and __GFP_NOIO is not honored. >> >> If we remove the mutex, I am pretty sure we would want to distinguish >> between the (at least) two types of _EBUSY that can be returned by >> alloc_contig_range(). Seems that the retry logic should be different if >> a page block is busy as opposed to pages within the range. Hello Gael, I spent some time looking into removing cma_mutex. My initial statement that it is no longer absolutely necessary is correct. This is because the underlying code will not allow two threads to work on same pageblock. However, removing the mutex causes issues. If we remove the mutex, then alloc_contig_range() will return -EBUSY if another thread is operating on any of the pageblocks in the range. We could make alloc_contig_range distinguish between this and the existing -EBUSY case caused by the inability to migrate pages in the range. Even if we do this, we need to determine what should happen if alloc_contig_range fails for this reason. I can only think of a few options: 1) Immediately fail the cma_alloc request. The problem with doing this is that it would cause a regression. Currently, two threads can call cma_alloc with ranges that touch the same pageblock. With the mutex in place, the threads will wait for each other to finish with the pageblock. Without it, we return an error to one caller where it previously would have succeeded. So, this approach is unacceptable. 2) Notice this new condition and 'retry'. In theory this sounds possible. However, we do not know how long another thread will keep the pageblock(s) in question isolated. It could be quite a long time depending on the what the other thread is trying to do. We really do not want to go into a tight loop retrying. Adding 'delays' between retrys seems like the wrong thing to do for a memory allocator. So, I think this option is out. 3) In some cases, it might be possible to 'intelligently retry' with a range that does not touch busy pageblocks. However, this is highly dependent on the cma area and size of allocation. It may not always be possible and is not a suitable option. Bottom line is that I can not think of a good way to remove cma_mutex without possibly introducing some other issue. It appears that you have already found a workaround for the issue you were seeing. Is this correct? If so, I suggest we not go down the path of trying to eliminate cma_mutex right now. Or, perhaps someone else has other suggestions. -- Mike Kravetz