On Wed, Dec 02, 2020 at 09:22:36PM +0100, David Hildenbrand wrote: > On 02.12.20 20:26, Minchan Kim wrote: > > On Wed, Dec 02, 2020 at 07:51:07PM +0100, Michal Hocko wrote: > >> On Wed 02-12-20 09:54:29, Minchan Kim wrote: > >>> On Wed, Dec 02, 2020 at 05:48:34PM +0100, Michal Hocko wrote: > >>>> On Wed 02-12-20 08:15:49, Minchan Kim wrote: > >>>>> On Wed, Dec 02, 2020 at 04:49:15PM +0100, Michal Hocko wrote: > >>>> [...] > >>>>>> Well, what I can see is that this new interface is an antipatern to our > >>>>>> allocation routines. We tend to control allocations by gfp mask yet you > >>>>>> are introducing a bool parameter to make something faster... What that > >>>>>> really means is rather arbitrary. Would it make more sense to teach > >>>>>> cma_alloc resp. alloc_contig_range to recognize GFP_NOWAIT, GFP_NORETRY resp. > >>>>>> GFP_RETRY_MAYFAIL instead? > >>>>> > >>>>> If we use cma_alloc, that interface requires "allocate one big memory > >>>>> chunk". IOW, return value is just struct page and expected that the page > >>>>> is a big contiguos memory. That means it couldn't have a hole in the > >>>>> range. > >>>>> However the idea here, what we asked is much smaller chunk rather > >>>>> than a big contiguous memory so we could skip some of pages if they are > >>>>> randomly pinned(long-term/short-term whatever) and search other pages > >>>>> in the CMA area to avoid long stall. Thus, it couldn't work with exising > >>>>> cma_alloc API with simple gfp_mak. > >>>> > >>>> I really do not see that as something really alient to the cma_alloc > >>>> interface. All you should care about, really, is what size of the object > >>>> you want and how hard the system should try. If you have a problem with > >>>> an internal implementation of CMA and how it chooses a range and deal > >>>> with pinned pages then it should be addressed inside the CMA allocator. > >>>> I suspect that you are effectivelly trying to workaround those problems > >>>> by a side implementation with a slightly different API. Or maybe I still > >>>> do not follow the actual problem. > >>>> > >>>>>> I am not deeply familiar with the cma allocator so sorry for a > >>>>>> potentially stupid question. Why does a bulk interface performs better > >>>>>> than repeated calls to cma_alloc? Is this because a failure would help > >>>>>> to move on to the next pfn range while a repeated call would have to > >>>>>> deal with the same range? > >>>>> > >>>>> Yub, true with other overheads(e.g., migration retrial, waiting writeback > >>>>> PCP/LRU draining IPI) > >>>> > >>>> Why cannot this be implemented in the cma_alloc layer? I mean you can > >>>> cache failed cases and optimize the proper pfn range search. > >>> > >>> So do you suggest this? > >>> > >>> enum cma_alloc_mode { > >>> CMA_ALLOC_NORMAL, > >>> CMA_ALLOC_FAIL_FAST, > >>> }; > >>> > >>> struct page *cma_alloc(struct cma *cma, size_t count, unsigned int > >>> align, enum cma_alloc_mode mode); > >>> > >>> >From now on, cma_alloc will keep last failed pfn and then start to > >>> search from the next pfn for both CMA_ALLOC_NORMAL and > >>> CMA_ALLOC_FAIL_FAST if requested size from the cached pfn is okay > >>> within CMA area and then wraparound it couldn't find right pages > >>> from the cached pfn. Othewise, the cached pfn will reset to the zero > >>> so that it starts the search from the 0. I like the idea since it's > >>> general improvement, I think. > >> > >> Yes something like that. There are more options to be clever here - e.g. > >> track ranges etc. but I am not sure this is worth the complexity. > > > > Agree. Just last pfn caching would be good enough as simple start. > > > >> > >>> Furthemore, With CMA_ALLOC_FAIL_FAST, it could avoid several overheads > >>> at the cost of sacrificing allocation success ratio like GFP_NORETRY. > >> > >> I am still not sure a specific flag is a good interface. Really can this > >> be gfp_mask instead? > > > > I am not strong(even, I did it with GFP_NORETRY) but David wanted to > > have special mode and I agreed when he mentioned ALLOC_CONTIG_HARD as > > one of options in future(it would be hard to indicate that mode with > > gfp flags). > > I can't tell regarding the CMA interface, but for the alloc_contig() > interface I think modes make sense. Yes, it's different to other > allocaters, but the contig range allocater is different already. E.g., > the CMA allocater mostly hides "which exact PFNs you try to allocate". > > In the contig range allocater, gfp flags are currently used to express > how to allocate pages used as migration targets. I don't think mangling > in other gfp flags (or even overloading them) makes things a lot > clearer. E.g., GFP_NORETRY: don't retry to allocate migration targets? > don't retry to migrate pages? both? > > As I said, other aspects might be harder to model (e.g., don't drain > LRU) and hiding them behind generic gfp flags (e.g., GFP_NORETRY) feels > wrong. I also support a special flag/bool variable for cma_alloc rather than relying on mixing original gfp_flags since it would be more clear with preventing passing unhandled the other gfp_flags into cma_alloc. > > With the mode, we're expressing details for the necessary page > migration. Suggestions on how to model that are welcome. > > -- > Thanks, > > David / dhildenb >