On Thu, Oct 01, 2020 at 11:02:20AM +0200, Michal Hocko wrote:
> On Wed 30-09-20 16:21:54, Paul E. McKenney wrote:
> > On Wed, Sep 30, 2020 at 10:41:39AM +0200, Michal Hocko wrote:
> > > On Tue 29-09-20 18:53:27, Paul E. McKenney wrote:
> [...]
> > > > No argument on it being confusing, and I hope that the added header
> > > > comment helps. But specifically, can_sleep==true is a promise by the
> > > > caller to be schedulable and not to be holding any lock/mutex/whatever
> > > > that might possibly be acquired by the memory allocator or by anything
> > > > else that the memory allocator might invoke, to your point, including
> > > > for but one example the reclaim logic.
> > > >
> > > > The only way that can_sleep==true is if this function was invoked due
> > > > to a call to single-argument kvfree_rcu(), which must be schedulable
> > > > because its fallback is to invoke synchronize_rcu().
> > >
> > > OK. I have to say that it is still not clear to me whether this call
> > > path can be called from the memory reclaim context. If yes then you need
> > > __GFP_NOMEMALLOC as well.
> >
> > Right now the restriction is that single-argument (AKA can_sleep==true)
> > kvfree_rcu() cannot be invoked from memory reclaim context.
> >
> > But would adding __GFP_NOMEMALLOC to the can_sleep==true GFP_ flags
> > allow us to remove this restriction? If so, I will queue a separate
> > patch making this change. The improved ease of use would be well
> > worth it, if I understand correctly (ha!!!).
>
> It would be quite daring to claim it will be ok but it will certainly be
> less problematic. Adding the flag will not hurt in any case. As this is
> a shared call that might be called from many contexts I think it will
> be safer to have it there. The justification is that it will prevent
> consumption of memory reserves from MEMALLOC contexts.
> >
> > > [...]
> > >
> > > > > What is the point of calling kmalloc for a PAGE_SIZE object? Wouldn't
> > > > > using the page allocator directly be better?
> > > >
> > > > Well, you guys gave me considerable heat about abusing internal allocator
> > > > interfaces, and kmalloc() and kfree() seem to be about as non-internal
> > > > as you can get and still be invoking the allocator. ;-)
> > >
> > > alloc_pages resp. __get_free_pages is a normal page allocator interface
> > > to use for page size granular allocations. kmalloc is for more fine
> > > grained allocations.
> >
> > OK, in the short term, both work, but I have queued a separate patch
> > making this change and recording the tradeoffs. This is not yet a
> > promise to push this patch, but it is a promise not to lose this part
> > of the picture. Please see below.
>
> It doesn't matter all that much. Both allocators will work. It is just a
> matter of using the optimal tool for the specific purpose.
>
> > You mentioned alloc_pages(). I reverted to __get_free_pages(), but
> > alloc_pages() of course looks nicer. What are the tradeoffs between
> > __get_free_pages() and alloc_pages()?
>
> alloc_pages will return struct page but you need a kernel pointer. That
> is what __get_free_pages will give you (or you can call page_address
> directly).
>
> > 						Thanx, Paul
> >
> > ------------------------------------------------------------------------
> >
> > commit 490b638d7c241ac06cee168ccf8688bb8b872478
> > Author: Paul E. McKenney <paulmck@xxxxxxxxxx>
> > Date:   Wed Sep 30 16:16:39 2020 -0700
> >
> >     kvfree_rcu(): Switch from kmalloc/kfree to __get_free_page/free_page.
> >
> >     The advantages of using kmalloc() and kfree() are a possible small speedup
> >     on CONFIG_SLAB=y systems, avoiding the allocation-side cast, and use of
> >     more-familiar API members. The advantages of using __get_free_page()
> >     and free_page() are a possible reduction in fragmentation and direct
> >     access to the buddy allocator.
> >
> >     To help settle the question as to which to use, this commit switches
> >     from kmalloc() and kfree() to __get_free_page() and free_page().
> >
> >     Suggested-by: Michal Hocko <mhocko@xxxxxxxx>
> >     Suggested-by: "Uladzislau Rezki (Sony)" <urezki@xxxxxxxxx>
> >     Signed-off-by: Paul E. McKenney <paulmck@xxxxxxxxxx>
>
> Yes, looks good to me. I am not entirely sure about the fragmentation
> argument. It really depends on the SL.B allocator internals. The same
> applies for the potential speed up. I would even be surprised if SLAB
> was faster on average considering it has to use the page allocator as
> well. So to me the primary motivation would be "use the right tool for
> the purpose".
>
As for the raised concern about fragmentation, I was mostly thinking about
the fact that the slab allocators were not designed to do efficient
allocations for sizes that are >= PAGE_SIZE. But that depends on the three
different implementations, which is actually another good argument for
switching to the page allocator, i.e. to get rid of such a dependency.

On the other side, at least SLAB and SLUB use slab caches, and the sizes
they support go up to:

<snip>
kmalloc-8k           420    420   8192    4
kmalloc-4k          1372   1392   4096    8    8 : tunables    0    0 ...
<snip>

So I would not be surprised if SLAB were faster than using the page
allocator in _some_ sense. If it is important, I can double check. The
explanation would simply be the dynamic caching that can grow on demand,
so there is no need to go down to the page allocator as long as the
kmalloc-4k cache has spare objects. But the worst case will of course be
slower :)

--
Vlad Rezki
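
A minimal sketch of the alloc_pages()/__get_free_page() tradeoff Michal
describes above: alloc_pages() hands back a struct page, so getting a
kernel pointer takes an extra page_address() call, whereas
__get_free_page() returns the address directly. The wrapper function
names below are invented purely for illustration; only the allocator
interfaces themselves are the ones under discussion.

<snip>
#include <linux/gfp.h>
#include <linux/mm.h>

static void *page_buf_via_alloc_pages(gfp_t gfp)
{
	struct page *page = alloc_pages(gfp, 0);	/* order-0, one page */

	if (!page)
		return NULL;

	return page_address(page);	/* struct page -> kernel virtual address */
}

static void *page_buf_via_get_free_page(gfp_t gfp)
{
	/* Already an address (0 on failure), so only a cast is needed. */
	return (void *)__get_free_page(gfp);
}
<snip>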
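
And a rough sketch, not the actual hunk from the commit quoted above, of
what the switch to __get_free_page()/free_page() plus the __GFP_NOMEMALLOC
suggestion amounts to for a page-sized block. The structure and helper
names here are made up, and GFP_KERNEL merely stands in for whatever the
real can_sleep==true flag set ends up being.

<snip>
#include <linux/gfp.h>
#include <linux/mm.h>

/* Hypothetical stand-in for the page-sized bulk-data block. */
struct bulk_page {
	unsigned long nr_records;
	void *records[];
};

static struct bulk_page *bulk_page_alloc(void)
{
	/*
	 * __GFP_NOMEMALLOC keeps this allocation away from the memory
	 * reserves even if the caller happens to run in a MEMALLOC
	 * context, which is the safety argument made above.
	 */
	gfp_t gfp = GFP_KERNEL | __GFP_NOMEMALLOC;

	/* Page-sized object: take it straight from the buddy allocator... */
	return (struct bulk_page *)__get_free_page(gfp);
}

static void bulk_page_free(struct bulk_page *bp)
{
	/* ...and hand the page straight back to it. */
	free_page((unsigned long)bp);
}
<snip>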