On 11/6/18 9:47 AM, Aaron Lu wrote:
> On Tue, Nov 06, 2018 at 09:16:20AM +0100, Vlastimil Babka wrote:
>> On 11/6/18 6:30 AM, Aaron Lu wrote:
>>> We have multiple places of freeing a page, most of them doing similar
>>> things and a common function can be used to reduce code duplication.
>>>
>>> It also avoids a bug being fixed in one function but left in another.
>>>
>>> Signed-off-by: Aaron Lu <aaron.lu@xxxxxxxxx>
>>
>> Acked-by: Vlastimil Babka <vbabka@xxxxxxx>
>
> Thanks.
>
>> I assume there's no arch that would run page_ref_sub_and_test(1) slower
>> than put_page_testzero(), for the critical __free_pages() case?
>
> Good question.
>
> I followed the non-arch specific calls and found that:
> page_ref_sub_and_test() ends up calling atomic_sub_return(i, v) while
> put_page_testzero() ends up calling atomic_sub_return(1, v). So they
> should be the same for archs that do not have their own implementations.

x86 seems to distinguish between DECL and SUBL, see
arch/x86/include/asm/atomic.h, although I could not figure out where e.g.
arch_atomic_dec_and_test becomes atomic_dec_and_test to override the
generic implementation. I don't know whether the CPU executes DECL
faster, but it objectively takes one operand fewer. Maybe it doesn't
matter?

> Back to your question: I don't know either.
> If this is deemed unsafe, we can probably keep the ref modifying part
> in their original functions and only move the freeing part into a
> common function.

I guess you could also employ if (__builtin_constant_p(nr)) in
free_the_page(), but the result will be ugly I guess, and maybe not
worth it :)

> Regards,
> Aaron
>
>>> ---
>>> v2: move comments close to code as suggested by Dave.
>>>
>>>  mm/page_alloc.c | 36 ++++++++++++++++--------------------
>>>  1 file changed, 16 insertions(+), 20 deletions(-)
>>>
>>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>>> index 91a9a6af41a2..4faf6b7bf225 100644
>>> --- a/mm/page_alloc.c
>>> +++ b/mm/page_alloc.c
>>> @@ -4425,9 +4425,17 @@ unsigned long get_zeroed_page(gfp_t gfp_mask)
>>>  }
>>>  EXPORT_SYMBOL(get_zeroed_page);
>>>
>>> -void __free_pages(struct page *page, unsigned int order)
>>> +static inline void free_the_page(struct page *page, unsigned int order, int nr)
>>>  {
>>> -	if (put_page_testzero(page)) {
>>> +	VM_BUG_ON_PAGE(page_ref_count(page) == 0, page);
>>> +
>>> +	/*
>>> +	 * Free a page by reducing its ref count by @nr.
>>> +	 * If its refcount reaches 0, then according to its order:
>>> +	 * order0: send to PCP;
>>> +	 * high order: directly send to Buddy.
>>> +	 */
>>> +	if (page_ref_sub_and_test(page, nr)) {
>>>  		if (order == 0)
>>>  			free_unref_page(page);
>>>  		else
>>> @@ -4435,6 +4443,10 @@ void __free_pages(struct page *page, unsigned int order)
>>>  	}
>>>  }
>>>
>>> +void __free_pages(struct page *page, unsigned int order)
>>> +{
>>> +	free_the_page(page, order, 1);
>>> +}
>>>  EXPORT_SYMBOL(__free_pages);
>>>
>>>  void free_pages(unsigned long addr, unsigned int order)
>>> @@ -4481,16 +4493,7 @@ static struct page *__page_frag_cache_refill(struct page_frag_cache *nc,
>>>
>>>  void __page_frag_cache_drain(struct page *page, unsigned int count)
>>>  {
>>> -	VM_BUG_ON_PAGE(page_ref_count(page) == 0, page);
>>> -
>>> -	if (page_ref_sub_and_test(page, count)) {
>>> -		unsigned int order = compound_order(page);
>>> -
>>> -		if (order == 0)
>>> -			free_unref_page(page);
>>> -		else
>>> -			__free_pages_ok(page, order);
>>> -	}
>>> +	free_the_page(page, compound_order(page), count);
>>>  }
>>>  EXPORT_SYMBOL(__page_frag_cache_drain);
>>>
>>> @@ -4555,14 +4558,7 @@ void page_frag_free(void *addr)
>>>  {
>>>  	struct page *page = virt_to_head_page(addr);
>>>
>>> -	if (unlikely(put_page_testzero(page))) {
>>> -		unsigned int order = compound_order(page);
>>> -
>>> -		if (order == 0)
>>> -			free_unref_page(page);
>>> -		else
>>> -			__free_pages_ok(page, order);
>>> -	}
>>> +	free_the_page(page, compound_order(page), 1);
>>>  }
>>>  EXPORT_SYMBOL(page_frag_free);
>>>
>>>
>>