On Fri, Aug 14, 2020 at 11:52:06PM +0200, Peter Zijlstra wrote:
> On Fri, Aug 14, 2020 at 01:41:40PM -0700, Paul E. McKenney wrote:
>
> > > And that enforces the GFP_NOLOCK allocation mode or some other solution
> > > unless you make a new rule that calling call_rcu() is forbidden while
> > > holding zone lock or any other lock which might be nested inside the
> > > GFP_NOWAIT zone::lock held region.
> >
> > Again, you are correct.  Maybe the forecasted weekend heat will cause
> > my brain to hallucinate a better solution, but in the meantime, the
> > GFP_NOLOCK approach looks good from this end.
>
> So I hate __GFP_NO_LOCKS for a whole number of reasons:
>
>  - it should be called __GFP_LOCKLESS if anything
>  - it sprinkles a bunch of ugly branches around the allocator fast path
>  - it only works for order==0
>
> Combined I really don't think this should be a GFP flag. How about a
> special purpose allocation function, something like so..

This looks entirely reasonable to me!

							Thanx, Paul

> ---
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 901a21f61d68..cdec9c99fba7 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -4875,6 +4875,47 @@ __alloc_pages_nodemask(gfp_t gfp_mask, unsigned int order, int preferred_nid,
>  }
>  EXPORT_SYMBOL(__alloc_pages_nodemask);
>  
> +struct page *__rmqueue_lockless(struct zone *zone, struct per_cpu_pages *pcp)
> +{
> +	struct list_head *list;
> +	struct page *page;
> +	int migratetype;
> +
> +	for (migratetype = 0; migratetype < MIGRATE_PCPTYPES; migratetype++) {
> +		list = &pcp->lists[migratetype];
> +		page = list_first_entry_or_null(list, struct page, lru);
> +		if (page && check_new_pcp(page)) {
> +			list_del(&page->lru);
> +			pcp->count--;
> +			return page;
> +		}
> +	}
> +
> +	return NULL;
> +}
> +
> +struct page *__alloc_page_lockless(void)
> +{
> +	struct zonelist *zonelist = node_zonelist(numa_node_id(), GFP_KERNEL);
> +	struct per_cpu_pages *pcp;
> +	struct page *page = NULL;
> +	unsigned long flags;
> +	struct zoneref *z;
> +	struct zone *zone;
> +
> +	for_each_zone_zonelist(zone, z, zonelist, ZONE_NORMAL) {
> +		local_irq_save(flags);
> +		pcp = &this_cpu_ptr(zone->pageset)->pcp;
> +		page = __rmqueue_lockless(zone, pcp);
> +		local_irq_restore(flags);
> +
> +		if (page)
> +			break;
> +	}
> +
> +	return page;
> +}
> +
>  /*
>   * Common helper functions. Never use with __GFP_HIGHMEM because the returned
>   * address cannot represent highmem pages. Use alloc_pages and then kmap if