On Mon 28-09-20 16:31:01, paulmck@xxxxxxxxxx wrote:
[...]
> This commit therefore uses preemptible() to determine whether allocation
> is possible at all for double-argument kvfree_rcu().

This deserves a comment, because GFP_ATOMIC is possible for many
!preemptible() contexts. It is the raw_spin_lock, NMIs and likely a few
others that are a problem. You are taking a conservative approach, which
is fine, but it would be good to articulate that explicitly.

> If !preemptible(),
> then allocation is not possible, and kvfree_rcu() falls back to using
> the less cache-friendly rcu_head approach.  Even when preemptible(),
> the caller might be involved in reclaim, so the GFP_ flags used by
> double-argument kvfree_rcu() must avoid invoking reclaim processing.

Could you be more specific? Is this about being called directly in the
reclaim context and you want to prevent a recursion? If that is the
case, do you really need to special case this in any way? Any memory
reclaim will set PF_MEMALLOC, so allocations called from that context
will not perform reclaim. So if you are called from reclaim directly,
then you might want to do GFP_KERNEL | __GFP_NOMEMALLOC | __GFP_NOWARN.
That should handle both from-the-reclaim and outside-of-reclaim contexts
just fine (assuming you don't allocate from a !preemptible() context).
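To make that flag combination concrete, here is a rough sketch; the
helper name is made up and this is only meant to illustrate the
suggested gfp mask, not to replace anything in the patch:

static struct kvfree_rcu_bulk_data *alloc_bnode_sketch(bool can_sleep)
{
        gfp_t gfp;

        if (can_sleep) {
                /*
                 * If this is reached from within reclaim, PF_MEMALLOC is
                 * already set and the page allocator will not recurse
                 * into direct reclaim; __GFP_NOMEMALLOC additionally
                 * keeps the allocation away from the memory reserves.
                 */
                gfp = GFP_KERNEL | __GFP_NOMEMALLOC | __GFP_NOWARN;
        } else {
                /* Cannot sleep at all, so stay with an atomic attempt. */
                gfp = GFP_ATOMIC | __GFP_NOWARN;
        }

        return (struct kvfree_rcu_bulk_data *)__get_free_page(gfp);
}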
> Note that single-argument kvfree_rcu() must be invoked in sleepable
> contexts, and that its fallback is the relatively high latency
> synchronize_rcu().  Single-argument kvfree_rcu() therefore uses
> GFP_KERNEL|__GFP_RETRY_MAYFAIL to allow limited sleeping within the
> memory allocator.

[...]

>  static inline bool
> -kvfree_call_rcu_add_ptr_to_bulk(struct kfree_rcu_cpu *krcp, void *ptr)
> +add_ptr_to_bulk_krc_lock(struct kfree_rcu_cpu **krcp,
> +        unsigned long *flags, void *ptr, bool can_sleep)
>  {
>          struct kvfree_rcu_bulk_data *bnode;
> +        bool can_alloc_page = preemptible();
> +        gfp_t gfp = (can_sleep ? GFP_KERNEL | __GFP_RETRY_MAYFAIL : GFP_ATOMIC) | __GFP_NOWARN;

This is quite confusing IMHO. At least without a further explanation.
can_sleep is not as much about sleeping as it is about the reclaim
recursion AFAIU your changelog, right?

>          int idx;
>
> -        if (unlikely(!krcp->initialized))
> +        *krcp = krc_this_cpu_lock(flags);
> +        if (unlikely(!(*krcp)->initialized))
>                  return false;
>
> -        lockdep_assert_held(&krcp->lock);
>          idx = !!is_vmalloc_addr(ptr);
>
>          /* Check if a new block is required. */
> -        if (!krcp->bkvhead[idx] ||
> -                        krcp->bkvhead[idx]->nr_records == KVFREE_BULK_MAX_ENTR) {
> -                bnode = get_cached_bnode(krcp);
> -                if (!bnode) {
> -                        /*
> -                         * To keep this path working on raw non-preemptible
> -                         * sections, prevent the optional entry into the
> -                         * allocator as it uses sleeping locks. In fact, even
> -                         * if the caller of kfree_rcu() is preemptible, this
> -                         * path still is not, as krcp->lock is a raw spinlock.
> -                         * With additional page pre-allocation in the works,
> -                         * hitting this return is going to be much less likely.
> -                         */
> -                        if (IS_ENABLED(CONFIG_PREEMPT_RT))
> -                                return false;
> -
> -                        /*
> -                         * NOTE: For one argument of kvfree_rcu() we can
> -                         * drop the lock and get the page in sleepable
> -                         * context. That would allow to maintain an array
> -                         * for the CONFIG_PREEMPT_RT as well if no cached
> -                         * pages are available.
> -                         */
> -                        bnode = (struct kvfree_rcu_bulk_data *)
> -                                __get_free_page(GFP_NOWAIT | __GFP_NOWARN);
> +        if (!(*krcp)->bkvhead[idx] ||
> +                        (*krcp)->bkvhead[idx]->nr_records == KVFREE_BULK_MAX_ENTR) {
> +                bnode = get_cached_bnode(*krcp);
> +                if (!bnode && can_alloc_page) {
> +                        krc_this_cpu_unlock(*krcp, *flags);
> +                        bnode = kmalloc(PAGE_SIZE, gfp);

What is the point of calling kmalloc for a PAGE_SIZE object? Wouldn't
using the page allocator directly be better?
-- 
Michal Hocko
SUSE Labs
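For completeness, a minimal sketch of the page-allocator alternative
hinted at in the last question above; the helper names are invented
purely for illustration and are not part of the patch:

static struct kvfree_rcu_bulk_data *get_bnode(gfp_t gfp)
{
        /*
         * A full page is wanted anyway, so ask the page allocator
         * directly instead of going through the slab layer; this also
         * guarantees page alignment of the returned block.
         */
        return (struct kvfree_rcu_bulk_data *)__get_free_page(gfp);
}

static void put_bnode(struct kvfree_rcu_bulk_data *bnode)
{
        /* Pairs with __get_free_page(), so free_page() rather than kfree(). */
        free_page((unsigned long)bnode);
}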