The page allocator's per-cpu page lists (pcplists) are currently
protected using local_locks. While this is good for performance, it
doesn't allow for remote access to these structures. CPUs requiring
system-wide changes to the per-cpu lists get around this by scheduling
workers on each CPU. That said, some setups, like NOHZ_FULL CPUs, aren't
well suited to this since they can't tolerate interruptions of any sort.

To mitigate this, replace the current draining mechanism with one that
allows remotely draining the lists:

 - Each CPU now has two pcplists pointers: one that points to the
   pcplists instance that is in use, 'pcp->lp', and another that points
   to an idle and empty instance, 'pcp->drain'. CPUs access their local
   pcplists through 'pcp->lp', and the pointer is dereferenced
   atomically.

 - When a CPU decides it needs to empty some remote pcplists, it
   atomically exchanges the remote CPU's 'pcp->lp' and 'pcp->drain'
   pointers. A remote CPU racing with this will have either:

     - An old 'pcp->lp' reference. That instance will soon be emptied by
       the drain process; we just have to wait for the remote CPU to
       finish using it.

     - The new 'pcp->lp' reference, that is, an empty pcplists instance.
       rcu_replace_pointer()'s release semantics ensure any prior
       changes are visible to the remote CPU, for example changes to
       'pcp->high' and 'pcp->batch' when disabling the pcplists.

 - The CPU that started the drain can now wait for an RCU grace period
   to make sure the remote CPU is done using the old pcplists.
   synchronize_rcu() counts as a full memory barrier, so any changes the
   local CPU makes to the soon-to-be-drained pcplists will be visible to
   the draining CPU once it returns.

 - Then the CPU can safely free the old pcplists. Nobody else holds a
   reference to it.

Note that concurrent access to the remote pcplists drain is protected by
'pcpu_drain_mutex'.
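
The pointer exchange described above can be illustrated with a small
userspace model. This is only a sketch under stated assumptions, not the
code from this patch: it uses C11 atomics in place of the RCU
primitives, it is single-threaded, and the grace-period wait is reduced
to a comment. All type and function names (pcplists_model,
per_cpu_pages_model, pcp_init, pcp_local_free_page, pcp_remote_drain)
are hypothetical and chosen for illustration only.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for one pcplists instance. */
struct pcplists_model {
	int count;	/* number of pages currently held */
};

/* Hypothetical stand-in for the per-cpu structure with two pointers. */
struct per_cpu_pages_model {
	_Atomic(struct pcplists_model *) lp;	/* in-use instance */
	struct pcplists_model *drain;		/* idle, empty instance */
	struct pcplists_model instances[2];	/* backing storage */
};

static void pcp_init(struct per_cpu_pages_model *pcp)
{
	pcp->instances[0].count = 0;
	pcp->instances[1].count = 0;
	atomic_init(&pcp->lp, &pcp->instances[0]);
	pcp->drain = &pcp->instances[1];
}

/*
 * Local path: load 'lp' exactly once and work on that instance only.
 * In the kernel this would be an RCU read-side dereference; an acquire
 * load is used here as an approximation.
 */
static void pcp_local_free_page(struct per_cpu_pages_model *pcp)
{
	struct pcplists_model *lp;

	lp = atomic_load_explicit(&pcp->lp, memory_order_acquire);
	lp->count++;
}

/*
 * Remote path: swap 'lp' and 'drain', wait, then empty the old instance.
 * Returns the number of pages that were drained. The caller is assumed
 * to serialize drains (the role 'pcpu_drain_mutex' plays in the text).
 */
static int pcp_remote_drain(struct per_cpu_pages_model *pcp)
{
	struct pcplists_model *old;
	int drained;

	/* Publish the empty instance and take ownership of the old one. */
	old = atomic_exchange_explicit(&pcp->lp, pcp->drain,
				       memory_order_acq_rel);
	pcp->drain = old;

	/*
	 * A real implementation must wait here until no CPU can still be
	 * using 'old' (the RCU grace period in the text). This
	 * single-threaded model has no concurrent users, so the wait is
	 * omitted.
	 */

	drained = old->count;
	old->count = 0;
	return drained;
}
```

In this model, two calls to pcp_local_free_page() followed by
pcp_remote_drain() drain two pages, and the local path then continues on
the other, empty instance without ever having been interrupted.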