On Tue, Oct 30, 2018 at 02:25:51PM -0700, Matthew Wilcox wrote:
> On Tue, Oct 30, 2018 at 11:51:17AM -0700, Andy Lutomirski wrote:
> > Finally, one issue: rare_alloc() is going to utterly suck
> > performance-wise due to the global IPI when the region gets zapped out
> > of the direct map or otherwise made RO. This is the same issue that
> > makes all existing XPO efforts so painful. We need to either optimize
> > the crap out of it somehow or we need to make sure it’s not called
> > except during rare events like device enumeration.
>
> Batching operations is kind of the whole point of the VM ;-) Either
> this rare memory gets used a lot, in which case we'll want to create slab
> caches for it, make it a MM zone and the whole nine yards, or it's not
> used very much in which case it doesn't matter that performance sucks.

Yes, for the dynamic case something along those lines would be needed.

If we have a single rare zone, we could even have __GFP_RARE or whatever
that manages this.

The page allocator would have to grow a rare memblock type, and every
rare alloc would allocate from a rare memblock; when none is available,
creation of a rare block would set up the mappings etc.

> For now, I'd suggest allocating 2MB chunks as needed, and having a
> shrinker to hand back any unused pieces.

Something like the percpu allocator?
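
To make the chunk idea concrete, here is a rough userspace model in C. It
is only a sketch under stated assumptions: struct rare_chunk,
rare_chunk_create(), rare_free(), rare_shrink() and the rare_chunk_setups
counter are hypothetical names, malloc() stands in for the page allocator,
and the counter marks the spot where the kernel would pay for changing the
mappings. Only rare_alloc() and the 2MB chunk size come from the discussion.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define RARE_CHUNK_SIZE (2UL << 20)	/* 2MB, as suggested in the thread */

/* Hypothetical chunk descriptor; not an existing kernel structure. */
struct rare_chunk {
	struct rare_chunk *next;
	unsigned char *base;	/* start of the 2MB region */
	size_t used;		/* bump-pointer offset into the region */
	size_t live;		/* outstanding allocations in this chunk */
};

static struct rare_chunk *rare_chunks;
static unsigned long rare_chunk_setups;	/* counts the expensive step */

/* Create a chunk. In the kernel this is where the mappings would be set up. */
struct rare_chunk *rare_chunk_create(void)
{
	struct rare_chunk *c = malloc(sizeof(*c));

	if (!c)
		return NULL;
	c->base = malloc(RARE_CHUNK_SIZE);	/* stand-in for the page allocator */
	if (!c->base) {
		free(c);
		return NULL;
	}
	c->used = 0;
	c->live = 0;
	c->next = rare_chunks;
	rare_chunks = c;
	rare_chunk_setups++;	/* assumption: one costly remap per chunk */
	return c;
}

/* Allocate from an existing chunk; create one only when none has room. */
void *rare_alloc(size_t size)
{
	struct rare_chunk *c;
	void *p;

	if (size == 0 || size > RARE_CHUNK_SIZE)
		return NULL;
	size = (size + 15) & ~(size_t)15;	/* keep 16-byte alignment */

	for (c = rare_chunks; c; c = c->next)
		if (RARE_CHUNK_SIZE - c->used >= size)
			break;
	if (!c)
		c = rare_chunk_create();
	if (!c)
		return NULL;

	p = c->base + c->used;
	c->used += size;
	c->live++;
	return p;
}

/* Drop one allocation; an empty chunk becomes reusable from offset 0. */
void rare_free(void *p)
{
	uintptr_t addr = (uintptr_t)p;
	struct rare_chunk *c;

	for (c = rare_chunks; c; c = c->next) {
		uintptr_t start = (uintptr_t)c->base;

		if (addr >= start && addr < start + c->used && c->live > 0) {
			if (--c->live == 0)
				c->used = 0;
			return;
		}
	}
}

/* Shrinker model: hand back every chunk with no live allocations. */
size_t rare_shrink(void)
{
	struct rare_chunk **pp = &rare_chunks;
	size_t freed = 0;

	while (*pp) {
		struct rare_chunk *c = *pp;

		if (c->live == 0) {
			*pp = c->next;
			free(c->base);
			free(c);
			freed++;
		} else {
			pp = &c->next;
		}
	}
	return freed;
}
```

In this model the expensive setup runs once per 2MB chunk rather than once
per rare_alloc() call, which is the batching argued for above, and
rare_shrink() only returns chunks that have become completely unused.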