Re: [Lsf-pc] [LSF/MM/BPF TOPIC] Reclamation interactions with RCU

Kent Overstreet <kent.overstreet@xxxxxxxxx> · Thu, 29 Feb 2024 22:09:30 -0500

On Fri, Mar 01, 2024 at 02:48:52AM +0000, Matthew Wilcox wrote:
> On Thu, Feb 29, 2024 at 09:39:17PM -0500, Kent Overstreet wrote:
> > On Fri, Mar 01, 2024 at 01:16:18PM +1100, NeilBrown wrote:
> > > Insisting that GFP_KERNEL allocations never returned NULL would allow us
> > > to remove a lot of untested error handling code....
> > 
> > If memcg ever gets enabled for all kernel side allocations we might
> > start seeing failures of GFP_KERNEL allocations.
> 
> Why would we want that behaviour?  A memcg-limited allocation should
> behave like any other allocation -- block until we've freed some other
> memory in this cgroup, either by swap or killing or ...

It's not uncommon to have a more efficient way of doing something if you
can allocate more memory, but still have the ability to run in a more
bounded amount of space if you need to; I write code like this quite
often.
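
A rough sketch of the pattern, in userspace C for brevity. Everything
named here (staged_sum, alloc_fn, FALLBACK_CHUNK, always_fail) is made
up for illustration and not taken from any real code: try for a buffer
big enough to do the work in one pass, and if the allocation fails, do
the same work in bounded chunks instead of giving up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define FALLBACK_CHUNK 64	/* bounded space we can always get */

/* Hypothetical hook so callers (and tests) can inject allocation failure.
 * Must be malloc-compatible, since the result is released with free(). */
typedef void *(*alloc_fn)(size_t);

static void *always_fail(size_t n)
{
	(void)n;
	return NULL;
}

/*
 * Sum @len bytes of @src, staging them through a bounce buffer.
 * Fast path: one buffer of @len bytes, one pass.
 * Slow path: a fixed on-stack buffer, ceil(len / FALLBACK_CHUNK) passes.
 * *@passes is set to the number of passes taken.
 */
static unsigned long staged_sum(const unsigned char *src, size_t len,
				alloc_fn alloc, size_t *passes)
{
	unsigned char stack_buf[FALLBACK_CHUNK];
	unsigned char *buf;
	size_t chunk, off, i;
	unsigned long sum = 0;

	assert(alloc && passes);

	/* Only ask for more memory when it would actually help. */
	buf = len > FALLBACK_CHUNK ? alloc(len) : NULL;
	chunk = buf ? len : FALLBACK_CHUNK;
	if (!buf)
		buf = stack_buf;	/* failed or not needed: bounded path */

	*passes = 0;
	for (off = 0; off < len; off += chunk) {
		size_t n = len - off < chunk ? len - off : chunk;

		memcpy(buf, src + off, n);
		for (i = 0; i < n; i++)
			sum += buf[i];
		(*passes)++;
	}

	if (buf != stack_buf)
		free(buf);
	return sum;
}
```

Calling staged_sum(data, len, malloc, &passes) takes the fast path when
memory is available; passing always_fail forces the fallback and gives
the same answer in more passes. The point is that the failure path is
ordinary code you can exercise, not something you hope never runs.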

Or maybe you just want the syscall to return an error instead of
blocking for an unbounded amount of time if userspace asks for something
silly.

Honestly, relying on the OOM killer and saying that because of that we
no longer have to write and test our error paths is a lazy cop out.

The same kind of thinking got us overcommit, where yes we got an
increase in efficiency, but the cost was that everyone started assuming
and relying on overcommit, so now it's impossible to run without
overcommit enabled except in highly controlled environments.

And that means allocation failure as an effective signal is just
completely busted in userspace. If you want to write code in userspace
that uses as much memory as is available and no more, you _can't_:
either system behaviour goes to shit if you have overcommit enabled, or
a bunch of memory gets wasted if overcommit is disabled, because
everyone assumes overcommit is just what you do.

Let's _not_ go that route in the kernel. I have pointy sticks to
brandish at people who don't want to deal with properly handling errors.