On Thu, Feb 29, 2024 at 6:49 PM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
>
> On Thu, Feb 29, 2024 at 09:39:17PM -0500, Kent Overstreet wrote:
> > On Fri, Mar 01, 2024 at 01:16:18PM +1100, NeilBrown wrote:
> > > Insisting that GFP_KERNEL allocations never returned NULL would allow us
> > > to remove a lot of untested error handling code....
> >
> > If memcg ever gets enabled for all kernel side allocations we might
> > start seeing failures of GFP_KERNEL allocations.
>
> Why would we want that behaviour?  A memcg-limited allocation should
> behave like any other allocation -- block until we've freed some other
> memory in this cgroup, either by swap or killing or ...

I am not closely following this thread (although it is very
interesting), but I don't think the same rules fully apply to
memcg-limited allocations, specifically because the scope of the OOM
killer and of the available resources is limited to the subtree of the
memcg that hit its limit.

Consider a case where a memcg is full of page cache memory that is
mlock()ed by a process in another memcg, or full of tmpfs memory when
there is no swap (or the swap limit is reached).

You can get more creative with this too: start a process in memcg A,
allocate some anon memory, then move the process to memcg B. Now if you
cannot swap, you cannot reclaim the memory from memcg A, and killing
everything in memcg A doesn't help.