On Fri, 30 Mar 2018 13:53:56 -0700
Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:

> It seems to me that what you're asking for at the moment is
> lower-likelihood-of-failure-than-GFP_KERNEL, and it's not entirely
> clear to me why your allocation is so much more important than other
> allocations in the kernel.

The ring buffer is used for fast tracing and is allocated when a user
requests it. Usually there's plenty of memory, but a user who wants a
lot of tracing events stored may increase its size considerably.

> Also, the pattern you have is very close to that of vmalloc. You're
> allocating one page at a time to satisfy a multi-page request. In lieu
> of actually thinking about what you should do, I might recommend using
> the same GFP flags as vmalloc(), which works out to GFP_KERNEL |
> __GFP_NOWARN (possibly | __GFP_HIGHMEM if you can tolerate having to
> kmap the pages when accessed from within the kernel).

When the ring buffer was first created, we couldn't use vmalloc because
vmalloc access wasn't working in NMIs (that has recently changed, after
a lot of work to handle the faults). But the ring buffer is broken up
into pages (that are sent to the user or to the network), and allocating
one page at a time makes everything work fine.

The issue when someone allocates a large ring buffer is that it will
consume all the memory in the system before it fails. That means there's
a window during which any other allocation will trigger an OOM (which is
what is happening here).

I agree with Joel and Zhaoyang that we shouldn't allocate a ring buffer
if there's not going to be enough memory for it. If we can check the
available memory before we start allocating one page at a time, and that
available memory isn't sufficient, there's no reason to attempt the
allocation at all; we can simply return -ENOMEM to the user and let them
try a smaller size.

I'll take a look at si_mem_available(), which Joel suggested, and see if
we can make that work.

-- Steve