On Mon, Sep 25, 2017 at 2:17 PM, Chris Mason <clm@xxxxxx> wrote:
>
> My understanding is that for order-0 page allocations and
> kmem_cache_alloc(buffer_heads), GFP_NOFS is going to either loop forever or
> at the very least OOM kill something before returning NULL?

That should generally be true. We've occasionally screwed up in the
VM, so an explicit GFP_NOFAIL would definitely be best if we then
remove the looping in fs/buffer.c.

>> What is it that triggers that many buffer heads in the first place?
>> Because I thought we'd gotten to the point where all normal file IO
>> can avoid the buffer heads entirely, and just directly work with
>> making bio's from the pages.
>
> We're not triggering free_more_memory(). I ran a probe on a few production
> machines and it didn't fire once over a 90 minute period of heavy load. The
> main target of Jens' patchset was preventing shrink_inactive_list() ->
> wakeup_flusher_threads() from creating millions of work items without any
> rate limiting at all.

So the two things I reacted to in that patch series were apparently
things that you guys don't even care about.

I reacted to the fs/buffer.c code, and to the change in laptop mode to
not do circular writeback.

The latter is another "it's probably ok, but it can be a subtle
change". In particular, things that re-write the same thing over and
over again can get very different behavior, even when you write out
"all" pages.

And I'm assuming you're not using laptop mode either on your servers
(that sounds insane, but I remember somebody actually ended up using
laptop mode even on servers, simply because they did *not* want the
regular timed writeback model, so it's not quite as insane as it
sounds).

              Linus