On 4/28/20 11:48 PM, David Rientjes wrote:
> On Tue, 28 Apr 2020, Vlastimil Babka wrote:
>
> Yes, order-0 reclaim capture is interesting since the issue being
> reported here is userspace going out to lunch because it loops for an
> unbounded amount of time trying to get above a watermark where it's
> allowed to allocate, while other consumers are depleting that resource.
>
> We actually prefer to oom kill earlier rather than being put in a
> perpetual state of aggressive reclaim that affects all allocators, and
> the unbounded nature of those allocations leads to very poor results
> for everybody.

Sure. My vague impression is that your (and similar cloud companies')
workloads are designed to maximize machine utilization, and overshooting
and killing something as a result is no big deal. Then you perhaps have
a higher probability of hitting this state, and on the other hand, even
an occasional premature oom kill is not a big deal?

My concern is about workloads not designed that way, where a premature
oom kill due to temporarily higher reclaim activity together with a
burst of incoming network packets would result in e.g. killing an
important database. There, the tradeoff looks different.

> I'm happy to scope this solely to an order-0 reclaim capture. I'm not
> sure if I'm clear on whether this has been worked on before and
> patches existed in the past? Andrew mentioned some.

I don't recall any, so it might have been before my time.

> Somewhat related to what I described in the changelog: we lost the
> "page allocation stalls" artifacts in the kernel log for 4.15. The
> commit description references an asynchronous mechanism for getting
> this information; I don't know where this mechanism currently lives.
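
FWIW, to make sure we mean the same thing by "order-0 reclaim capture",
my rough mental model is below. This is purely an illustrative sketch,
not a real patch: it loosely reuses the capture_control idea that
direct compaction has used since 5e1f0f098b46 ("mm, compaction: capture
a page under direct compaction"), and free_page_maybe_capture() is a
made-up hook name for the freeing path. Details like zone/migratetype
constraints and per-cpu list interactions are deliberately glossed over.

/*
 * Hypothetical: the freeing path offers an order-0 page to a task
 * that is currently in direct reclaim, instead of putting it on the
 * freelist where any concurrent allocator could grab it.
 */
static bool free_page_maybe_capture(struct page *page)
{
	struct capture_control *capc = current->capture_control;

	if (capc && !capc->page) {
		capc->page = page;	/* never hits the freelist */
		return true;
	}
	return false;
}

static inline struct page *
__alloc_pages_direct_reclaim(gfp_t gfp_mask, unsigned int order,
			     unsigned int alloc_flags,
			     const struct alloc_context *ac,
			     unsigned long *did_some_progress)
{
	struct capture_control capc = { .page = NULL };

	current->capture_control = &capc;
	*did_some_progress = __perform_reclaim(gfp_mask, order, ac);
	current->capture_control = NULL;

	/*
	 * If the freeing path handed us a page directly, we are done:
	 * no trip through the freelists, and no watermark check to
	 * lose against other allocators depleting what we reclaimed.
	 */
	if (capc.page) {
		prep_new_page(capc.page, order, gfp_mask, alloc_flags);
		return capc.page;
	}

	if (unlikely(!*did_some_progress))
		return NULL;

	/* Otherwise, the usual racy attempt. */
	return get_page_from_freelist(gfp_mask, order, alloc_flags, ac);
}

The point being that the reclaiming task gets a guaranteed page for its
reclaim effort, bounded by the reclaim work itself, rather than looping
on a watermark it may never win.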