On Mon, Mar 11, 2019 at 1:46 PM Sultan Alsawaf <sultan@xxxxxxxxxxxxxxx> wrote:
>
> On Mon, Mar 11, 2019 at 01:10:36PM -0700, Suren Baghdasaryan wrote:
> > The idea seems interesting although I need to think about this a bit
> > more. Killing processes based on failed page allocation might backfire
> > during transient spikes in memory usage.
>
> This issue could be alleviated if tasks could be killed and have their pages
> reaped faster. Currently, Linux takes a _very_ long time to free a task's memory
> after an initial privileged SIGKILL is sent to a task, even with the task's
> priority being set to the highest possible (so unwanted scheduler preemption
> starving dying tasks of CPU time is not the issue at play here). I've
> frequently measured the difference in time between when a SIGKILL is sent for a
> task and when free_task() is called for that task to be hundreds of
> milliseconds, which is incredibly long. AFAIK, this is a problem that LMKD
> suffers from as well, and perhaps any OOM killer implementation in Linux, since
> you cannot evaluate the effect you've had on memory pressure by killing a process
> for at least several tens of milliseconds.

Yeah, killing speed is a well-known problem which we are considering
in LMKD. For example, the recent LMKD change to assign the process being
killed to a cpuset cgroup containing big cores cuts the kill time
considerably. This is not ideal and we are thinking about better ways
to expedite the cleanup process.

> > AFAICT the biggest issue with using this approach in userspace is that
> > it's not practically implementable without heavy in-kernel support.
> > How to implement such interaction between kernel and userspace would
> > be an interesting discussion which I would be happy to participate in.
>
> You could signal a lightweight userspace process that has maximum scheduler
> priority and have it kill the tasks it'd like.

This is what LMKD currently is - a userspace RT process.
My point was that this page allocation queue that you implemented
can't be implemented in userspace, at least not without extensive
communication with the kernel.

> Thanks,
> Sultan

Thanks,
Suren.
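
For reference, the kill latency discussed in this thread can be roughly approximated from userspace. The sketch below is only an illustration and is not taken from LMKD or from the patch under discussion: the function name measure_kill_latency and the 64 MiB default are made up for the example. It times the interval between sending SIGKILL to a child and waitpid() returning for it. That is a userspace proxy only; it is not the same event as free_task(), which happens inside the kernel and would need tracing to observe directly.

```python
import os
import signal
import time


def measure_kill_latency(alloc_bytes=64 * 1024 * 1024):
    """Return (seconds from SIGKILL to reap, raw wait status) for a child.

    Hypothetical helper for illustration; assumes a POSIX system with fork().
    """
    ready_r, ready_w = os.pipe()
    pid = os.fork()
    if pid == 0:
        try:
            # Child: touch every page so there is real memory to tear down.
            os.close(ready_r)
            buf = bytearray(alloc_bytes)
            for off in range(0, alloc_bytes, 4096):
                buf[off] = 1
            os.write(ready_w, b"x")  # tell the parent the memory is resident
            while True:
                signal.pause()       # sleep until SIGKILL arrives
        finally:
            os._exit(1)              # only reached if something went wrong

    # Parent: wait until the child has faulted in its memory.
    os.close(ready_w)
    os.read(ready_r, 1)
    os.close(ready_r)

    start = time.monotonic()
    os.kill(pid, signal.SIGKILL)
    _, status = os.waitpid(pid, 0)   # returns once the child can be reaped
    return time.monotonic() - start, status


if __name__ == "__main__":
    elapsed, status = measure_kill_latency()
    print(f"SIGKILL -> reaped in {elapsed * 1000:.2f} ms")
```

Varying alloc_bytes, or running the measurement under memory pressure or on little cores, gives a rough feel for how teardown time scales. The absolute numbers depend heavily on the device and kernel.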