On Fri, Apr 12, 2019 at 7:14 AM Daniel Colascione <dancol@xxxxxxxxxx> wrote:
>
> On Thu, Apr 11, 2019 at 11:53 PM Michal Hocko <mhocko@xxxxxxxxxx> wrote:
> >
> > On Thu 11-04-19 08:33:13, Matthew Wilcox wrote:
> > > On Wed, Apr 10, 2019 at 06:43:53PM -0700, Suren Baghdasaryan wrote:
> > > > Add new SS_EXPEDITE flag to be used when sending SIGKILL via
> > > > pidfd_send_signal() syscall to allow expedited memory reclaim of the
> > > > victim process. The usage of this flag is currently limited to SIGKILL
> > > > signal and only to privileged users.
> > >
> > > What is the downside of doing expedited memory reclaim? i.e. why not do it
> > > every time a process is going to die?
> >
> > Well, you are tearing down an address space which might be still in use
> > because the task is not fully dead yet. So there are two downsides AFAICS.
> > Core dumping which will not see the reaped memory so the resulting
>
> Test for SIGNAL_GROUP_COREDUMP before doing any of this then. If you
> try to start a core dump after reaping begins, too bad: you could have
> raced with process death anyway.
>
> > coredump might be incomplete. And unexpected #PF/gup on the reaped
> > memory will result in SIGBUS.
>
> It's a dying process. Why even bother returning from the fault
> handler? Just treat that situation as a thread exit. There's no need
> to make this observable to userspace at all.

Just for clarity, checking the code, I think we already do this.
zap_other_threads sets SIGKILL pending on every thread in the group,
and we'll handle SIGKILL in the process of taking any page fault or
doing any system call, so I don't think it's actually possible for a
thread in a dying process to observe the SIGBUS that reaping in theory
can generate.
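
For illustration only, here is a minimal userspace sketch of what "not
observable to userspace" looks like from the parent's side. It does not
exercise the reaper or the proposed SS_EXPEDITE flag; it uses plain
kill() rather than pidfd_send_signal(), and the helper name
kill_child_and_report_signal() is hypothetical. It only shows that a
parent waiting on a SIGKILLed child sees SIGKILL as the termination
signal.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the patch): fork a child that just
 * sleeps, send it SIGKILL, and return the signal that terminated it.
 * Returns 0 if the child exited normally and -1 on error.
 */
int kill_child_and_report_signal(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: block until a signal arrives. SIGKILL cannot be caught. */
        for (;;)
            pause();
    }

    /* Parent: plain kill(); the proposed flag is deliberately not used. */
    if (kill(pid, SIGKILL) < 0)
        return -1;

    int status;
    pid_t ret;
    do {
        ret = waitpid(pid, &status, 0);   /* retry if interrupted */
    } while (ret < 0 && errno == EINTR);
    if (ret < 0)
        return -1;

    /* What the parent observes is the terminating signal. */
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

Calling kill_child_and_report_signal() from a small main() and
comparing the result against SIGKILL is enough to try it out.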