Re: [LSF/MM TOPIC] proposals for topics

Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx> · Tue, 26 Jan 2016 00:08:28 +0900

Michal Hocko wrote:
>   Another issue is that GFP_NOFS is quite often used without any obvious
>   reason. It is not clear which lock is held and could be taken from
>   the reclaim path. Wouldn't it be much better if the no-recursion
>   behavior was bound to the lock scope rather than particular allocation
>   request? We already have something like this for PM
>   pm_res{trict,tore}_gfp_mask resp. memalloc_noio_{save,restore}. It
>   would be great if we could unify this and use the context based NOFS
>   in the FS.

Yes, I do want it. I think some LSM hooks are called from GFP_NOFS context,
but it is too difficult for me to tell whether we are using GFP_NOFS correctly
there.
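
To make the scope-bound idea concrete, here is a minimal userspace sketch
modelled on the memalloc_noio_{save,restore} pattern mentioned above. Every
sketch_* name and flag value is hypothetical and made up for illustration;
this is not existing kernel API, and real code would keep the context flag
in the task struct rather than in a global.

```c
#include <assert.h>

/* Hypothetical flag bits, loosely modelled on the kernel's gfp flags. */
#define SKETCH_GFP_IO     0x1u
#define SKETCH_GFP_FS     0x2u
#define SKETCH_GFP_KERNEL (SKETCH_GFP_IO | SKETCH_GFP_FS)

/* Hypothetical context bit: "a lock the reclaim path may take is held". */
#define SKETCH_PF_MEMALLOC_NOFS 0x1u

/* Stand-in for per-task flags; a real implementation would be per task. */
static unsigned int sketch_task_flags;

/* Enter a NOFS scope. Returns the previous state so that scopes can nest. */
static unsigned int sketch_memalloc_nofs_save(void)
{
	unsigned int old = sketch_task_flags & SKETCH_PF_MEMALLOC_NOFS;

	sketch_task_flags |= SKETCH_PF_MEMALLOC_NOFS;
	return old;
}

/* Leave a NOFS scope by putting back the state returned by the save call. */
static void sketch_memalloc_nofs_restore(unsigned int old)
{
	sketch_task_flags = (sketch_task_flags & ~SKETCH_PF_MEMALLOC_NOFS) | old;
}

/* What the allocator would do: drop the FS bit while the scope is active. */
static unsigned int sketch_effective_gfp(unsigned int gfp)
{
	if (sketch_task_flags & SKETCH_PF_MEMALLOC_NOFS)
		gfp &= ~SKETCH_GFP_FS;
	return gfp;
}

/* A caller that asks for GFP_KERNEL while inside the scope. */
static unsigned int sketch_alloc_under_fs_lock(void)
{
	unsigned int saved = sketch_memalloc_nofs_save();
	unsigned int gfp = sketch_effective_gfp(SKETCH_GFP_KERNEL);

	sketch_memalloc_nofs_restore(saved);
	assert(gfp == SKETCH_GFP_IO);	/* the FS bit was stripped by the scope */
	return gfp;
}
```

With something like this, callers (including LSM hooks) could keep passing
GFP_KERNEL, and only the code that takes the lock would need to know that FS
recursion is unsafe.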

>   First we shouldn't retry endlessly and rather fail the allocation and
>   allow the FS to handle the error. As per my experiments most FS cope
>   with that quite reasonably. Btrfs unfortunately handles many of those
>   failures by BUG_ON which is really unfortunate.

If it turns out that we are using GFP_NOFS from LSM hooks correctly,
I'd expect such GFP_NOFS allocations to retry unless SIGKILL is pending.
Filesystems might be able to handle GFP_NOFS allocation failures. But
userspace might not be able to handle system call failures caused by
GFP_NOFS allocation failures; OOM-unkillable processes might unexpectedly
terminate as if they had been OOM-killed. Would you please add GFP_KILLABLE
to the list of topics?
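
The semantics I have in mind for GFP_KILLABLE are roughly "keep retrying, and
give up only when the caller is about to die anyway". A rough userspace
sketch follows. GFP_KILLABLE is only a proposal, and every sketch_* name
below is hypothetical; the flaky allocator exists only to make the loop
observable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the calling task's signal state. */
struct sketch_task {
	bool sigkill_pending;
};

/* One allocation attempt; returns NULL on failure. */
typedef void *(*sketch_try_alloc_fn)(size_t size, void *ctx);

/*
 * Retry until the attempt succeeds, and return NULL only if SIGKILL is
 * pending. The kernel would reclaim or wait between attempts.
 */
static void *sketch_alloc_killable(struct sketch_task *task, size_t size,
				   sketch_try_alloc_fn try_alloc, void *ctx)
{
	for (;;) {
		void *p = try_alloc(size, ctx);

		if (p)
			return p;
		if (task->sigkill_pending)
			return NULL;	/* the caller is dying; failing is harmless */
	}
}

/* Test helper: fails failures_left times, then falls back to malloc(). */
struct sketch_flaky {
	int failures_left;
};

static void *sketch_flaky_alloc(size_t size, void *ctx)
{
	struct sketch_flaky *f = ctx;

	if (f->failures_left > 0) {
		f->failures_left--;
		return NULL;
	}
	return malloc(size);
}
```

A task without SIGKILL pending never sees the failure, so an OOM-unkillable
process would not get an unexpected error from a system call. A task with
SIGKILL pending gets NULL after the first failed attempt and can unwind.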

> - OOM killer has been discussed a lot throughout this year. We have
>   discussed this topic the last year at LSF and there has been quite some
>   progress since then. We have async memory tear down for the OOM victim
>   [2] which should help in many corner cases. We are still waiting
>   to make mmap_sem for write killable which would help in some other
>   classes of corner cases. Whatever we do, however, will not work in
>   100% cases. So the primary question is how far are we willing to go to
>   support different corner cases. Do we want to have a
>   panic_after_timeout global knob, allow multiple OOM victims after
>   a timeout?

A sequence for handling any corner case (as long as the OOM killer is
invoked) was proposed at
http://lkml.kernel.org/r/201601222259.GJB90663.MLOJtFFOQFVHSO@xxxxxxxxxxxxxxxxxxx .

> - sysrq+f to trigger the oom killer follows some heuristics used by the
>   OOM killer invoked by the system which means that it is unreliable
>   and it might skip killing any task without any explanation why. The
>   semantic of the knob doesn't seem to be clear and it has been even
>   suggested [3] to remove it altogether as an unuseful debugging aid. Is
>   this really a general consensus?

Even if we remove SysRq-f from future kernels, please give us a fix for
current kernels. ;-)
