Re: [LSF/MM TOPIC] proposals for topics

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Michal Hocko wrote:
>   Another issue is that GFP_NOFS is quite often used without any obvious
>   reason. It is not clear which lock is held and could be taken from
>   the reclaim path. Wouldn't it be much better if the no-recursion
>   behavior was bound to the lock scope rather than particular allocation
>   request? We already have something like this for PM
>   pm_res{trict,tore}_gfp_mask resp. memalloc_noio_{save,restore}. It
>   would be great if we could unify this and use the context based NOFS
>   in the FS.

Yes, I do want it. I think some of LSM hooks are called from GFP_NOFS context
but it is too difficult for me to tell whether we are using GFP_NOFS correctly.

>   First we shouldn't retry endlessly and rather fail the allocation and
>   allow the FS to handle the error. As per my experiments most FS cope
>   with that quite reasonably. Btrfs unfortunately handles many of those
>   failures by BUG_ON which is really unfortunate.

If it turned out that we are using GFP_NOFS from LSM hooks correctly,
I'd expect such GFP_NOFS allocations retry unless SIGKILL is pending.
Filesystems might be able to handle GFP_NOFS allocation failures. But
userspace might not be able to handle system call failures caused by
GFP_NOFS allocation failures; OOM-unkillable processes might unexpectedly
terminate as if they are OOM-killed. Would you please add GFP_KILLABLE
to list of the topics?

> - OOM killer has been discussed a lot throughout this year. We have
>   discussed this topic the last year at LSF and there has been quite some
>   progress since then. We have async memory tear down for the OOM victim
>   [2] which should help in many corner cases. We are still waiting
>   to make mmap_sem for write killable which would help in some other
>   classes of corner cases. Whatever we do, however, will not work in
>   100% cases. So the primary question is how far are we willing to go to
>   support different corner cases. Do we want to have a
>   panic_after_timeout global knob, allow multiple OOM victims after
>   a timeout?

A sequence for handling any corner case (as long as OOM killer is
invoked) was proposal at
http://lkml.kernel.org/r/201601222259.GJB90663.MLOJtFFOQFVHSO@xxxxxxxxxxxxxxxxxxx .

> - sysrq+f to trigger the oom killer follows some heuristics used by the
>   OOM killer invoked by the system which means that it is unreliable
>   and it might skip to kill any task without any explanation why. The
>   semantic of the knob doesn't seem to clear and it has been even
>   suggested [3] to remove it altogether as an unuseful debugging aid. Is
>   this really a general consensus?

Even if we remove SysRq-f from future kernels, please give us a fix for
current kernels. ;-)

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>



[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]