Michal Hocko wrote:
> Another issue is that GFP_NOFS is quite often used without any obvious
> reason. It is not clear which lock is held and could be taken from
> the reclaim path. Wouldn't it be much better if the no-recursion
> behavior was bound to the lock scope rather than particular allocation
> request? We already have something like this for PM
> pm_res{trict,tore}_gfp_mask resp. memalloc_noio_{save,restore}. It
> would be great if we could unify this and use the context based NOFS
> in the FS.

Yes, I do want it. I think some of the LSM hooks are called from
GFP_NOFS context, but it is too difficult for me to tell whether we are
using GFP_NOFS correctly.

> First we shouldn't retry endlessly and rather fail the allocation and
> allow the FS to handle the error. As per my experiments most FS cope
> with that quite reasonably. Btrfs unfortunately handles many of those
> failures by BUG_ON which is really unfortunate.

If it turns out that we are using GFP_NOFS from LSM hooks correctly,
I'd expect such GFP_NOFS allocations to retry unless SIGKILL is pending.
Filesystems might be able to handle GFP_NOFS allocation failures, but
userspace might not be able to handle system call failures caused by
GFP_NOFS allocation failures; OOM-unkillable processes might
unexpectedly terminate as if they were OOM-killed. Would you please add
GFP_KILLABLE to the list of topics?

> - OOM killer has been discussed a lot throughout this year. We have
>   discussed this topic the last year at LSF and there has been quite some
>   progress since then. We have async memory tear down for the OOM victim
>   [2] which should help in many corner cases. We are still waiting
>   to make mmap_sem for write killable which would help in some other
>   classes of corner cases. Whatever we do, however, will not work in
>   100% cases. So the primary question is how far are we willing to go to
>   support different corner cases. Do we want to have a
>   panic_after_timeout global knob, allow multiple OOM victims after
>   a timeout?

A sequence for handling any corner case (as long as the OOM killer is
invoked) was proposed at
http://lkml.kernel.org/r/201601222259.GJB90663.MLOJtFFOQFVHSO@xxxxxxxxxxxxxxxxxxx .

> - sysrq+f to trigger the oom killer follows some heuristics used by the
>   OOM killer invoked by the system which means that it is unreliable
>   and it might skip to kill any task without any explanation why. The
>   semantic of the knob doesn't seem to clear and it has been even
>   suggested [3] to remove it altogether as an unuseful debugging aid. Is
>   this really a general consensus?

Even if we remove SysRq-f from future kernels, please give us a fix for
current kernels. ;-)
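
To make the "context based NOFS" idea concrete, here is a minimal
userspace model of a scope-bound no-FS-recursion flag, patterned after
the save/restore shape of memalloc_noio_{save,restore} mentioned above.
This is only a sketch: every name and flag value below (model_nofs_save,
model_nofs_restore, model_effective_gfp, MODEL_*) is hypothetical and
made up for illustration, a single static variable stands in for the
per-task flags, and none of it is kernel code.

```c
/*
 * Userspace model of a scope-based NOFS context.
 * All names and values are hypothetical; this is not kernel API.
 */

/* Model allocation flags: IO may be started, FS may be re-entered. */
#define MODEL_GFP_IO      0x1u
#define MODEL_GFP_FS      0x2u
#define MODEL_GFP_KERNEL  (MODEL_GFP_IO | MODEL_GFP_FS)

/* Model per-task flag: "do not recurse into the FS from reclaim". */
#define MODEL_PF_MEMALLOC_NOFS 0x1u

/* Stands in for the current task's flags (one task modelled). */
static unsigned int model_task_flags;

/* Enter a NOFS scope; returns the previous state so scopes can nest. */
static unsigned int model_nofs_save(void)
{
	unsigned int old = model_task_flags & MODEL_PF_MEMALLOC_NOFS;

	model_task_flags |= MODEL_PF_MEMALLOC_NOFS;
	return old;
}

/* Leave a NOFS scope, putting back whatever state was saved on entry. */
static void model_nofs_restore(unsigned int old)
{
	model_task_flags = (model_task_flags & ~MODEL_PF_MEMALLOC_NOFS) | old;
}

/*
 * What the allocator would apply to every request: inside a NOFS scope
 * the FS bit is masked off regardless of what the caller passed.
 */
static unsigned int model_effective_gfp(unsigned int gfp)
{
	if (model_task_flags & MODEL_PF_MEMALLOC_NOFS)
		gfp &= ~MODEL_GFP_FS;
	return gfp;
}
```

With something of this shape, the code that takes the lock would call
the save function right after acquiring it and the restore function
right before releasing it, and callees such as LSM hooks could keep
passing a plain GFP_KERNEL-like mask without knowing which locks are
held.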