Michal Hocko wrote:
> On Wed 19-10-16 20:27:53, Tetsuo Handa wrote:
> [...]
> > What I'm talking about is "why don't you stop playing whack-a-mole games
> > with missing warn_alloc() calls". I don't blame you for not having a good
> > idea, but I blame you for not having a reliable warn_alloc() mechanism.
>
> Look, it seems pretty clear that our priorities and views are quite
> different. While I believe that we should solve real issues in a
> reliable and robust way you seem to love to have as much reporting as
> possible. I do agree that reporting is important part of debugging of
> problems but as your previous attempts for the allocation watchdog show
> a proper and bullet proof reporting requires state tracking and is in
> general too complex for something that doesn't happen in most properly
> configured systems. Maybe there are other ways but my time is better
> spent on something more useful - like making the direct reclaim path
> more deterministic without any unbound loops.

Properly configured systems should not be bothered by low memory situations.
But there are systems which are bothered by low memory situations. It is
pointless to refer to "properly configured systems" as a reason not to add a
watchdog. It is administrators who decide whether to use a watchdog.

>
> So let's agree to disagree about importance of the reliability
> warn_alloc. I see it as an improvement which doesn't really have to be
> perfect.

I don't expect kmallocwd alone to be perfect. I expect kmallocwd to serve as
a hook. For example, it will be possible to turn on collecting perf data when
kmallocwd finds a stalling thread and turn it off when kmallocwd finds none.
Since the necessary information is stored in the task struct, it will be easy
to include it in the perf data. Likewise, it will be easy to extract it using
a script for /usr/bin/crash when an administrator has captured a vmcore image
of a stalling KVM guest.
Sending vmcore images to support centers is difficult due to file size and
security reasons. It is nice if we can get a clue by reading the task list.
But warn_alloc() can't serve as a hook. I see kmallocwd as an improvement
which doesn't really have to be perfect.

By the way, regarding the "making the direct reclaim path more deterministic"
part, I wish that we could

  (1) introduce phased watermarks which vary based on the stage of the
      reclaim operation (e.g. watermark_lower()/watermark_higher(), which
      resemble preempt_disable()/preempt_enable() but are propagated to
      other threads when delegating operations needed for reclaim to other
      threads).

  (2) introduce dedicated kernel threads which each perform only a specific
      reclaim operation, using the watermark propagated from other threads
      which perform a different reclaim operation.

  (3) remove direct reclaim, which bothers callers with managing the correct
      GFP_NOIO / GFP_NOFS / GFP_KERNEL distinction.

Then, normal ___GFP_DIRECT_RECLAIM callers can simply wait with
wait_event(get_pages_from_freelist() succeeds) rather than polling with
complicated short sleeps. This will significantly save CPU resources
(especially when oom_lock is held) which are wasted by multiple concurrent
direct reclaim activities.
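
To make the "wait rather than poll" point concrete, here is a rough userspace
toy model, not kernel code. It only counts how often a waiter has to wake up
and re-check the "enough free pages" condition under each scheme. Everything
in it is an assumption made for illustration: the names run_polling(),
run_event_driven() and page_freed_at() are hypothetical, and so is the rate
of one freed page every FREE_INTERVAL ticks.

```python
# Toy model (assumption): reclaim frees exactly one page every FREE_INTERVAL ticks.
FREE_INTERVAL = 100
MAX_TICKS = 1000


def page_freed_at(tick):
    """Return True if the (hypothetical) reclaim side frees a page at this tick."""
    return tick > 0 and tick % FREE_INTERVAL == 0


def run_polling(pages_needed):
    """Waiter wakes up every tick (a short sleep) and re-checks the condition.

    Returns (number_of_checks, tick_at_which_allocation_succeeded_or_None).
    """
    free_pages = 0
    checks = 0
    for tick in range(MAX_TICKS):
        if page_freed_at(tick):
            free_pages += 1
        checks += 1  # woke up from the short sleep and looked at the free list
        if free_pages >= pages_needed:
            return checks, tick
    return checks, None


def run_event_driven(pages_needed):
    """Waiter sleeps until the freeing side wakes it, in the spirit of wait_event().

    Returns (number_of_checks, tick_at_which_allocation_succeeded_or_None).
    """
    free_pages = 0
    checks = 1  # one initial check before going to sleep
    if free_pages >= pages_needed:
        return checks, 0
    for tick in range(MAX_TICKS):
        if not page_freed_at(tick):
            continue  # nothing was freed, so the waiter stays asleep
        free_pages += 1
        checks += 1  # woken up because a page was freed
        if free_pages >= pages_needed:
            return checks, tick
    return checks, None


if __name__ == "__main__":
    print("polling      (checks, tick):", run_polling(4))
    print("event-driven (checks, tick):", run_event_driven(4))
```

In this model both waiters get their pages at the same tick, but the polling
waiter re-checks on every tick while the event-driven waiter re-checks only
when a page is actually freed. With many concurrent waiters the difference is
multiplied, which is the CPU cost referred to above. The model says nothing
about wakeup latency or lock contention in a real kernel.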