Hello,

On Sun, Oct 25, 2015 at 07:52:59PM +0900, Tetsuo Handa wrote:
...
> This means that any kernel code which invokes a __GFP_WAIT allocation
> might fail to do (4) when invoked via workqueue, regardless of flags
> passed to alloc_workqueue()?

Sounds that way and yeah (3) should technically be okay and that's why
HIGHPRI was implemented the way it was at the beginning; however, in
practice, this is the first time it's noticeable in all the years. I
think it comes down to the fact that there just aren't many places
which need such looping behavior and even in those places it's often
very undesirable to busy-loop while not making forward-progress (and
if forward-progress is being made, it won't be indefinite).

> I think that inserting a short sleep into page allocator is better
> because the vmstat_update fix will not require workqueue tweaks if
> we sleep inside page allocator. Also, from the point of view of
> protecting page allocator from going unresponsive when hundreds of tasks
> started busy-waiting at __alloc_pages_slowpath() because we can observe
> that XXX value in the "MemAlloc-Info: XXX stalling task," line grows
> when we are unable to make forward progress.

This looks good to me too; however, it still needs a dedicated
workqueue with WQ_MEM_RECLAIM set. That deadlock probably is very
unlikely as the side effect of vmstat failing to execute due to worker
exhaustion is more memory reclaim but it still is theoretically
possible and it could just be that it happens at low enough frequency
that it hasn't been reported yet.

Thanks.

--
tejun
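
For readers following the thread, the reasoning above can be illustrated with a small userspace toy model. This is not kernel code and not anything posted in the thread. The function name, its parameters, and the one-pool, one-pending-item setup are all hypothetical simplifications. It assumes that a worker which never sleeps keeps the pool from dispatching other work, and that a rescuer (what WQ_MEM_RECLAIM provides) only matters once the stuck worker sleeps and no new worker can be created.

```python
def vmstat_update_runs(allocator_sleeps, can_spawn_worker, has_rescuer, ticks=10):
    """Toy model of a single per-CPU worker pool (hypothetical, simplified).

    Returns the tick at which the pending vmstat_update item ran,
    or None if it never ran within `ticks` iterations.
    """
    pending = ["vmstat_update"]  # queued behind the allocating work item
    for tick in range(ticks):
        # The only active worker is stuck retrying an allocation.
        if not allocator_sleeps:
            # It never blocks, so the pool still counts it as running and
            # does not wake or create another worker for pending items.
            continue
        # The worker sleeps briefly, so the pool now wants another worker.
        if can_spawn_worker or has_rescuer:
            pending.pop()
            return tick
        # No worker can be created and there is no rescuer: retry next tick.
    return None


if __name__ == "__main__":
    for args in [(False, True, True), (True, True, False),
                 (True, False, False), (True, False, True)]:
        print(args, "->", vmstat_update_runs(*args))
```

In this model, the short sleep alone is enough while workers can still be created. Only the case with both the sleep and a rescuer makes progress under worker exhaustion, which is the theoretical deadlock the reply describes.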