On 5/20/21 4:29 PM, Aaron Tomlin wrote:
> A customer experienced a low-memory situation and decided to issue a
> SIGKILL (i.e. a fatal signal). Instead of promptly terminating as one
> would expect, the aforementioned task remained unresponsive.
>
> Further investigation indicated that the task was "stuck" in the
> reclaim/compaction retry loop. Now, it does not make sense to retry
> compaction when a fatal signal is pending.
>
> In the context of try_to_compact_pages(), indeed COMPACT_SKIPPED can be
> returned; albeit, not every zone, on the zone list, would be considered
> in the case a fatal signal is found to be pending.
> Yet, in should_compact_retry(), given the last known compaction result,
> each zone, on the zone list, can be considered/or checked
> (see compaction_zonelist_suitable()). For example, if a zone was found
> to succeed, then reclaim/compaction would be tried again
> (notwithstanding the above).
>
> This patch ensures that compaction is not needlessly retried
> irrespective of the last known compaction result e.g. if it was skipped,
> in the unlikely case a fatal signal is found pending.
> So, OOM is at least attempted.
>
> Signed-off-by: Aaron Tomlin <atomlin@xxxxxxxxxx>

Reviewed-by: Vlastimil Babka <vbabka@xxxxxxx>

> ---
>  mm/page_alloc.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index aaa1655cf682..b317057ac186 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -4252,6 +4252,9 @@ should_compact_retry(struct alloc_context *ac, int order, int alloc_flags,
>  	if (!order)
>  		return false;
>
> +	if (fatal_signal_pending(current))
> +		return false;
> +
>  	if (compaction_made_progress(compact_result))
>  		(*compaction_retries)++;
>
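
For anyone following along, here is a rough userspace sketch of the behaviour
the hunk changes. It is not kernel code and not the actual slowpath logic:
struct fake_task, the fake_compact_result enum, FAKE_MAX_COMPACT_RETRIES,
fatal_signal_pending_sim(), should_compact_retry_sim() and simulate_slowpath()
are all hypothetical stand-ins, and the "skipped means retry" rule is a
deliberate simplification of the compaction_zonelist_suitable() case described
in the changelog. It only shows that checking for a pending fatal signal
before looking at the last compaction result lets the retry loop terminate.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the state fatal_signal_pending(current) reads. */
struct fake_task {
	bool sigkill_pending;
};

/* Hypothetical, heavily reduced set of compaction outcomes. */
enum fake_compact_result {
	FAKE_COMPACT_SKIPPED,
	FAKE_COMPACT_PROGRESS,
};

/* Assumed retry cap for this sketch only. */
#define FAKE_MAX_COMPACT_RETRIES 16

bool fatal_signal_pending_sim(const struct fake_task *t)
{
	return t->sigkill_pending;
}

/* Simplified model of the retry decision, with the new check in place. */
bool should_compact_retry_sim(const struct fake_task *t, int order,
			      enum fake_compact_result result,
			      int *compaction_retries)
{
	if (!order)
		return false;

	/* The added check: give up before consulting the last result. */
	if (fatal_signal_pending_sim(t))
		return false;

	if (result == FAKE_COMPACT_PROGRESS)
		(*compaction_retries)++;

	/*
	 * Assumption: a skipped compaction always looks worth retrying,
	 * mimicking a zone that is still found suitable.
	 */
	if (result == FAKE_COMPACT_SKIPPED)
		return true;

	return *compaction_retries <= FAKE_MAX_COMPACT_RETRIES;
}

/*
 * Run a fake retry loop in which compaction is always skipped and a fatal
 * signal arrives at iteration kill_at (use -1 for "never"). Returns the
 * number of retries granted before the loop stopped.
 */
int simulate_slowpath(int order, int kill_at, int max_iters)
{
	struct fake_task t = { .sigkill_pending = false };
	int retries = 0;
	int iter;

	assert(max_iters >= 0);

	for (iter = 0; iter < max_iters; iter++) {
		if (iter == kill_at)
			t.sigkill_pending = true;
		if (!should_compact_retry_sim(&t, order, FAKE_COMPACT_SKIPPED,
					      &retries))
			break;
	}
	return iter;
}
```

With these assumptions, simulate_slowpath(3, 5, 1000) stops after 5 retries,
whereas simulate_slowpath(3, -1, 1000) only stops because it hits the
artificial 1000-iteration cap, which is the "stuck" case from the report.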