Re: [PATCH] mm: avoid livelock on !__GFP_FS allocations

On Wed, Oct 26, 2011 at 12:22:14AM -0700, Colin Cross wrote:
> On Wed, Oct 26, 2011 at 12:10 AM, David Rientjes <rientjes@xxxxxxxxxx> wrote:
> > On Tue, 25 Oct 2011, Colin Cross wrote:
> >
> >> > gfp_allowed_mask is initialized to GFP_BOOT_MASK to start so that __GFP_FS
> >> > is never allowed before the slab allocator is completely initialized, so
> >> > you've now implicitly made all early boot allocations to be __GFP_NORETRY
> >> > even though they may not pass it.
> >>
> >> Only before interrupts are enabled, and then isn't it vulnerable to
> >> the same livelock?  Interrupts are off, single cpu, kswapd can't run.
> >> If an allocation ever failed, which seems unlikely, why would retrying
> >> help?
> >>
> >
> > If you want to claim gfp_allowed_mask as a pm-only entity, then I see no
> > problem with this approach.  However, if gfp_allowed_mask would be allowed
> > to temporarily change after init for another purpose then it would make
> > sense to retry because another allocation with __GFP_FS on another cpu or
> > kswapd could start making progress, which could allow for future memory freeing.
> >
> > The suggestion to add a hook directly into a pm-interface was so that we
> > could isolate it only to suspend and, to me, is the most maintainable
> > solution.
> >
> 
> pm_restrict_gfp_mask seems to claim gfp_allowed_mask as owned by pm at runtime:
> "gfp_allowed_mask also should only be modified with pm_mutex held,
> unless the suspend/hibernate code is guaranteed not to run in parallel
> with that modification"
> 
> I think we've wrapped around to Mel's original patch, which adds a
> pm_suspending() helper that is implemented next to
> pm_restrict_gfp_mask.  His patch puts the check inside
> !did_some_progress instead of should_alloc_retry, which I prefer as it
> at least keeps trying until reclaim isn't working.  Pekka was trying
> to avoid adding pm-specific checks into the allocator, which is why I
> stuck to the symptom (__GFP_FS is clear) rather than the cause (PM).
> 

Right now, I'm still not seeing a problem with the pm_suspending() check,
as it is made for a corner case in a very slow path and is
self-documenting. This thread has died down somewhat and there is still
no fix merged. Is someone cooking up a patch they would prefer as an
alternative? If not, I'm going to resubmit the fix based on
pm_suspending().
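
For reference, the shape of the check under discussion can be sketched in
userspace C roughly as below. This is an illustration only, not the patch
itself: the flag values, MAX_LOOPS, reclaim_stub(), alloc_slowpath() and the
use_pm_check switch are hypothetical, and pm_suspending() is assumed to test
whether gfp_allowed_mask has lost its I/O and FS bits.

```c
#include <stdbool.h>

typedef unsigned int gfp_t;

/* Illustrative bit values only; the real kernel flags differ. */
#define __GFP_WAIT 0x4u
#define __GFP_IO   0x1u
#define __GFP_FS   0x2u
#define GFP_IOFS   (__GFP_IO | __GFP_FS)
#define GFP_KERNEL (__GFP_WAIT | __GFP_IO | __GFP_FS)

/* Stand-in for an endless retry loop (hypothetical bound). */
#define MAX_LOOPS 1000

enum alloc_result { ALLOC_OK, ALLOC_FAILED, ALLOC_STUCK };

static gfp_t gfp_allowed_mask = GFP_KERNEL;
static gfp_t saved_gfp_mask;

/* Suspend path: forbid allocations that need I/O or filesystem access. */
static void pm_restrict_gfp_mask(void)
{
    saved_gfp_mask = gfp_allowed_mask;
    gfp_allowed_mask &= ~GFP_IOFS;
}

/* Resume path: put the saved mask back. */
static void pm_restore_gfp_mask(void)
{
    if (saved_gfp_mask) {
        gfp_allowed_mask = saved_gfp_mask;
        saved_gfp_mask = 0;
    }
}

/* Assumed helper: true while PM has stripped the I/O and FS bits. */
static bool pm_suspending(void)
{
    return (gfp_allowed_mask & GFP_IOFS) != GFP_IOFS;
}

/* Hypothetical reclaim: it can only free pages via the filesystem. */
static unsigned long reclaim_stub(gfp_t gfp_mask)
{
    return (gfp_mask & __GFP_FS) ? 1 : 0;
}

/*
 * Toy slow path. Reclaim progress counts as success. Without progress,
 * the PM check fails the allocation instead of retrying. ALLOC_STUCK
 * models the livelock: the loop bound ran out with no progress.
 */
static enum alloc_result alloc_slowpath(gfp_t gfp_mask, bool use_pm_check)
{
    gfp_mask &= gfp_allowed_mask;

    for (int loops = 0; loops < MAX_LOOPS; loops++) {
        unsigned long did_some_progress = reclaim_stub(gfp_mask);

        if (did_some_progress)
            return ALLOC_OK;
        if (use_pm_check && pm_suspending())
            return ALLOC_FAILED;
    }
    return ALLOC_STUCK;
}
```

In this model, calling pm_restrict_gfp_mask() and then
alloc_slowpath(GFP_KERNEL, true) fails the allocation on the first pass with
no reclaim progress, while passing false spins until the loop bound, which
is the behaviour the check is meant to avoid.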

-- 
Mel Gorman
SUSE Labs
