Re: [BUG] fatal hang untarring 90GB file, possibly writeback related.

On Thu, 2011-04-28 at 16:08 +0100, Mel Gorman wrote:

[ text deleted ]

> Another consequence of this patch is that when high order allocations
> are in progress (is the test case fork heavy in any way for
> example? alternatively, it might be something in the storage stack
> that requires high-order allocs) we are no longer necessarily going
> to sleep because of the should_continue_reclaim() check. This could
> explain kswapd-at-99% but would only apply if CONFIG_COMPACTION is
> set (does unsetting CONFIG_COMPACTION help?). If the bug only triggers
> for CONFIG_COMPACTION, does the following *untested* patch help any?

I'm afraid to report that this patch didn't help either.
> 
> (as a warning, I'm offline Friday until Tuesday morning. I'll try to
> check mail over the weekend but it's unlikely I'll find a terminal
> or be allowed to use it without an ass kicking)

Ditto, me too; I will pick this up on Tuesday.
> 
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 148c6e6..c74a501 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1842,15 +1842,22 @@ static inline bool should_continue_reclaim(struct zone *zone,
>  		return false;
>  
>  	/*
> -	 * If we failed to reclaim and have scanned the full list, stop.
> -	 * NOTE: Checking just nr_reclaimed would exit reclaim/compaction far
> -	 *       faster but obviously would be less likely to succeed
> -	 *       allocation. If this is desirable, use GFP_REPEAT to decide
> -	 *       if both reclaimed and scanned should be checked or just
> -	 *       reclaimed
> +	 * For direct reclaimers
> +	 *   If we failed to reclaim and have scanned the full list, stop.
> +	 *   The caller will check congestion and sleep if necessary until
> +	 *   some IO completes.
> +	 * For kswapd
> +	 *   Check just nr_reclaimed. If we are failing to reclaim, we
> +	 *   want to stop this reclaim loop, increase the priority and
> +	 *   go to sleep if necessary to allow IO a chance to complete.
> +	 *   This avoids kswapd going into a busy loop in shrink_zone()
>  	 */
> -	if (!nr_reclaimed && !nr_scanned)
> -		return false;
> +	if (!nr_reclaimed) {
> +		if (current_is_kswapd())
> +			return false;
> +		else if (!nr_scanned)
> +			return false;
> +	}
>  
>  	/*
>  	 * If we have not reclaimed enough pages for compaction and the
> @@ -1924,8 +1931,13 @@ restart:
>  
>  	/* reclaim/compaction might need reclaim to continue */
>  	if (should_continue_reclaim(zone, nr_reclaimed,
> -					sc->nr_scanned - nr_scanned, sc))
> +					sc->nr_scanned - nr_scanned, sc)) {
> +		/* Throttle direct reclaimers if congested */
> +		if (!current_is_kswapd())
> +			wait_iff_congested(zone, BLK_RW_ASYNC, HZ/10);
> +
>  		goto restart;
> +	}
>  
>  	throttle_vm_writeout(sc->gfp_mask);
>  }
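
For anyone reading the first hunk quickly, the change to the "no progress" test can be modelled in user space. This is only a rough sketch of my reading of the patch, not kernel code: the function names and the is_kswapd parameter are hypothetical stand-ins (the patch itself calls current_is_kswapd()), and "stop" here means the point where should_continue_reclaim() would return false.

```c
#include <stdbool.h>

/*
 * User-space model of the patched "no progress" test.
 * Returns true when the reclaim loop should stop at this check.
 * Hypothetical names: the kernel uses current_is_kswapd() rather
 * than an is_kswapd argument.
 */
static bool stop_on_no_progress(bool is_kswapd,
				unsigned long nr_reclaimed,
				unsigned long nr_scanned)
{
	if (nr_reclaimed)
		return false;	/* progress was made; this check does not stop us */
	if (is_kswapd)
		return true;	/* kswapd: stop as soon as nothing was reclaimed */
	return nr_scanned == 0;	/* direct reclaim: stop only if nothing was scanned */
}

/* The pre-patch test, for comparison: one rule for every caller. */
static bool stop_on_no_progress_old(unsigned long nr_reclaimed,
				    unsigned long nr_scanned)
{
	return !nr_reclaimed && !nr_scanned;
}
```

The case that differs is kswapd scanning pages without reclaiming any: the old test keeps looping, while the patched test leaves the loop so kswapd can raise the priority and sleep if necessary.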


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxxx  For more info on Linux MM,
see: http://www.linux-mm.org/ .