Re: Kernel falls apart under light memory pressure (i.e. linking vmlinux)

On Sun, May 22, 2011 at 9:22 PM, Andrew Lutomirski <luto@xxxxxxx> wrote:
> On Sat, May 21, 2011 at 10:44 AM, Minchan Kim <minchan.kim@xxxxxxxxx> wrote:
>> I would like to confirm this problem.
>> Could you show the diff between vanilla 2.6.38.6 and your current 2.6.38.6 + alpha?
>> (i.e., I would like to know which patches you added on top of vanilla
>> 2.6.38.6 to reproduce this problem.)
>> I believe you added my crap patch below. Right?
>>
>> diff --git a/mm/vmscan.c b/mm/vmscan.c
>> index 292582c..69d317e 100644
>> --- a/mm/vmscan.c
>> +++ b/mm/vmscan.c
>> @@ -311,7 +311,8 @@ static void set_reclaim_mode(int priority, struct scan_control *sc,
>>          */
>>         if (sc->order > PAGE_ALLOC_COSTLY_ORDER)
>>                 sc->reclaim_mode |= syncmode;
>> -       else if (sc->order && priority < DEF_PRIORITY - 2)
>> +       else if ((sc->order && priority < DEF_PRIORITY - 2) ||
>> +                               priority <= DEF_PRIORITY / 3)
>>                 sc->reclaim_mode |= syncmode;
>>         else
>>                 sc->reclaim_mode = RECLAIM_MODE_SINGLE | RECLAIM_MODE_ASYNC;
>> @@ -1349,10 +1350,6 @@ static inline bool should_reclaim_stall(unsigned long nr_taken,
>>         if (current_is_kswapd())
>>                 return false;
>>
>> -       /* Only stall on lumpy reclaim */
>> -       if (sc->reclaim_mode & RECLAIM_MODE_SINGLE)
>> -               return false;
>> -
>
> Bah.  It's this last hunk.  Without this I can't reproduce the oops.
> With this hunk, the reset_reclaim_mode doesn't work and
> shrink_page_list is incorrectly called twice.

OMG! I should have been clearer with you.  My patch above is total _crap_.
I thought you had been testing without that crap patch. :(
Sorry for wasting the time of so many mm guys.
My apologies.

I want to resolve your original problem (i.e., the hang) before digging
into the OOM problem.

>
> So we're back to the original problem...

Could you test the patch below, based on vanilla 2.6.38.6?
The expected result is that the system hang never happens.
I hope this is the last test for the hang.

Thanks.

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 292582c..1663d24 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -231,8 +231,11 @@ unsigned long shrink_slab(struct shrink_control *shrink,
       if (scanned == 0)
               scanned = SWAP_CLUSTER_MAX;

-       if (!down_read_trylock(&shrinker_rwsem))
-               return 1;       /* Assume we'll be able to shrink next time */
+       if (!down_read_trylock(&shrinker_rwsem)) {
+               /* Assume we'll be able to shrink next time */
+               ret = 1;
+               goto out;
+       }

       list_for_each_entry(shrinker, &shrinker_list, list) {
               unsigned long long delta;
@@ -286,6 +289,8 @@ unsigned long shrink_slab(struct shrink_control *shrink,
               shrinker->nr += total_scan;
       }
       up_read(&shrinker_rwsem);
+out:
+       cond_resched();
       return ret;
 }

@@ -2331,7 +2336,7 @@ static bool sleeping_prematurely(pg_data_t *pgdat, int order, long remaining,
        * must be balanced
        */
       if (order)
-               return pgdat_balanced(pgdat, balanced, classzone_idx);
+               return !pgdat_balanced(pgdat, balanced, classzone_idx);
       else
               return !all_zones_ok;
 }
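
To make the intent of the two behavioural changes easier to see, here is a
rough userspace sketch of the control flow only. It is not kernel code: the
lock is a plain flag, sched_yield() is only a loose analogue of
cond_resched(), and every name ending in _sketch (plus writer_holds_lock,
readers and resched_points) is a hypothetical stand-in that does not exist
in mm/vmscan.c.

```c
#include <assert.h>
#include <sched.h>
#include <stdbool.h>

/* Hypothetical stand-ins for shrinker_rwsem; not the kernel API. */
static bool writer_holds_lock;
static int readers;

/* Counts how often the sketch reached its yield point. */
static unsigned long resched_points;

static bool try_read_lock_sketch(void)
{
	if (writer_holds_lock)
		return false;
	readers++;
	return true;
}

static void read_unlock_sketch(void)
{
	assert(readers > 0);
	readers--;
}

/* Stand-in for cond_resched(); sched_yield() is only a rough analogue. */
static void cond_resched_sketch(void)
{
	resched_points++;
	sched_yield();
}

/*
 * Control flow of the patched shrink_slab(): the contended path and the
 * normal path both leave through "out", so the yield point is reached
 * on every call instead of being skipped by an early return.
 */
static unsigned long shrink_slab_sketch(unsigned long freed_if_locked)
{
	unsigned long ret;

	if (!try_read_lock_sketch()) {
		/* Assume we'll be able to shrink next time */
		ret = 1;
		goto out;
	}

	ret = freed_if_locked;	/* placeholder for walking the shrinker list */
	read_unlock_sketch();
out:
	cond_resched_sketch();
	return ret;
}

/*
 * Tail of the patched sleeping_prematurely(): "true" means kswapd would
 * be going to sleep too early, so for order > 0 it must be the negation
 * of "the node is balanced".
 */
static bool sleeping_prematurely_sketch(int order, bool node_balanced,
					bool all_zones_ok)
{
	if (order)
		return !node_balanced;
	else
		return !all_zones_ok;
}
```

Calling shrink_slab_sketch() while writer_holds_lock is set returns 1 and
still bumps resched_points, which is the behaviour the first two hunks are
after. sleeping_prematurely_sketch(1, true, ...) returns false, so a
balanced node no longer counts as sleeping prematurely for a high-order
wakeup.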

-- 
Kind regards,
Minchan Kim

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxxx  For more info on Linux MM,
see: http://www.linux-mm.org/ .