Re: [PATCH 1/5] vmscan: remove all_unreclaimable check from direct reclaim path completely

On Wed, Mar 23, 2011 at 2:21 PM, KOSAKI Motohiro
<kosaki.motohiro@xxxxxxxxxxxxxx> wrote:
> Hi Minchan,
>
>> > zone->all_unreclaimable and zone->pages_scanned are neither atomic
>> > variables nor protected by a lock. Therefore a zone can end up in a state
>> > of zone->pages_scanned=0 and zone->all_unreclaimable=1. In this case,
>>
>> Possible although it's very rare.
>
> Can you test Andrey's case yourself on an x86 box? It seems
> reproducible.
>
>> > the current all_unreclaimable() returns false even though
>> > zone->all_unreclaimable=1.
>>
>> The case is very rare since we reset zone->all_unreclaimable to zero
>> right before resetting zone->pages_scanned to zero.
>> But I admit it's possible.
>
> Please apply this patch and run the oom-killer. You may see the following
> pages_scanned:0 and all_unreclaimable:yes combination, like below
> (but you may need >30min).
>
>         Node 0 DMA free:4024kB min:40kB low:48kB high:60kB active_anon:11804kB
>         inactive_anon:0kB active_file:0kB inactive_file:4kB unevictable:0kB
>         isolated(anon):0kB isolated(file):0kB present:15676kB mlocked:0kB
>         dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB
>         slab_unreclaimable:0kB kernel_stack:0kB pagetables:68kB unstable:0kB
>         bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
>
>
>>
>>         CPU 0                                   CPU 1
>> free_pcppages_bulk                      balance_pgdat
>>         zone->all_unreclaimable = 0
>>                                                 zone->all_unreclaimable = 1
>>         zone->pages_scanned = 0
>> >
>> > Is this an ignorable minor issue? No. Unfortunately, x86 has a very
>> > small DMA zone and it becomes zone->all_unreclaimable=1 easily. And
>> > if it becase all_unreclaimable, it never return all_unreclaimable=0
>>         ^^^^^ it's very important verb.    ^^^^^ return? reset?
>>
>>         I can't understand your point due to the typo. Please correct the typo.
>>
>> > because it typically doesn't have reclaimable pages.
>>
>> If the DMA zone has very few or zero reclaimable pages,
>> zone_reclaimable() can easily return false, so all_unreclaimable() could
>> return true. Eventually the oom-killer might work.
>
> The point is, vmscan has the following all_unreclaimable check in several places.
>
>                 if (zone->all_unreclaimable && priority != DEF_PRIORITY)
>                         continue;
>
> But if the zone has only a few LRU pages, get_scan_count(DEF_PRIORITY)
> returns a {0, 0, 0, 0} array. That means the zone will never scan LRU
> pages anymore. Therefore the false-negative (too small) pages_scanned
> can't be corrected.
>
> Then the false-negative all_unreclaimable() also can't be corrected.
>
>
> Btw, why does get_scan_count() return 0 instead of 1? Why don't we round up?
> The git log says it is intentional.
>
>         commit e0f79b8f1f3394bb344b7b83d6f121ac2af327de
>         Author: Johannes Weiner <hannes@xxxxxxxxxxxx>
>         Date:   Sat Oct 18 20:26:55 2008 -0700
>
>             vmscan: don't accumulate scan pressure on unrelated lists
>
>>
>> In my test I saw the livelock too, so apparently we have a problem.
>> I couldn't dig into it recently because of other urgent work.
>> I think you know the root cause, but the description in this patch isn't
>> enough to persuade me.
>>
>> Could you explain the root cause in detail?
>
> If you have another idea for a fix, please let me know. :)
>
>
>
>

Okay. I got it.

The problem is as follows.
Because of the race between free_pcppages_bulk and balance_pgdat, it is
possible to end up with zone->all_unreclaimable = 1 and
zone->pages_scanned = 0.
The DMA zone has few LRU pages, and with no swap and heavy memory
pressure there could be just a single page in the inactive file list,
like in your example. (Anon LRU pages aren't important on a non-swap
system.)
In such a case, shrink_zones doesn't scan the page at all until priority
becomes 0, because get_scan_count does scan >>= priority (the result is
mostly zero).
And even when priority becomes 0, nr_scan_try_batch returns zero until
the saved pages reach 32. So to scan the page we need at least 32
iterations of priority 12..0. If the system has a fork bomb, it is
almost a livelock.

If this is right, how about this?

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 148c6e6..34983e1 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1973,6 +1973,9 @@ static void shrink_zones(int priority, struct zonelist *zonelist,

 static bool zone_reclaimable(struct zone *zone)
 {
+       if (zone->all_unreclaimable)
+               return false;
+
        return zone->pages_scanned < zone_reclaimable_pages(zone) * 6;
 }


-- 
Kind regards,
Minchan Kim
