Re: [BUG] kernel BUG at mm/memcontrol.c:1074!

On Thu, Jan 19, 2012 at 12:16 AM, Hugh Dickins <hughd@xxxxxxxxxx> wrote:
> On Thu, 19 Jan 2012, KAMEZAWA Hiroyuki wrote:
>> On Wed, 18 Jan 2012 19:41:44 -0800 (PST)
>> Hugh Dickins <hughd@xxxxxxxxxx> wrote:
>> >
>> > I notice that, unlike Linus's git, this linux-next still has
>> > mm-isolate-pages-for-immediate-reclaim-on-their-own-lru.patch in.
>> >
>> > I think that was well capable of oopsing in mem_cgroup_lru_del_list(),
>> > since it didn't always know which lru a page belongs to.
>> >
>> > I'm going to be optimistic and assume that was the cause.
>> >
>> Hmm, because the log hits !memcg at lru "del", the page should be added
>> to LRU somewhere and the lru must be determined by pc->mem_cgroup.
>>
>> Once set, pc->mem_cgroup is not cleared, just overwritten. AFAIK, there is
>> only one chance to set pc->mem_cgroup as NULL... initialization.
>> I wonder why it hits lru_del() rather than lru_add()...
>> ................
>>
>> Ahhhh, OK, it seems you are right. The patch has code like the following:
>> ==
>> +static void pagevec_putback_immediate_fn(struct page *page, void *arg)
>> +{
>> +       struct zone *zone = page_zone(page);
>> +
>> +       if (PageLRU(page)) {
>> +               enum lru_list lru = page_lru(page);
>> +               list_move(&page->lru, &zone->lru[lru].list);
>> +       }
>> +}
>> ==
>> This will bypass mem_cgroup_lru_add(), so we see the bug in lru_del()
>> rather than in lru_add().
>
> I've not thought it through in detail (and your questioning reminds me
> that the worst I saw from that patch was updating of the wrong counts,
> leading to underflow, then livelock from the mismatch between empty list
> and enormous count: I never saw an oops from it, and may be mistaken).
>
>>
>> Another question is who pushes pages to LRU before setting pc->mem_cgroup..
>> Anyway, I think we need to fix memcg to be LRU_IMMEDIATE aware.
>
> I don't think so: Mel agreed that the patch could not go forward as is,
> without an additional pageflag, and asked Andrew to drop it from mmotm
> in mail on 29th December (I didn't notice an mm-commits message to say
> akpm did drop it, and marc is blacked out in protest for today, so I
> cannot check: but certainly akpm left it out of his push to Linus).
>
> Oh, and Mel noticed another bug in it on the 30th, that the PageLRU
> check in the function you quote above is wrong: see PATCH 11/11 thread.

So reverting this patch does indeed seem to solve the issue (though
the revert wasn't clean: there were some minor conflicts in mm/swap.c).
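
For anyone skimming the thread, here is a toy userspace model of the
failure mode Hugh describes above (wrong counter updated, underflow, then
an empty list paired with an enormous count). It is only a sketch, not
kernel code: every toy_* name is made up for illustration, and the one
assumption it encodes is that a raw list move skips the per-list
accounting that the add and delete paths maintain.

```c
/* Toy model only: all toy_* names are hypothetical, not kernel symbols. */
enum toy_lru { TOY_LRU_INACTIVE, TOY_LRU_IMMEDIATE, TOY_NR_LRU };

struct toy_page {
	enum toy_lru list;                 /* list the page currently sits on */
};

struct toy_memcg {
	unsigned long count[TOY_NR_LRU];   /* accounted pages per list */
	unsigned long on_list[TOY_NR_LRU]; /* real list length per list */
};

/* Accounted add: list membership and counter change together. */
void toy_lru_add(struct toy_memcg *cg, struct toy_page *p, enum toy_lru lru)
{
	p->list = lru;
	cg->on_list[lru]++;
	cg->count[lru]++;
}

/* Raw move: the page changes lists, but no counter is touched. */
void toy_raw_list_move(struct toy_memcg *cg, struct toy_page *p,
		       enum toy_lru lru)
{
	cg->on_list[p->list]--;
	p->list = lru;
	cg->on_list[lru]++;
}

/* Accounted delete: trusts the page's current list to pick the counter. */
void toy_lru_del(struct toy_memcg *cg, struct toy_page *p)
{
	cg->on_list[p->list]--;
	cg->count[p->list]--;              /* unsigned, so 0 - 1 wraps around */
}

/* Returns 1 if the list ends up empty while its counter has wrapped. */
int toy_demo(void)
{
	struct toy_memcg cg = { {0}, {0} };
	struct toy_page page;

	toy_lru_add(&cg, &page, TOY_LRU_IMMEDIATE);
	toy_raw_list_move(&cg, &page, TOY_LRU_INACTIVE);
	toy_lru_del(&cg, &page);

	return cg.on_list[TOY_LRU_INACTIVE] == 0 &&
	       cg.count[TOY_LRU_INACTIVE] == (unsigned long)-1;
}
```

In this model the counter for the list the page was originally added to
stays at 1 with nothing on that list, while the other list is empty with a
wrapped count. Anything that loops until the count reaches zero would
never finish, which matches the livelock described in the quoted text.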

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .