Re: hunting an IO hang

On Mon, Jan 17, 2011 at 05:32:22PM +0100, Johannes Weiner wrote:
> On Mon, Jan 17, 2011 at 10:02:47AM -0500, Chris Mason wrote:
> > Excerpts from Chris Mason's message of 2011-01-17 09:07:40 -0500:
> > 
> > [ various crashes under load with current git ]
> > 
> > > 
> > > I did have CONFIG_COMPACTION off for my latest reproduce.  The last two
> > > have been corruption on the page->lru lists, maybe that'll help narrow
> > > our bisect pool down.
> > 
> > I've reverted 744ed1442757767ffede5008bb13e0805085902e, and
> > d8505dee1a87b8d41b9c4ee1325cd72258226fbc and the run has lasted longer
> > than any runs in the past.
> > 
> > I'll give this a few hours but they seem the most related to my various
> > crashes so far.
> 
> I went through the new batched activation code.  Shaohua, can you
> explain to me why the following sequence is not possible?
> 
> 1. CPU A and B schedule activation of a page (PG_lru && !PG_active)
> 2. CPU A flushes the page to the active list (PG_lru && PG_active)
> 3. CPU A isolates the page for scanning/migration and
>    puts it on private list (!PG_lru && PG_active)
> 4. CPU B flushes the page to the active list (!PG_lru && PG_active),
>    the deferred activation code now assumes putback mode and adds the page
>    to the active list, thus corrupting the link to the private list of CPU A
> 5. CPU A does list_del() from the private list (like unmap_and_move() does)
>    and trips up on the corruption
> 

In addition, PageLRU is a bad test in __activate_page for deciding whether
the page needs to be unlinked. When a page is on a pagevec, it's not an LRU
page and it's not on a linked list. When a page is on a private linked list,
it's also not an LRU page, but it has to be removed from the private list
before being added to the LRU to avoid list corruption. !PageLRU alone
cannot tell these two cases apart.
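
To make the failure mode concrete, here is a minimal userspace sketch of steps 3 and 4 from the sequence quoted above. It is not the kernel code: struct list_head and the list helpers are re-implemented in simplified form (no pointer poisoning, no locking), and struct page_model and private_list_survives() are hypothetical names made up for this illustration. It only shows that linking an entry into a second list without unlinking it first leaves the first list inconsistent.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-in for the kernel's struct list_head (assumption:
 * userspace model only, no poisoning or debug checks). */
struct list_head { struct list_head *next, *prev; };

static void list_init(struct list_head *h) { h->next = h; h->prev = h; }

/* Insert entry right after head, in the style of list_add(). */
static void list_add(struct list_head *entry, struct list_head *head)
{
    entry->next = head->next;
    entry->prev = head;
    head->next->prev = entry;
    head->next = entry;
}

/* Unlink entry from whatever list its neighbours form. */
static void list_del(struct list_head *entry)
{
    entry->prev->next = entry->next;
    entry->next->prev = entry->prev;
}

/* Walk the ring (bounded) and check that neighbours point back at each node. */
static bool list_is_consistent(const struct list_head *head)
{
    const struct list_head *pos = head;
    int steps = 0;

    do {
        if (pos->next->prev != pos || pos->prev->next != pos)
            return false;
        pos = pos->next;
    } while (pos != head && ++steps < 64);
    return pos == head;
}

/* Hypothetical model of a page: only the lru link matters here. */
struct page_model { struct list_head lru; };

/* Returns true if CPU A's private list is still intact after the page
 * has been put on the active list. */
static bool private_list_survives(bool unlink_first)
{
    struct list_head private_list, active_list;
    struct page_model page;

    list_init(&private_list);
    list_init(&active_list);

    /* Step 3: CPU A isolates the page onto its private list. */
    list_add(&page.lru, &private_list);
    assert(list_is_consistent(&private_list));

    /* Step 4: CPU B's deferred activation adds the page to the active list. */
    if (unlink_first)
        list_del(&page.lru);
    list_add(&page.lru, &active_list);

    return list_is_consistent(&private_list);
}
```

Calling private_list_survives(false) models the "putback" assumption: the private list head still points at the page, but the page's links now point into the active list, so a later list_del() from the private list would operate on the wrong neighbours. With unlink_first set, the private list is left empty and consistent.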

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab
