On Fri, 2008-06-20 at 10:13 +0900, KAMEZAWA Hiroyuki wrote:
> Lee-san, this is an additonal one..
> Not-tested-yet, just by review.

OK, I'll test this on my x86_64 platform, which doesn't seem to hit the
soft lockups.

>
> Fixing page_lock() <-> zone->lock nesting of bad-behavior.
>
> Before:
> 	lock_page()(TestSetPageLocked())
> 	spin_lock(zone->lock)
> 	unlock_page()
> 	spin_unlock(zone->lock)

Couple of comments:

* I believe that the locks are acquired in the right order--at least as
  documented in the comments in mm/rmap.c.

* The unlocking appears out of order because this function attempts to
  hold the zone lock across a few pages in the pagevec, but must switch
  to a different zone lru lock when it finds a page on a different zone
  from the zone whose lock it is holding--like in the pagevec draining
  functions, altho' they don't need to lock the page.  (See the sketch
  below.)
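That zone-switching is roughly the following pattern.  This is only a
minimal sketch for illustration, not code from this patch or from
mm/swap.c: walk_pages_locked() is a made-up name, while struct pagevec,
page_zone() and zone->lru_lock are the existing interfaces it is
modeled on.

#include <linux/mm.h>		/* page_zone() */
#include <linux/mmzone.h>	/* struct zone, lru_lock */
#include <linux/pagevec.h>	/* struct pagevec, pagevec_count() */
#include <linux/spinlock.h>

/*
 * Hypothetical helper, for illustration only: hold one zone's lru_lock
 * across a batch of pages and trade locks whenever the next page
 * belongs to a different zone.
 */
static void walk_pages_locked(struct pagevec *pvec)
{
	struct zone *zone = NULL;	/* zone whose lru_lock we hold, if any */
	int i;

	for (i = 0; i < pagevec_count(pvec); i++) {
		struct page *page = pvec->pages[i];
		struct zone *pagezone = page_zone(page);

		if (pagezone != zone) {
			/* crossed into a new zone: drop the old lock first */
			if (zone)
				spin_unlock_irq(&zone->lru_lock);
			zone = pagezone;
			spin_lock_irq(&zone->lru_lock);
		}

		/* ... operate on page under zone->lru_lock ... */
	}
	if (zone)
		spin_unlock_irq(&zone->lru_lock);
}

scan_mapping_unevictable_pages() below does essentially this, which is
why the unlock can appear "out of order" relative to the per-page work.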
> After:
> 	spin_lock(zone->lock)
> 	spin_unlock(zone->lock)

Right.  With your reworked check_move_unevictable_page() [with retry],
we don't need to lock the page here, any more.

That means we can revert all of the changes to pass the mapping back to
sys_shmctl() and move the call to scan_mapping_unevictable_pages() back
to shmem_lock() after clearing the address_space's unevictable flag.
We only did that to avoid sleeping while holding the shmem_inode_info
lock and the shmid_kernel's ipc_perm spinlock.

Shall I handle that, after we've tested this patch?

>
> Including nit-pick fix. (I'll ask Kosaki-san to merge this to his 5/5)
>
> Hmm...
>
> ---
>  mm/vmscan.c |   25 +++++--------------------
>  1 file changed, 5 insertions(+), 20 deletions(-)
>
> Index: test-2.6.26-rc5-mm3/mm/vmscan.c
> ===================================================================
> --- test-2.6.26-rc5-mm3.orig/mm/vmscan.c
> +++ test-2.6.26-rc5-mm3/mm/vmscan.c
> @@ -1106,7 +1106,7 @@ static unsigned long shrink_inactive_lis
>  	if (nr_taken == 0)
>  		goto done;
>
> -	spin_lock(&zone->lru_lock);
> +	spin_lock_irq(&zone->lru_lock);

1) It appears that the spin_lock() [no '_irq'] was there because irqs
   are disabled a few lines above so that we could use non-atomic
   __count[_zone]_vm_events().

2) I think this predates the split lru or unevictable lru patches, so
   these changes are unrelated.

>  	/*
>  	 * Put back any unfreeable pages.
>  	 */
> @@ -1136,9 +1136,8 @@ static unsigned long shrink_inactive_lis
>  			}
>  		}
>  	} while (nr_scanned < max_scan);
> -	spin_unlock(&zone->lru_lock);
> +	spin_unlock_irq(&zone->lru_lock);
>  done:
> -	local_irq_enable();
>  	pagevec_release(&pvec);
>  	return nr_reclaimed;
>  }
> @@ -2438,7 +2437,7 @@ static void show_page_path(struct page *
>   */
>  static void check_move_unevictable_page(struct page *page, struct zone *zone)
>  {
> -
> +retry:
>  	ClearPageUnevictable(page); /* for page_evictable() */

We can remove this comment                  ^^^^^^^^^^^^^^^^^^^^^^^^^^

page_evictable() no longer asserts !PageUnevictable(), right?

>  	if (page_evictable(page, NULL)) {
>  		enum lru_list l = LRU_INACTIVE_ANON + page_is_file_cache(page);
> @@ -2455,6 +2454,8 @@ static void check_move_unevictable_page(
>  		 */
>  		SetPageUnevictable(page);
>  		list_move(&page->lru, &zone->lru[LRU_UNEVICTABLE].list);
> +		if (page_evictable(page, NULL))
> +			goto retry;
>  	}
>  }
>
> @@ -2494,16 +2495,6 @@ void scan_mapping_unevictable_pages(stru
>  			next = page_index;
>  			next++;
>
> -			if (TestSetPageLocked(page)) {
> -				/*
> -				 * OK, let's do it the hard way...
> -				 */
> -				if (zone)
> -					spin_unlock_irq(&zone->lru_lock);
> -				zone = NULL;
> -				lock_page(page);
> -			}
> -
>  			if (pagezone != zone) {
>  				if (zone)
>  					spin_unlock_irq(&zone->lru_lock);
> @@ -2514,8 +2505,6 @@ void scan_mapping_unevictable_pages(stru
>  			if (PageLRU(page) && PageUnevictable(page))
>  				check_move_unevictable_page(page, zone);
>
> -			unlock_page(page);
> -
>  		}
>  		if (zone)
>  			spin_unlock_irq(&zone->lru_lock);
> @@ -2551,15 +2540,11 @@ void scan_zone_unevictable_pages(struct
>  		for (scan = 0;  scan < batch_size; scan++) {
>  			struct page *page = lru_to_page(l_unevictable);
>
> -			if (TestSetPageLocked(page))
> -				continue;
> -
>  			prefetchw_prev_lru_page(page, l_unevictable, flags);
>
>  			if (likely(PageLRU(page) && PageUnevictable(page)))
>  				check_move_unevictable_page(page, zone);
>
> -			unlock_page(page);
>  		}
>  		spin_unlock_irq(&zone->lru_lock);
>

I'll let you know how it goes.

Later,
Lee

--
To unsubscribe from this list: send the line "unsubscribe kernel-testers" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html