On Thu, Mar 20, 2014 at 10:16:21PM -0700, Hugh Dickins wrote: > On Thu, 20 Mar 2014, Naoya Horiguchi wrote: > > On Thu, Mar 20, 2014 at 09:47:04PM -0400, Sasha Levin wrote: > > > On 02/10/2014 04:44 PM, Naoya Horiguchi wrote: > > > >swapin_walk_pmd_entry() is defined as pmd_entry(), but it has no code > > > >about pmd handling (except pmd_none_or_trans_huge_or_clear_bad, but the > > > >same check are now done in core page table walk code). > > > >So let's move this function on pte_entry() as swapin_walk_pte_entry(). > > > > > > > >Signed-off-by: Naoya Horiguchi<n-horiguchi@xxxxxxxxxxxxx> > > > > > > This patch seems to generate: > > > > Sasha, thank you for reporting. > > I forgot to unlock ptlock before entering read_swap_cache_async() which > > holds page lock in it, as a result lock ordering rule (written in mm/rmap.c) > > was violated (we should take in the order of mmap_sem -> page lock -> ptlock.) > > The following patch should fix this. Could you test with it? > > > > --- > > From c0d56af5874dc40467c9b3a0f9e53b39b3c4f1c5 Mon Sep 17 00:00:00 2001 > > From: Naoya Horiguchi <n-horiguchi@xxxxxxxxxxxxx> > > Date: Thu, 20 Mar 2014 22:30:51 -0400 > > Subject: [PATCH] madvise: fix locking in force_swapin_readahead() > > > > We take mmap_sem and ptlock in walking over ptes with swapin_walk_pte_entry(), > > but inside it we call read_swap_cache_async() which holds page lock. > > So we should unlock ptlock to call read_swap_cache_async() to meet lock order > > rule (mmap_sem -> page lock -> ptlock). > > > > Reported-by: Sasha Levin <sasha.levin@xxxxxxxxxx> > > Signed-off-by: Naoya Horiguchi <n-horiguchi@xxxxxxxxxxxxx> > > NAK. You are now unlocking and relocking the spinlock, good; but on > arm frv or i386 CONFIG_HIGHPTE you are leaving the page table atomically > kmapped across read_swap_cache_async(), which (never mind lock ordering) > is quite likely to block waiting to allocate memory. Thanks for pointing out, you're right. walk_pte_range() doesn't fit to pte loop in original swapin_walk_pmd_entry(), so I should not have changed this code. > I do not see > madvise-redefine-callback-functions-for-page-table-walker.patch > as an improvement. I can see what's going on in Shaohua's original > code, whereas this style makes bugs more likely. Please drop it. OK, I agree that. Thanks, Naoya Horiguchi -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxx. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>