On Mon, 2008-03-24 at 11:24 -0700, Hiroshi Shimamoto wrote:
> Hi Peter,
>
> I've updated the patch. Could you please review it?
>
> I'm also thinking that it can be in the mainline because it makes
> the lock period shorter, correct?

Possibly yeah, Nick, Hugh?

> ---
> From: Hiroshi Shimamoto <h-shimamoto@xxxxxxxxxxxxx>
>
> There is a deadlock scenario: remove_mapping() vs. free_swap_and_cache().
> remove_mapping() turns the PG_nonewrefs bit on, then takes swap_lock.
> free_swap_and_cache() takes swap_lock, then waits in find_get_page()
> for the PG_nonewrefs bit to be turned off.
>
> swap_lock can be dropped before calling find_get_page().
>
> In remove_exclusive_swap_page() there is a similar lock sequence:
> swap_lock, then the PG_nonewrefs bit. swap_lock can be dropped before
> turning the PG_nonewrefs bit on.

I worried about this: once we free the swap entry with swap_entry_free()
and drop swap_lock, another task is basically free to re-use that swap
location and try to insert another page in that same spot in
add_to_swap() - read_swap_cache_async() can't race, because that would
mean it still has a swap entry pinned.

However, add_to_swap() can already handle the race, because it used to
race against read_swap_cache_async(). It also swap_free()s the entry so
as not to leak entries.

So I think this is indeed correct.

[ I ought to find some time to port the concurrent page-cache patches
  on top of Nick's latest lockless series; Hugh's suggestion makes the
  speculative get much nicer. ]

> Signed-off-by: Hiroshi Shimamoto <h-shimamoto@xxxxxxxxxxxxx>

Acked-by: Peter Zijlstra <a.p.zijlstra@xxxxxxxxx>

> ---
>  mm/swapfile.c |   10 ++++++----
>  1 files changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/mm/swapfile.c b/mm/swapfile.c
> index 5036b70..6fbc77e 100644
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -366,6 +366,7 @@ int remove_exclusive_swap_page(struct page *page)
>  	/* Is the only swap cache user the cache itself? */
>  	retval = 0;
>  	if (p->swap_map[swp_offset(entry)] == 1) {
> +		spin_unlock(&swap_lock);
>  		/* Recheck the page count with the swapcache lock held.. */
>  		lock_page_ref_irq(page);
>  		if ((page_count(page) == 2) && !PageWriteback(page)) {
> @@ -374,8 +375,8 @@ int remove_exclusive_swap_page(struct page *page)
>  			retval = 1;
>  		}
>  		unlock_page_ref_irq(page);
> -	}
> -	spin_unlock(&swap_lock);
> +	} else
> +		spin_unlock(&swap_lock);
>
>  	if (retval) {
>  		swap_free(entry);
> @@ -400,13 +401,14 @@ void free_swap_and_cache(swp_entry_t entry)
>  	p = swap_info_get(entry);
>  	if (p) {
>  		if (swap_entry_free(p, swp_offset(entry)) == 1) {
> +			spin_unlock(&swap_lock);
>  			page = find_get_page(&swapper_space, entry.val);
>  			if (page && unlikely(TestSetPageLocked(page))) {
>  				page_cache_release(page);
>  				page = NULL;
>  			}
> -		}
> -		spin_unlock(&swap_lock);
> +		} else
> +			spin_unlock(&swap_lock);
>  	}
>  	if (page) {
>  		int one_user;
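
For reference, the inversion being fixed is nothing swap-specific; it is
the classic ABBA pattern. Below is a minimal userspace sketch of it,
with two pthread mutexes standing in for swap_lock and for the
PG_nonewrefs wait in find_get_page(). All names in it are illustrative,
not the kernel's, and it is only a sketch of the locking order, not of
the actual code paths:

/*
 * ABBA sketch: mutex A stands in for swap_lock, mutex B for the
 * PG_nonewrefs wait in find_get_page().  Illustrative only.
 * Build with: cc abba.c -o abba -lpthread
 */
#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t A = PTHREAD_MUTEX_INITIALIZER; /* "swap_lock" */
static pthread_mutex_t B = PTHREAD_MUTEX_INITIALIZER; /* "PG_nonewrefs" */

/* remove_mapping() side: takes B, then A. */
static void *side1(void *unused)
{
	pthread_mutex_lock(&B);		/* SetPageNoNewRefs() */
	pthread_mutex_lock(&A);		/* spin_lock(&swap_lock) */
	pthread_mutex_unlock(&A);
	pthread_mutex_unlock(&B);
	return NULL;
}

/* Old free_swap_and_cache(): A, then B -- reverse order, can deadlock. */
static void *side2_old(void *unused)
{
	pthread_mutex_lock(&A);
	pthread_mutex_lock(&B);		/* find_get_page() waiting */
	pthread_mutex_unlock(&B);
	pthread_mutex_unlock(&A);
	return NULL;
}

/* Patched free_swap_and_cache(): drop A before touching B. */
static void *side2_new(void *unused)
{
	pthread_mutex_lock(&A);
	/* swap_entry_free() would happen here, under the lock */
	pthread_mutex_unlock(&A);
	pthread_mutex_lock(&B);		/* no other lock held now */
	pthread_mutex_unlock(&B);
	return NULL;
}

int main(void)
{
	pthread_t t1, t2;

	pthread_create(&t1, NULL, side1, NULL);
	pthread_create(&t2, NULL, side2_new, NULL); /* side2_old can hang */
	pthread_join(t1, NULL);
	pthread_join(t2, NULL);
	printf("done, no deadlock\n");
	return 0;
}

With side2_old, the two threads can block on each other forever once the
acquisitions interleave (t1 holds B and wants A, t2 holds A and wants
B). With side2_new, no thread ever holds A while waiting for B, which is
exactly what moving the spin_unlock(&swap_lock) up in the patch achieves.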