On Sat, 2011-05-28 at 13:20 -0700, Hugh Dickins wrote: > On one machine I've been getting hangs, a page fault's anon_vma_prepare() > waiting in anon_vma_lock(), other processes waiting for that page's lock. > > This is a replay of last year's f18194275c39 > "mm: fix hang on anon_vma->root->lock". > > The new page_lock_anon_vma() places too much faith in its refcount: when > it has acquired the mutex_trylock(), it's possible that a racing task in > anon_vma_alloc() has just reallocated the struct anon_vma, set refcount > to 1, and is about to reset its anon_vma->root. > > Fix this by saving anon_vma->root, and relying on the usual page_mapped() > check instead of a refcount check: if page is still mapped, the anon_vma > is still ours; if page is not still mapped, we're no longer interested. Interesting race.. but can we guarantee that the page didn't get remapped meanwhile? The updated comment by page_get_anon_vma() describes the lack of serialization against page_remove_rmap() but fails to mention the page_add_anon_rmap cases (bad me, I know I checked at the time, but can't for the life of me remember what it was now). _IFF_ we are serialized, your patch should suffice, since then page_mapped() implies a >0 refcount, if not however, I think we need both tests since in that case the page might be mapped again against a different anon_vma and our current anon_vma (the one we locked against) might have refcount == 0 and already be past the mutex_is_locked() test in anon_vma_free(), at which point we're up shit creek since then the anon_vma we're returning can disappear the moment we do rcu_read_unlock(). Or am I delusional due to lack of sleep? -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxxx For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>