On Thu, Aug 26, 2010 at 4:50 PM, Andrea Arcangeli <aarcange@xxxxxxxxxx> wrote: > On Wed, Aug 25, 2010 at 11:12:54PM -0700, Hugh Dickins wrote: >> After several hours, kbuild tests hang with anon_vma_prepare() spinning on >> a newly allocated anon_vma's lock - on a box with CONFIG_TREE_PREEMPT_RCU=y >> (which makes this very much more likely, but it could happen without). >> >> The ever-subtle page_lock_anon_vma() now needs a further twist: since >> anon_vma_prepare() and anon_vma_fork() are liable to change the ->root >> of a reused anon_vma structure at any moment, page_lock_anon_vma() >> needs to check page_mapped() again before succeeding, otherwise >> page_unlock_anon_vma() might address a different root->lock. > > I don't get it, the anon_vma can be freed and reused only after we run > rcu_read_unlock(). No. Between rcu_read_lock and rcu_read_unlock, once we've done the first (original) page_mapped test to make sure that this isn't just a long-stale page->mapping left over in there, SLAB_DESTROY_BY_RCU ensures that the slab page on which this "struct anon_vma" resides cannot be freed and reused for some other purpose e.g. a page of user data. But that piece of slab holding this "struct anon_vma" is liable to be freed and reused for another struct anon_vma at any point, until we've got the right lock on it. > And the anon_vma->root can't change unless the > anon_vma is freed and reused. Yes, but RCU is not protecting against that: all it's doing is guaranteeing that when we "speculatively" spin_lock the anon_vma, we won't be corrupting some other kind of structure or data. > Last but not the least by the time > page->mapping points to "anon_vma" the "anon_vma->root" is already > initialized and stable. Yes, but two things to be careful of there: one, we leave page->mapping pointing to the anon_vma maybe long after that address has got reused for something else, it's only when the page is finally freed that it's cleared (and there certainly was a good racing reason for that, but I'd have to think long and hard to reconstruct the sequence - OTOH it was a race between page_remove_rmap and page_add_anon_rmap); and two, notice how anon_vma_prepare() sets anon_vma->root = anon_vma on a newly allocated anon_vma, before it gets anon_vma_lock - so anon_vma->root can change underneath us at any point there, until we've got anon_vma_lock _and_ checked stability with a second page_mapped test. > > The page_mapped test is only relevant against the rcu_read_lock, not > the spin_lock, so how it can make a difference to run it twice inside > the same rcu_read_lock protected critical section? The first one still > is valid also after the anon_vma_lock() returns, it's not like that > anon_vma_lock drops the rcu_read_lock internally. > > Furthermore no need of ACCESS_ONCE on the anon_vma->root because it > can't change from under us as the anon_vma can't be freed from under > us until rcu_read_unlock returns (after we verified the first time > that page_mapped is true under the rcu_read_lock, which we already do > before trying to take the anon_vma_lock). I must rush off for a few hours: hopefully my assertions above shed some light., I think you're mistaking the role that RCU plays here. You remark in other mail "About your patch, it's a noop in my view..." - but seems quite an effective noop ;) Without the patch my kbuilds would hang usually within 6 hours - though one time they did manage 20 hours, I admit. They ran with the patch for 52 hours before I switched the machine over to something else. Hugh -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxxx For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>