On Wed, Nov 09, 2011 at 02:25:42AM +0100, Andrea Arcangeli wrote:
> Also note, if we find a way to enforce orderings in the prio tree (not
> sure if it's possible, apparently it's already using list_add_tail
> so..), then we could also remove the i_mmap_lock from mremap and fork.

I'm not optimistic we can enforce ordering there. Being a tree, it's
walked in range order.

I thought of another solution that would avoid having to reorder the
list in mremap and avoid adding the i_mmap_mutex to fork (and then we
can remove it from mremap too). The solution is to rmap_walk twice. I
mean two loops over the same_anon_vma for those rmap walks that must
be reliable (that includes two calls of unmap_mapping_range). For
both same_anon_vma and prio tree.

Reading truncate_pagecache I see two loops already and a comment
saying it's for fork(), to avoid leaking ptes in the child. So fork is
probably ok already without having to take the i_mmap_mutex, but then
I wonder why that doesn't also fix mremap if we do two loops there,
and why that i_mmap_mutex is really needed in mremap considering
those two calls already present in truncate_pagecache. I wonder if
that was a "theoretical" fix that missed the fact truncate already
walks the prio tree twice, so it doesn't matter if the rmap_walk goes
in the opposite direction of move_page_tables?

That i_mmap_lock in mremap (now i_mmap_mutex) has been there since the
start of git history. The double loop was introduced in
d00806b183152af6d24f46f0c33f14162ca1262a.

So it's very possible that i_mmap_mutex is now useless (after
d00806b183152af6d24f46f0c33f14162ca1262a), that the fix for fork was
already taking care of mremap too, and that i_mmap_mutex can now be
removed.
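
To make the "walk twice" argument concrete, here is a toy userspace
sketch, not kernel code. Every name in it (toy_chain, toy_move,
toy_walk, toy_leaked_after) is hypothetical and made up for
illustration. It assumes a single pte that is moved once, against the
walk direction, while the first pass is in progress, and that nothing
moves during the second pass. Under those assumptions one pass can
leave the pte mapped and a second pass picks it up. It says nothing
about whether the real truncate or mremap paths are safe.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NSLOTS 4

/* Toy stand-in for a chain of mappings; true means a pte is still mapped. */
struct toy_chain {
	bool mapped[NSLOTS];
};

/* Hypothetical "mover": relocates one pte from slot src to slot dst. */
static void toy_move(struct toy_chain *c, size_t src, size_t dst)
{
	assert(src < NSLOTS && dst < NSLOTS);
	if (c->mapped[src]) {
		c->mapped[src] = false;
		c->mapped[dst] = true;
	}
}

/*
 * One forward walk that unmaps whatever it finds. To model the race,
 * the move fires once, just before the walker visits slot race_at.
 * Pass race_at >= NSLOTS for a walk with no concurrent move.
 */
static void toy_walk(struct toy_chain *c, size_t race_at,
		     size_t src, size_t dst)
{
	for (size_t i = 0; i < NSLOTS; i++) {
		if (i == race_at)
			toy_move(c, src, dst);
		c->mapped[i] = false;
	}
}

/* Count the ptes that are still mapped, i.e. that the walks missed. */
static size_t toy_count_mapped(const struct toy_chain *c)
{
	size_t n = 0;

	for (size_t i = 0; i < NSLOTS; i++)
		if (c->mapped[i])
			n++;
	return n;
}

/*
 * Run 'passes' walks (at least one) and return how many ptes are left.
 * The pte starts in the last slot and is moved to slot 0 after the
 * first pass has already visited slot 0. Later passes see no movement.
 */
static size_t toy_leaked_after(int passes)
{
	struct toy_chain c = { .mapped = { false, false, false, true } };

	toy_walk(&c, 1, NSLOTS - 1, 0);
	for (int p = 1; p < passes; p++)
		toy_walk(&c, NSLOTS, 0, 0);
	return toy_count_mapped(&c);
}
```

In this model toy_leaked_after(1) leaves one pte behind and
toy_leaked_after(2) leaves none. The model breaks down as soon as a
move can also race with the second pass, which is the part that would
need checking against the real code.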