On Thu, Mar 15, 2012 at 03:45:04PM -0700, Andrew Morton wrote:
> Or do we still need pmd_trans_unstable() checking in
> mem_cgroup_count_precharge_pte_range() and
> mem_cgroup_move_charge_pte_range()?

I think we need a pmd_trans_unstable check before the
pte_offset_map_lock in both places.

Otherwise, with the mmap_sem held only for reading, the pmd may have
been transhuge, mem_cgroup_move_charge_pte_range could be called, and
then MADV_DONTNEED from another thread would transform the pmd to none
just before pmd_trans_huge_lock runs, and we would end up doing
pte_offset_map_lock on a none pmd (or on a transhuge pmd if it becomes
huge again before we get there).

Only if pmd_trans_unstable returns false is the pmd guaranteed not to
change from under us, so that we can proceed safely with the pte level
walk (and the check just needs a compiler barrier, as the real pmd
changes freely from under us).

pmd_trans_unstable will never actually trigger unless we're hitting
the race: if the pmd was none when we started the walk we'd abort at
the higher level (method not called), and if the pmd was transhuge we'd
stop at the pmd_trans_huge_lock() == 1 branch.

So the only way for pmd_trans_unstable to trigger is when the result
is undefined, i.e. the pmd was not none initially but it became none,
or transhuge, or none again at some point, so we can simply consider it
still none and skip in the undefined case.
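To make the ordering above concrete, here is a minimal userspace
sketch, not kernel code. Every toy_* name, the bit encoding of the pmd
and the two "race" hooks are assumptions made up for illustration; the
hooks stand in for another thread (e.g. MADV_DONTNEED) modifying the
pmd between checks, and the volatile read stands in for the single
read plus compiler barrier.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy pmd encoding, NOT the kernel's pmd_t layout (assumption). */
typedef uint64_t toy_pmd_t;
#define TOY_PMD_NONE  0x0ULL  /* nothing mapped */
#define TOY_PMD_HUGE  0x1ULL  /* hypothetical "transhuge" bit */
#define TOY_PMD_TABLE 0x2ULL  /* hypothetical "points to a pte page" bit */

enum toy_result {
    TOY_NOT_CALLED,        /* walker saw a none pmd, method not called */
    TOY_HANDLED_HUGE,      /* models the pmd_trans_huge_lock() == 1 branch */
    TOY_SKIPPED_UNSTABLE,  /* unstable check fired, treated as none */
    TOY_WALKED_PTES        /* pmd was stable, pte level walk is safe */
};

/* Hook standing in for another thread that modifies the pmd. */
typedef void (*toy_race_fn)(toy_pmd_t *pmd);

/* Read the pmd exactly once so all tests below see the same value. */
static toy_pmd_t toy_pmd_read_once(const toy_pmd_t *pmd)
{
    return *(const volatile toy_pmd_t *)pmd;
}

/* True if the snapshot is none or transhuge, i.e. not a stable pte table. */
static bool toy_pmd_trans_unstable(const toy_pmd_t *pmd)
{
    toy_pmd_t val = toy_pmd_read_once(pmd);
    return val == TOY_PMD_NONE || (val & TOY_PMD_HUGE) != 0;
}

static enum toy_result toy_walk(toy_pmd_t *pmd,
                                toy_race_fn before_huge_check,
                                toy_race_fn after_huge_check)
{
    /* Higher level of the walk: a none pmd never reaches the method. */
    if (toy_pmd_read_once(pmd) == TOY_PMD_NONE)
        return TOY_NOT_CALLED;

    if (before_huge_check)
        before_huge_check(pmd);   /* racing thread runs here */

    /* Transhuge pmds are handled at the pmd level. */
    if (toy_pmd_read_once(pmd) & TOY_PMD_HUGE)
        return TOY_HANDLED_HUGE;

    if (after_huge_check)
        after_huge_check(pmd);    /* racing thread runs here */

    /* The check under discussion: skip if the pmd changed under us. */
    if (toy_pmd_trans_unstable(pmd))
        return TOY_SKIPPED_UNSTABLE;

    return TOY_WALKED_PTES;
}

static void toy_zap(toy_pmd_t *pmd)       { *pmd = TOY_PMD_NONE; }
static void toy_make_huge(toy_pmd_t *pmd) { *pmd = TOY_PMD_HUGE; }

/* Usage: the race-free cases, then the two racy cases. */
void toy_self_check(void)
{
    toy_pmd_t p;

    p = TOY_PMD_TABLE;
    assert(toy_walk(&p, NULL, NULL) == TOY_WALKED_PTES);
    p = TOY_PMD_NONE;
    assert(toy_walk(&p, NULL, NULL) == TOY_NOT_CALLED);
    p = TOY_PMD_HUGE;
    assert(toy_walk(&p, NULL, NULL) == TOY_HANDLED_HUGE);

    p = TOY_PMD_HUGE;  /* huge at walk start, zapped before the huge check */
    assert(toy_walk(&p, toy_zap, NULL) == TOY_SKIPPED_UNSTABLE);
    p = TOY_PMD_HUGE;  /* zapped, then huge again before the unstable check */
    assert(toy_walk(&p, toy_zap, toy_make_huge) == TOY_SKIPPED_UNSTABLE);
}
```

Without the hooks, TOY_SKIPPED_UNSTABLE is never returned in this model,
which mirrors the point that the check only fires when the race is
actually hit.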