On 08/04/14 14:09, Mel Gorman wrote:
> David Vrabel identified a regression when using automatic NUMA balancing
> under Xen whereby page table entries were getting corrupted due to the
> use of native PTE operations. Quoting him:
>
>	Xen PV guest page tables require that their entries use machine
>	addresses if the present bit (_PAGE_PRESENT) is set, and (for
>	successful migration) non-present PTEs must use pseudo-physical
>	addresses. This is because on migration MFNs in present PTEs are
>	translated to PFNs (canonicalised) so they may be translated back
>	to the new MFN in the destination domain (uncanonicalised).
>
>	pte_mknonnuma(), pmd_mknonnuma(), pte_mknuma() and pmd_mknuma()
>	set and clear the _PAGE_PRESENT bit using pte_set_flags(),
>	pte_clear_flags(), etc.
>
>	In a Xen PV guest, these functions must translate MFNs to PFNs
>	when clearing _PAGE_PRESENT and translate PFNs to MFNs when setting
>	_PAGE_PRESENT.
>
> His suggested fix converted p[te|md]_[set|clear]_flags to using
> paravirt-friendly ops but this is overkill. He suggested an alternative of
> using p[te|md]_modify in the NUMA page table operations but this does
> more work than necessary and would require looking up a VMA for protections.
>
> This patch modifies the NUMA page table operations to use paravirt-friendly
> operations to set/clear the flags of interest. Unfortunately this will take
> a performance hit when updating the PTEs on CONFIG_PARAVIRT but I do not
> see a way around it that does not break Xen.

Acked-by: David Vrabel <david.vrabel@xxxxxxxxxx>

It passed my mprotect() PROT_NONE -> PROT_READ test case, so

Tested-by: David Vrabel <david.vrabel@xxxxxxxxxx>

I'll leave it up to the x86 maintainers to decide which fix to take:
this one or the more generic "x86: use pv-ops in
{pte,pmd}_{set,clear}_flags()".

David
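
For anyone reading this in the archive, the rule quoted above (present
entries hold MFNs, non-present entries hold PFNs) can be illustrated with a
small userspace model. This is only a sketch and not the kernel code: the
P2M table contents and all function names (pv_pte_val(), pv_make_pte(),
native_clear_flags() and so on) are made up for illustration, and
PAGE_PRESENT merely stands in for _PAGE_PRESENT.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT   12
#define PAGE_PRESENT 0x1ULL                     /* stands in for _PAGE_PRESENT */
#define FLAGS_MASK   ((1ULL << PAGE_SHIFT) - 1)
#define NR_FRAMES    4

/* Toy P2M table (hypothetical values): pfn i is backed by mfn p2m[i]. */
static const uint64_t p2m[NR_FRAMES] = { 7, 5, 6, 4 };

uint64_t pfn_to_mfn(uint64_t pfn)
{
    assert(pfn < NR_FRAMES);
    return p2m[pfn];
}

uint64_t mfn_to_pfn(uint64_t mfn)
{
    for (uint64_t pfn = 0; pfn < NR_FRAMES; pfn++)
        if (p2m[pfn] == mfn)
            return pfn;
    assert(!"mfn is not owned by this toy guest");
    return 0;
}

/* "Native" flag helpers: touch the raw entry, no frame translation. */
uint64_t native_set_flags(uint64_t raw, uint64_t flags)   { return raw | flags; }
uint64_t native_clear_flags(uint64_t raw, uint64_t flags) { return raw & ~flags; }

/* Read an entry: a present entry stores an MFN, so hand back the PFN view. */
uint64_t pv_pte_val(uint64_t raw)
{
    if (raw & PAGE_PRESENT)
        return (mfn_to_pfn(raw >> PAGE_SHIFT) << PAGE_SHIFT) | (raw & FLAGS_MASK);
    return raw;                                 /* non-present: already a PFN */
}

/* Build an entry: only a present entry is converted from PFN to MFN. */
uint64_t pv_make_pte(uint64_t val)
{
    if (val & PAGE_PRESENT)
        return (pfn_to_mfn(val >> PAGE_SHIFT) << PAGE_SHIFT) | (val & FLAGS_MASK);
    return val;
}

/* Translation-aware flag helpers: go through the PFN view on every update. */
uint64_t pv_set_flags(uint64_t raw, uint64_t flags)
{
    return pv_make_pte(pv_pte_val(raw) | flags);
}

uint64_t pv_clear_flags(uint64_t raw, uint64_t flags)
{
    return pv_make_pte(pv_pte_val(raw) & ~flags);
}
```

In this model, clearing PAGE_PRESENT with native_clear_flags() leaves the
MFN behind in a non-present entry, which is the corruption described above,
while pv_clear_flags() followed by pv_set_flags() round-trips cleanly. The
extra table lookups on every update are also where the performance hit Mel
mentions would come from.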