Numa fault is a method which help to achieve auto numa balancing. When such a page fault takes place, the page fault handler will check whether the page is placed correctly. If not, migration should be involved to cut down the distance between the cpu and pages. A pte with _PAGE_NUMA help to implement numa fault. It means not to allow the MMU to access the page directly. So a page fault is triggered and numa fault handler gets the opportunity to run checker. As for the access of MMU, we need special handling for the powernv's guest. When we mark a pte with _PAGE_NUMA, we already call mmu_notifier to invalidate it in guest's htab, but when we tried to re-insert them, we firstly try to map it in real-mode. Only after this fails, we fallback to virt mode, and most of important, we run numa fault handler in virt mode. This patch guards the way of real-mode to ensure that if a pte is marked with _PAGE_NUMA, it will NOT be mapped in real mode, instead, it will be mapped in virt mode and have the opportunity to be checked with placement. Signed-off-by: Liu Ping Fan <pingfank@xxxxxxxxxxxxxxxxxx> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@xxxxxxxxxxxxxxxxxx> --- v4: more detail description --- arch/powerpc/kvm/book3s_hv_rm_mmu.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/powerpc/kvm/book3s_hv_rm_mmu.c b/arch/powerpc/kvm/book3s_hv_rm_mmu.c index 1d6c56a..8fcc363 100644 --- a/arch/powerpc/kvm/book3s_hv_rm_mmu.c +++ b/arch/powerpc/kvm/book3s_hv_rm_mmu.c @@ -234,7 +234,7 @@ long kvmppc_do_h_enter(struct kvm *kvm, unsigned long flags, pte_size = psize; pte = lookup_linux_pte_and_update(pgdir, hva, writing, &pte_size); - if (pte_present(pte)) { + if (pte_present(pte) && !pte_numa(pte)) { if (writing && !pte_write(pte)) /* make the actual HPTE be read-only */ ptel = hpte_make_readonly(ptel); -- 1.8.1.4 -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html