Re: [PATCH v3 00/21] KVM: arm64: Rewrite page-table code and fault handling

Hi Will,

On 9/3/20 5:34 PM, Gavin Shan wrote:
On 8/25/20 7:39 PM, Will Deacon wrote:
Hello folks,

This is version three of the KVM page-table rework that I previously posted
here:

   v1: https://lore.kernel.org/r/20200730153406.25136-1-will@xxxxxxxxxx
   v2: https://lore.kernel.org/r/20200818132818.16065-1-will@xxxxxxxxxx

Changes since v2 include:

   * Rebased onto -rc2, which includes the conflicting OOM blocking fixes
   * Dropped the patch trying to "fix" the memcache in kvm_phys_addr_ioremap()


It's really nice work; it unifies and simplifies the code greatly.
However, it doesn't seem to work well with HugeTLBfs. Please refer
to the following test results and see if you have a quick idea, or I
can debug it a bit :)


Machine      Host                    Guest               Result
================================================================
ThunderX2    VA_BITS:   42           PAGE_SIZE:  4KB     Passed
             PAGE_SIZE: 64KB                    64KB     Passed
             THP:       disabled
             HugeTLB:   disabled
----------------------------------------------------------------
ThunderX2    VA_BITS:   42           PAGE_SIZE:  4KB     Passed
             PAGE_SIZE: 64KB                    64KB     Passed
             THP:       enabled
             HugeTLB:   disabled
----------------------------------------------------------------
ThunderX2    VA_BITS:   42           PAGE_SIZE:  4KB     Fail[1]
             PAGE_SIZE: 64KB                    64KB     Fail[1]
             THP:       disabled
             HugeTLB:   enabled
----------------------------------------------------------------
ThunderX2    VA_BITS:   39           PAGE_SIZE:  4KB     Passed
             PAGE_SIZE: 4KB                     64KB     Passed
             THP:       disabled
             HugeTLB:   disabled
----------------------------------------------------------------
ThunderX2    VA_BITS:   39           PAGE_SIZE:  4KB     Passed
             PAGE_SIZE: 4KB                     64KB     Passed
             THP:       enabled
             HugeTLB:   disabled
----------------------------------------------------------------
ThunderX2    VA_BITS:   39           PAGE_SIZE:  4KB     Fail[2]
             PAGE_SIZE: 4KB                     64KB     Fail[2]
             THP:       disabled
             HugeTLB:   enabled


I debugged the code and found that the issue is caused by the following
patch.

[PATCH v3 06/21] KVM: arm64: Add support for stage-2 map()/unmap() in generic page-table

With the following change applied on top of this series, the host
kernel no longer crashes and hugetlbfs works for me. However, I don't
think it's the correct fix. I guess we still want to invalidate
the page table entry (at level 2 when the host PAGE_SIZE is 64KB) in
stage2_map_walk_table_pre(), as we're going to cut off the branch to
the subordinate tables/entries. However, stage2_map_walk_table_post()
still needs the original page table entry to release the subordinate
page properly. So I guess the proper fix would be to cache the original
page table entry in advance, or you might have a better idea :)
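
To make the caching idea concrete, here is a rough user-space model of
it. This is only a sketch, not kernel code: struct map_data, the
anchor_pte field, make_table_pte() and pte_follow() are hypothetical
names made up for illustration, and a "table entry" here simply stores
the child table's address. The pre-order hook saves the entry before
zeroing it, and the post-order hook frees the child through the saved
copy instead of through *ptep.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef uint64_t kvm_pte_t;

/* Hypothetical, cut-down stand-in for the walker's private data. */
struct map_data {
	kvm_pte_t *anchor;      /* table entry being replaced by a block */
	kvm_pte_t  anchor_pte;  /* assumption: copy of *anchor saved in pre() */
};

/* In this model a table entry just stores the child table's address. */
static kvm_pte_t make_table_pte(void *child)
{
	return (kvm_pte_t)(uintptr_t)child;
}

static void *pte_follow(kvm_pte_t pte)
{
	return (void *)(uintptr_t)pte;
}

/* Pre-order visit: save the entry, then invalidate it. */
static void walk_table_pre(kvm_pte_t *ptep, struct map_data *data)
{
	data->anchor_pte = *ptep;
	*ptep = 0;              /* models kvm_set_invalid_pte() + TLB flush */
	data->anchor = ptep;
}

/*
 * Post-order visit: free the child through the saved copy, not *ptep.
 * Returns 1 if a child table was released, 0 otherwise.
 */
static int walk_table_post(kvm_pte_t *ptep, struct map_data *data)
{
	void *child;

	if (data->anchor != ptep)
		return 0;

	child = pte_follow(data->anchor_pte);
	assert(child != NULL);  /* following *ptep here would yield NULL */
	free(child);
	data->anchor = NULL;
	data->anchor_pte = 0;
	return 1;
}
```

The real walker also visits the tables below the anchor, which this
model leaves out; the only point is that the entry is read before it is
invalidated.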

I will also reply to PATCH[06/21] to make the reply chain complete.

diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
index 6e8ca1ec12b4..f4eacfdd73cb 100644
--- a/arch/arm64/kvm/hyp/pgtable.c
+++ b/arch/arm64/kvm/hyp/pgtable.c
@@ -494,8 +494,8 @@ static int stage2_map_walk_table_pre(u64 addr, u64 end, u32 level,
        if (!kvm_block_mapping_supported(addr, end, data->phys, level))
                return 0;
-       kvm_set_invalid_pte(ptep);
-       kvm_call_hyp(__kvm_tlb_flush_vmid_ipa, data->mmu, addr, 0);
+       //kvm_set_invalid_pte(ptep);
+       //kvm_call_hyp(__kvm_tlb_flush_vmid_ipa, data->mmu, addr, 0);
        data->anchor = ptep;
        return 0;
 }

For the initial debugging, I added some printk calls and got the following
output, FYI. It indicates we're releasing the page at physical address
0x0, which is obviously incorrect.

   [  111.586180] stage2_map_walk_table_post: addr=0x40000000, end=0x60000000, level=2, anchor@0xfffffc0f191c0010, ptep@0xfffffc0f191c0010

   static int stage2_map_walk_table_post(u64 addr, u64 end, u32 level,
        if (!data->anchor)
                return 0;
+       if (*ptep == 0x0) {
+               pr_warn("%s: addr=0x%llx, end=0x%llx, level=%d, anchor@0x%lx, ptep@0x%lx\n",
+                       __func__, addr, end, level, (unsigned long)(data->anchor),
+                       (unsigned long)ptep);
+       }
+
        free_page((unsigned long)kvm_pte_follow(*ptep));
        put_page(virt_to_page(ptep));
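
As a sanity check on the log line above, the range it reports
(0x40000000 to 0x60000000) is 0x20000000 bytes, which is what one
level-2 block should cover with 64KB pages. The small sketch below
works that out. It assumes the usual arm64 level-shift formula,
(PAGE_SHIFT - 3) * (4 - level) + 3; the helper names level_shift() and
block_size() are made up here.

```c
#include <stdint.h>

/* Assumed arm64 formula: each table level resolves (page_shift - 3) bits. */
static unsigned int level_shift(unsigned int page_shift, unsigned int level)
{
	return (page_shift - 3) * (4 - level) + 3;
}

/* Bytes mapped by a single entry at the given level. */
static uint64_t block_size(unsigned int page_shift, unsigned int level)
{
	return UINT64_C(1) << level_shift(page_shift, level);
}
```

With 64KB pages (page_shift = 16) a level-2 entry covers 512MB, and
with 4KB pages (page_shift = 12) it covers 2MB.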

By the way, I've finished the code review. I'll leave the nVHE patches to Alex for his
review. I think the testing is also finished, unless you need me to do more testing.
With the issue fixed, feel free to add the following for this series:

Tested-by: Gavin Shan <gshan@xxxxxxxxxx>

Thanks,
Gavin
