Hi Will, On 8/25/20 7:39 PM, Will Deacon wrote:
Hello folks, This is version three of the KVM page-table rework that I previously posted here: v1: https://lore.kernel.org/r/20200730153406.25136-1-will@xxxxxxxxxx v2: https://lore.kernel.org/r/20200818132818.16065-1-will@xxxxxxxxxx Changes since v2 include: * Rebased onto -rc2, which includes the conflicting OOM blocking fixes * Dropped the patch trying to "fix" the memcache in kvm_phys_addr_ioremap()
It's really nice work, making the code unified/simplified greatly. However, it seems it doesn't work well with HugeTLBfs. Please refer to the following test result and see if you have quick idea, or I can debug it a bit :) Note: I think the failing cases (FAIL[1] and FAIL[2]) would be caused by same issue. Machine Host Guest Result =============================================================== ThunderX2 VA_BITS: 42 PAGE_SIZE: 4KB Passed PAGE_SIZE: 64KB 64KB passed THP: disabled HugeTLB: disabled --------------------------------------------------------------- ThunderX2 VA_BITS: 42 PAGE_SIZE: 4KB Passed PAGE_SIZE: 64KB 64KB passed THP: enabled HugeTLB: disabled ---------------------------------------------------------------- ThunderX2 VA_BITS: 42 PAGE_SIZE: 4KB Fail[1] PAGE_SIZE: 64KB 64KB Fail[1] THP: disabled HugeTLB: enabled --------------------------------------------------------------- ThunderX2 VA_BITS: 39 PAGE_SIZE: 4KB Passed PAGE_SIZE: 4KB 64KB Passed THP: disabled HugeTLB: disabled --------------------------------------------------------------- ThunderX2 VA_BITS: 39 PAGE_SIZE: 4KB Passed PAGE_SIZE: 4KB 64KB Passed THP: enabled HugeTLB: disabled -------------------------------------------------------------- ThunderX2 VA_BITS: 39 PAGE_SIZE: 4KB Fail[2] PAGE_SIZE: 4KB 64KB Fail[2] THP: disabled HugeTLB: enabled NOTE: The commands used to start VM are same for FAIL[1] and FAIL[2] and the host kernel log are similar. So I don't provide the kernel log for FAIL[2]. I guess they're caused by same issue. Fail[1] =============================================================== start_vm_aarch64_hugetlbfs() { echo 16 > /sys/kernel/mm/hugepages/hugepages-524288kB/nr_hugepages /home/gavin/sandbox/qemu.main/build/qemu-system-aarch64 \ --enable-kvm -machine virt,gic-version=host \ -cpu host -smp 8,sockets=8,cores=1,threads=1 \ -m 4G -mem-prealloc -mem-path /dev/hugepages \ -monitor none -serial mon:stdio -nographic -s \ -bios /home/gavin/sandbox/qemu.main/pc-bios/edk2-aarch64-code.fd \ -kernel /home/gavin/sandbox/linux.guest/arch/arm64/boot/Image \ -initrd /home/gavin/sandbox/images/rootfs.cpio.xz \ -append "earlycon=pl011,mmio,0x9000000" \ -device virtio-net-pci,netdev=unet,mac=52:54:00:f1:26:a6 \ -netdev user,id=unet,hostfwd=tcp::50959-:22 \ -drive file=/home/gavin/sandbox/images/vm.img,if=none,format=raw,id=nvme0 \ -device nvme,drive=nvme0,serial=foo \ -drive file=/home/gavin/sandbox/images/vm1.img,if=none,format=raw,id=nvme1 \ -device nvme,drive=nvme1,serial=foo1 } [ 160.889802] Unable to handle kernel paging request at virtual address 003fffff7fc00034 [ 160.897712] Mem abort info: [ 160.900507] ESR = 0x96000004 [ 160.903550] EC = 0x25: DABT (current EL), IL = 32 bits [ 160.908848] SET = 0, FnV = 0 [ 160.911896] EA = 0, S1PTW = 0 [ 160.915024] Data abort info: [ 160.917891] ISV = 0, ISS = 0x00000004 [ 160.921722] CM = 0, WnR = 0 [ 160.924678] [003fffff7fc00034] address between user and kernel address ranges [ 160.931808] Internal error: Oops: 96000004 [#1] SMP [ 160.936676] Modules linked in: xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 nft_compat nft_counter nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables nfnetlink tun bridge stp llc rfkill ib_isert iscsi_target_mod ib_srpt target_core_mod ib_srp scsi_transport_srp ib_ipoib vfat fat ib_umad rpcrdma sunrpc rdma_ucm ib_iser rdma_cm iw_cm ib_cm libiscsi scsi_transport_iscsi ipmi_ssif qedr ib_uverbs crct10dif_ce i2c_smbus ghash_ce sha2_ce sha256_arm64 ib_core sha1_ce ipmi_devintf ipmi_msghandler thunderx2_pmu ip_tables xfs libcrc32c sg ast i2c_algo_bit drm_vram_helper drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm_ttm_helper qede ttm mpt3sas qed drm raid_class e1000e scsi_transport_sas crc8 gpio_xlp i2c_xlp9xx dm_mirror dm_region_hash dm_log dm_mod [ 161.007565] CPU: 222 PID: 4559 Comm: qemu-system-aar Not tainted 5.9.0-rc3-gavin+ #4 [ 161.015293] Hardware name: Default string MT91-FS1/MT91-FS1, BIOS 28m 12/14/2019 [ 161.022676] pstate: 60400009 (nZCv daif +PAN -UAO BTYPE=--) [ 161.028250] pc : __free_pages+0x24/0x60 [ 161.032074] lr : free_pages.part.102+0x2c/0x38 [ 161.036504] sp : fffffe0031b2f8a0 [ 161.039805] x29: fffffe0031b2f8a0 x28: 0000000040000000 [ 161.045104] x27: fffffe0031b2f9c8 x26: 0000000000000007 [ 161.050402] x25: 0000000000000003 x24: fffffe0010f16000 [ 161.055700] x23: 0000000020000000 x22: 0000000040000000 [ 161.060998] x21: 0000000000000002 x20: 0000000060000000 [ 161.066296] x19: fffffc0f1b050010 x18: 0000000000000000 [ 161.071595] x17: 0000000000000000 x16: 0000000000000000 [ 161.076893] x15: 0000000000000000 x14: 0000000000000000 [ 161.082191] x13: 0000000000000000 x12: 0000000000000001 [ 161.087489] x11: 0000000000000003 x10: 0000000000000002 [ 161.092787] x9 : fffffe001035fca4 x8 : 0000000000000007 [ 161.098085] x7 : 00000000fffffff3 x6 : fffffe0010126370 [ 161.103383] x5 : fffffe0031b2f9e8 x4 : 0000040080000000 [ 161.108681] x3 : 003fffff7fc00034 x2 : 00000000ffffffff [ 161.113979] x1 : 0000000000000000 x0 : 003fffff7fc00000 [ 161.119277] Call trace: [ 161.121713] __free_pages+0x24/0x60 [ 161.125189] free_pages.part.102+0x2c/0x38 [ 161.129272] free_pages+0x1c/0x28 [ 161.132586] stage2_map_walker+0xbc/0x218 [ 161.136584] __kvm_pgtable_walk+0xec/0x1c8 [ 161.140667] _kvm_pgtable_walk+0xa4/0xe0 [ 161.144578] kvm_pgtable_stage2_map+0xa4/0x118 [ 161.149022] kvm_handle_guest_abort+0x48c/0xa08 [ 161.153543] handle_exit+0x134/0x198 [ 161.157107] kvm_arch_vcpu_ioctl_run+0x4f0/0x880 [ 161.161721] kvm_vcpu_ioctl+0x3a8/0x808 [ 161.165546] __arm64_sys_ioctl+0x1dc/0xcf8 [ 161.169642] do_el0_svc+0xf4/0x1b8 [ 161.173039] el0_sync_handler+0xf8/0x124 [ 161.176949] el0_sync+0x140/0x180 [ 161.180254] Code: d503201f 9100d003 52800022 4b0203e2 (b8e20064) [ 161.186408] ---[ end trace d0b1b117875f8fcd ]--- [ 161.191012] Kernel panic - not syncing: Fatal exception [ 161.196247] SMP: stopping secondary CPUs [ 161.200206] Kernel Offset: 0xc0000 from 0xfffffe0010000000 [ 161.205677] PHYS_OFFSET: 0x80000000 [ 161.209154] CPU features: 0x0046002,22800c38 [ 161.213410] Memory Limit: none [ 161.216474] ---[ end Kernel panic - not syncing: Fatal exception ]--- FAIL[2] ================================================================ start_vm_aarch64_hugetlbfs() { echo 4096 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages /home/gavin/sandbox/qemu.main/build/qemu-system-aarch64 \ --enable-kvm -machine virt,gic-version=host \ -cpu host -smp 8,sockets=8,cores=1,threads=1 \ -m 4G -mem-prealloc -mem-path /dev/hugepages \ -monitor none -serial mon:stdio -nographic -s \ -bios /home/gavin/sandbox/qemu.main/pc-bios/edk2-aarch64-code.fd \ -kernel /home/gavin/sandbox/linux.guest/arch/arm64/boot/Image \ -initrd /home/gavin/sandbox/images/rootfs.cpio.xz \ -append "earlycon=pl011,mmio,0x9000000" \ -device virtio-net-pci,netdev=unet,mac=52:54:00:f1:26:a6 \ -netdev user,id=unet,hostfwd=tcp::50959-:22 \ -drive file=/home/gavin/sandbox/images/vm.img,if=none,format=raw,id=nvme0 \ -device nvme,drive=nvme0,serial=foo \ -drive file=/home/gavin/sandbox/images/vm1.img,if=none,format=raw,id=nvme1 \ -device nvme,drive=nvme1,serial=foo1 } [ 666.278391] Unable to handle kernel paging request at virtual address 03fffffefde00034 [ 666.286304] Mem abort info: [ 666.289086] ESR = 0x96000004 [ 666.292142] EC = 0x25: DABT (current EL), IL = 32 bits [ 666.297440] SET = 0, FnV = 0 [ 666.300481] EA = 0, S1PTW = 0 [ 666.303616] Data abort info: [ 666.306484] ISV = 0, ISS = 0x00000004 [ 666.310306] CM = 0, WnR = 0 [ 666.313269] [03fffffefde00034] address between user and kernel address ranges [ 666.320393] Internal error: Oops: 96000004 [#1] SMP [ 666.325259] Modules linked in: xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 nft_compat nft_counter nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables nfnetlink tun bridge stp llc rfkill ib_isert iscsi_target_mod ib_srpt target_core_mod ib_srp scsi_transport_srp ib_ipoib ib_umad vfat fat rpcrdma sunrpc rdma_ucm ib_iser rdma_cm iw_cm ib_cm libiscsi scsi_transport_iscsi qedr ib_uverbs ipmi_ssif ib_core crct10dif_ce i2c_smbus ghash_ce sha2_ce sha256_arm64 sha1_ce ipmi_devintf ipmi_msghandler thunderx2_pmu ip_tables xfs libcrc32c sg ast i2c_algo_bit drm_vram_helper drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm_ttm_helper qede ttm qed mpt3sas drm e1000e raid_class crc8 scsi_transport_sas gpio_xlp i2c_xlp9xx dm_mirror dm_region_hash dm_log dm_mod [ 666.396150] CPU: 168 PID: 42112 Comm: qemu-system-aar Not tainted 5.9.0-rc3-gavin+ #5 [ 666.403965] Hardware name: Default string MT91-FS1/MT91-FS1, BIOS 28m 12/14/2019 [ 666.411348] pstate: 60400009 (nZCv daif +PAN -UAO BTYPE=--) [ 666.416922] pc : __free_pages+0x24/0x60 [ 666.420746] lr : free_pages.part.102+0x2c/0x38 [ 666.425176] sp : ffffffc024a23840 [ 666.428477] x29: ffffffc024a23840 x28: 0000000040000000 [ 666.433776] x27: ffffffc024a239c8 x26: 0000000000000007 [ 666.439074] x25: 0000000000000003 x24: ffffffc010ec0000 [ 666.444373] x23: 0000000000200000 x22: 0000000040200000 [ 666.449671] x21: 0000000040000000 x20: ffffff8f34576000 [ 666.454969] x19: 0000000000000002 x18: 0000000000000000 [ 666.460267] x17: 0000000000000000 x16: 0000000000000000 [ 666.465565] x15: 0000000000000000 x14: 0000000000000000 [ 666.470863] x13: 0000000000000000 x12: 0000000000000001 [ 666.476161] x11: 0000000000000003 x10: 0000000000000002 [ 666.481459] x9 : ffffffc0103522f4 x8 : 0000000000000007 [ 666.486757] x7 : ffffffc0249960f8 x6 : ffffffc0101162f8 [ 666.492055] x5 : ffffffc024a239e8 x4 : 0000000000000004 [ 666.497353] x3 : 03fffffefde00034 x2 : 00000000ffffffff [ 666.502651] x1 : 0000000000000000 x0 : 03fffffefde00000 [ 666.507950] Call trace: [ 666.510385] __free_pages+0x24/0x60 [ 666.513861] free_pages.part.102+0x2c/0x38 [ 666.517945] free_pages+0x1c/0x28 [ 666.521260] stage2_map_walker+0xb0/0x208 [ 666.525257] __kvm_pgtable_walk+0xe0/0x1b8 [ 666.529340] __kvm_pgtable_walk+0xb8/0x1b8 [ 666.533424] _kvm_pgtable_walk+0xa4/0xe0 [ 666.537334] kvm_pgtable_stage2_map+0xa0/0x118 [ 666.541779] kvm_handle_guest_abort+0x48c/0xa38 [ 666.546300] handle_exit+0x134/0x198 [ 666.549864] kvm_arch_vcpu_ioctl_run+0x4f0/0x880 [ 666.554479] kvm_vcpu_ioctl+0x3a8/0x808 [ 666.558304] __arm64_sys_ioctl+0x1dc/0xcf8 [ 666.562402] do_el0_svc+0xf4/0x1b8 [ 666.565799] el0_sync_handler+0xf8/0x124 [ 666.569709] el0_sync+0x140/0x180 [ 666.573014] Code: d503201f 9100d003 52800022 4b0203e2 (b8e20064) [ 666.579197] ---[ end trace 52b60e2f408396b6 ]--- [ 666.583801] Kernel panic - not syncing: Fatal exception [ 666.589035] SMP: stopping secondary CPUs [ 666.592996] Kernel Offset: 0xb0000 from 0xffffffc010000000 [ 666.598467] PHYS_OFFSET: 0x80000000 [ 666.601944] CPU features: 0x0046002,22800c38 [ 666.606200] Memory Limit: none [ 666.609264] ---[ end Kernel panic - not syncing: Fatal exception ]--- Thanks, Gavin _______________________________________________ kvmarm mailing list kvmarm@xxxxxxxxxxxxxxxxxxxxx https://lists.cs.columbia.edu/mailman/listinfo/kvmarm