Re: Potential bug in TDP MMU

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



I have managed to reliably reproduce the issue on a QEMU VM (on a host
with nested virtualisation enabled). Here are the steps:

1. Install gvisor as per
https://gvisor.dev/docs/user_guide/install/#install-latest
2. Run
$ for i in $(seq 1 100); do sudo runsc --platform=kvm --network=none
do echo ok; done

I've tried to recompile the kernel with the above patch, but
unfortunately it does fix the issue. I'm happy to try other
patches/fixes queued for 5.16-rc4

Ignat

On Tue, Nov 30, 2021 at 9:29 AM Paolo Bonzini <pbonzini@xxxxxxxxxx> wrote:
>
> On 11/29/21 22:44, Ignat Korchagin wrote:
> > Hello,
> >
> > We have recently started to evaluate 5.15.y kernel series and here is
> > what we occasionally get on kernel 5.15.5:
>
> I'm not sure if it's this, but there are quite a few fixes I've queued
> for 5.16-rc4, and I'll be sending a pull request to Linus shortly.  So
> we can revisit this in a week.
>
> This is the most likely fix:
>
> https://patchwork.kernel.org/project/kvm/patch/20211120045046.3940942-2-seanjc@xxxxxxxxxx/
>
> Paolo
>
> > [177243.621744][T1995658] WARNING: CPU: 7 PID: 1995658 at
> > arch/x86/kvm/../../../virt/kvm/kvm_main.c:171
> > kvm_is_zone_device_pfn.part.0+0x9e/0xd0 [kvm]
> > [177243.647435][T1995658] Modules linked in: xt_hashlimit xt_connlimit
> > nf_conncount ip_set_hash_netport xt_length esp4 sit ipip tunnel4
> > nft_numgen nft_ct ip_gre gre xfrm_user xfrm_algo tcp_diag udp_diag
> > inet_diag fou6 fou ip6_tunnel tunnel6 ip_tunnel ip6_udp_tunnel
> > udp_tunnel cls_bpf tls xt_NFLOG xt_statistic nft_compat veth tun
> > overlay macvlan sch_ingress raid0 md_mod essiv dm_crypt trusted
> > asn1_encoder tee dm_mod dax nfnetlink_log nft_log nft_limit
> > nft_counter nf_tables ip6table_nat ip6table_mangle ip6table_security
> > ip6table_raw ip6table_filter ip6_tables xt_nat iptable_nat nf_nat
> > xt_TCPMSS xt_u32 xt_connmark iptable_mangle xt_owner xt_CT iptable_raw
> > xt_state xt_bpf xt_mark xt_conntrack xt_multiport xt_comment xt_tcpudp
> > xt_set xt_tcpmss iptable_filter ip_set_hash_net ip_set_hash_ip ip_set
> > nfnetlink sch_fq tcp_bbr nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4
> > 8021q garp stp mrp llc skx_edac x86_pkg_temp_thermal kvm_intel kvm
> > irqbypass crc32_pclmul crc32c_intel aesni_intel rapl
> > [177243.647594][T1995658]  intel_cstate ipmi_ssif sfc intel_uncore
> > i2c_i801 xhci_pci i2c_smbus acpi_ipmi i40e mdio ioatdma i2c_core
> > xhci_hcd tpm_crb dca ipmi_si ipmi_devintf ipmi_msghandler tpm_tis
> > tpm_tis_core tpm tiny_power_button button fuse efivarfs ip_tables
> > x_tables bcmcrypt(O) crypto_simd cryptd [last unloaded: kheaders]
> > [177243.831600][T1995658] CPU: 7 PID: 1995658 Comm: exe Tainted: G
> >        O      5.15.5-cloudflare-kasan-2021.11.11 #1
> > [177243.831609][T1995658] Hardware name: Quanta Cloud Technology Inc.
> > QuantaPlex T42S-2U/T42S-2U MB (Lewisburg-4), BIOS 3B16.Q102 02/19/2020
> > [177243.831612][T1995658] RIP:
> > 0010:kvm_is_zone_device_pfn.part.0+0x9e/0xd0 [kvm]
> > [177243.886990][T1995658] Code: 00 00 00 00 fc ff df 48 c1 ea 03 0f b6
> > 14 02 48 89 e8 83 e0 07 83 c0 03 38 d0 7c 04 84 d2 75 0f 8b 43 34 85
> > c0 74 03 5b 5d c3 <0f> 0b 5b 5d c3 48 89 ef e8 c5 68 72 e1 eb e7 e8 ce
> > 68 72 e1 eb 9b
> > [177243.919270][T1995658] RSP: 0018:ffff8881ec51f300 EFLAGS: 00010246
> > [177243.919276][T1995658] RAX: 0000000000000000 RBX: ffffea003d52e900
> > RCX: ffffffffc143242e
> > [177243.919279][T1995658] RDX: 0000000000000000 RSI: 0000000000000004
> > RDI: ffffea003d52e934
> > [177243.960080][T1995658] RBP: ffffea003d52e934 R08: 0000000000000000
> > R09: ffffea003d52e937
> > [177243.960083][T1995658] R10: fffff94007aa5d26 R11: 0000000000000000
> > R12: ffff88b03ffd9008
> > [177243.960085][T1995658] R13: 0600000f54ba4b77 R14: 0000000000000001
> > R15: 0600000f54ba4b01
> > [177243.960088][T1995658] FS:  0000000000000000(0000)
> > GS:ffff8897a9b80000(0000) knlGS:0000000000000000
> > [177244.017687][T1995658] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [177244.017691][T1995658] CR2: 000000c001503000 CR3: 0000001a1fb38006
> > CR4: 00000000007726e0
> > [177244.017693][T1995658] DR0: 0000000000000000 DR1: 0000000000000000
> > DR2: 0000000000000000
> > [177244.058771][T1995658] DR3: 0000000000000000 DR6: 00000000fffe0ff0
> > DR7: 0000000000000400
> > [177244.058774][T1995658] PKRU: 00000000
> > [177244.058776][T1995658] Call Trace:
> > [177244.058780][T1995658]  <TASK>
> > [177244.058782][T1995658]  kvm_set_pfn_dirty+0x120/0x1d0 [kvm]
> > [177244.111155][T1995658]  __handle_changed_spte+0x92e/0xca0 [kvm]
> > [177244.111274][T1995658]  ? alloc_tdp_mmu_page+0x370/0x370 [kvm]
> > [177244.133938][T1995658]  ? sched_clock_cpu+0x15/0x190
> > [177244.144289][T1995658]  ? _raw_spin_lock+0xc8/0xd0
> > [177244.144299][T1995658]  __handle_changed_spte+0x63c/0xca0 [kvm]
> > [177244.165796][T1995658]  ? alloc_tdp_mmu_page+0x370/0x370 [kvm]
> > [177244.165906][T1995658]  ? kvm_vcpu_release+0x4d/0x70 [kvm]
> > [177244.187769][T1995658]  ? deref_stack_reg+0xe6/0x160
> > [177244.187779][T1995658]  __handle_changed_spte+0x63c/0xca0 [kvm]
> > [177244.209044][T1995658]  ? alloc_tdp_mmu_page+0x370/0x370 [kvm]
> > [177244.220140][T1995658]  ? update_curr+0x18d/0x5f0
> > [177244.220148][T1995658]  __handle_changed_spte+0x63c/0xca0 [kvm]
> > [177244.240784][T1995658]  ? alloc_tdp_mmu_page+0x370/0x370 [kvm]
> > [177244.251827][T1995658]  zap_gfn_range+0x549/0x620 [kvm]
> > [177244.251922][T1995658]  ? zap_collapsible_spte_range+0x520/0x520 [kvm]
> > [177244.273410][T1995658]  ? smp_call_function_single+0x271/0x370
> > [177244.283892][T1995658]  ? _raw_spin_lock+0x81/0xd0
> > [177244.283900][T1995658]  ? _raw_spin_lock_bh+0xe0/0xe0
> > [177244.283904][T1995658]  ? vmx_vcpu_pi_load+0x14c/0x3b0 [kvm_intel]
> > [177244.313033][T1995658]  kvm_tdp_mmu_put_root+0x1b6/0x270 [kvm]
> > [177244.323097][T1995658]  mmu_free_root_page+0x219/0x2c0 [kvm]
> > [177244.332862][T1995658]  ? ept_invlpg+0x780/0x780 [kvm]
> > [177244.341982][T1995658]  ? _raw_spin_lock+0xd0/0xd0
> > [177244.350556][T1995658]  ? handle_pause+0x250/0x250 [kvm_intel]
> > [177244.360080][T1995658]  kvm_mmu_free_roots+0x1b4/0x4e0 [kvm]
> > [177244.369396][T1995658]  ? mmu_free_root_page+0x2c0/0x2c0 [kvm]
> > [177244.378777][T1995658]  ?
> > kvm_clear_async_pf_completion_queue+0x2f/0x510 [kvm]
> > [177244.389571][T1995658]  kvm_mmu_unload+0x1c/0xa0 [kvm]
> > [177244.398156][T1995658]  kvm_arch_destroy_vm+0x1f2/0x5c0 [kvm]
> > [177244.398238][T1995658]  ? mmu_notifier_unregister+0x26f/0x330
> > [177244.416160][T1995658]  kvm_put_kvm+0x3b1/0x8b0 [kvm]
> > [177244.424372][T1995658]  kvm_vcpu_release+0x4e/0x70 [kvm]
> > [177244.432743][T1995658]  __fput+0x1f7/0x8c0
> > [177244.432749][T1995658]  task_work_run+0xf8/0x1a0
> > [177244.447257][T1995658]  do_exit+0x97b/0x2230
> > [177244.447263][T1995658]  ? _raw_write_lock_irqsave+0xe0/0xe0
> > [177244.462773][T1995658]  ? mm_update_next_owner+0x750/0x750
> > [177244.471117][T1995658]  ? _raw_spin_lock_irq+0x82/0xd0
> > [177244.471122][T1995658]  do_group_exit+0xda/0x2a0
> > [177244.471126][T1995658]  get_signal+0x3be/0x1e50
> > [177244.471133][T1995658]  arch_do_signal_or_restart+0x244/0x17f0
> > [177244.502424][T1995658]  ? audit_log_exit+0x2690/0x2690
> > [177244.502432][T1995658]  ? shmem_evict_inode+0xad0/0xad0
> > [177244.518403][T1995658]  ? get_sigframe_size+0x10/0x10
> > [177244.526157][T1995658]  ? __seccomp_filter+0x117/0xd60
> > [177244.526162][T1995658]  ? audit_alloc_name+0x440/0x440
> > [177244.526166][T1995658]  ? get_nth_filter.part.0+0x220/0x220
> > [177244.526170][T1995658]  ? __audit_syscall_exit+0x794/0xa80
> > [177244.558471][T1995658]  exit_to_user_mode_prepare+0xcb/0x120
> > [177244.558478][T1995658]  syscall_exit_to_user_mode+0x1d/0x40
> > [177244.575193][T1995658]  do_syscall_64+0x4d/0x90
> > [177244.575197][T1995658]  entry_SYSCALL_64_after_hwframe+0x44/0xae
> > [177244.591189][T1995658] RIP: 0033:0x4890ca
> > [177244.591199][T1995658] Code: Unable to access opcode bytes at RIP 0x4890a0.
> > [177244.591201][T1995658] RSP: 002b:000000c000508608 EFLAGS: 00000216
> > ORIG_RAX: 000000000000011d
> > [177244.607705][T1995658] RAX: 0000000000000000 RBX: 000000c00003e000
> > RCX: 00000000004890ca
> > [177244.630001][T1995658] RDX: 0000000036466000 RSI: 0000000000000003
> > RDI: 0000000000000010
> > [177244.630003][T1995658] RBP: 000000c000508660 R08: 0000000000000000
> > R09: 0000000000000000
> > [177244.630005][T1995658] R10: 000000000676b000 R11: 0000000000000216
> > R12: 0000000000000009
> > [177244.630007][T1995658] R13: 000000c00129d8b8 R14: 0000000000000001
> > R15: 0000000000000000
> > [177244.630011][T1995658]  </TASK>
> > [177244.679411][T1995658] ---[ end trace bf693a2532f213e4 ]---
> >
> > Then immediately this KASAN warning:
> >
> > [177244.798046][T1995658] BUG: KASAN: slab-out-of-bounds in
> > workingset_activation+0x2b2/0x2f0
> > [177244.809161][T1995658] Read of size 8 at addr ffff8881749ab3b8 by
> > task exe/1995658
> > [177244.819636][T1995658]
> > [177244.824947][T1995658] CPU: 7 PID: 1995658 Comm: exe Tainted: G
> >     W  O      5.15.5-cloudflare-kasan-2021.11.11 #1
> > [177244.838672][T1995658] Hardware name: Quanta Cloud Technology Inc.
> > QuantaPlex T42S-2U/T42S-2U MB (Lewisburg-4), BIOS 3B16.Q102 02/19/2020
> > [177244.857210][T1995658] Call Trace:
> > [177244.863733][T1995658]  <TASK>
> > [177244.869871][T1995658]  dump_stack_lvl+0x34/0x44
> > [177244.877583][T1995658]  print_address_description.constprop.0+0x1f/0x140
> > [177244.887430][T1995658]  ? workingset_activation+0x2b2/0x2f0
> > [177244.896156][T1995658]  ? workingset_activation+0x2b2/0x2f0
> > [177244.904809][T1995658]  kasan_report.cold+0x83/0xdf
> > [177244.912778][T1995658]  ? kvm_is_zone_device_pfn.part.0+0x40/0xd0 [kvm]
> > [177244.922553][T1995658]  ? workingset_activation+0x2b2/0x2f0
> > [177244.931236][T1995658]  workingset_activation+0x2b2/0x2f0
> > [177244.939785][T1995658]  mark_page_accessed+0x44a/0x6a0
> > [177244.948022][T1995658]  __handle_changed_spte+0x64c/0xca0 [kvm]
> > [177244.957192][T1995658]  ? alloc_tdp_mmu_page+0x370/0x370 [kvm]
> > [177244.966249][T1995658]  ? kvm_vcpu_release+0x4d/0x70 [kvm]
> > [177244.974930][T1995658]  ? deref_stack_reg+0xe6/0x160
> > [177244.983013][T1995658]  __handle_changed_spte+0x63c/0xca0 [kvm]
> > [177244.992167][T1995658]  ? alloc_tdp_mmu_page+0x370/0x370 [kvm]
> > [177245.001249][T1995658]  ? update_curr+0x18d/0x5f0
> > [177245.009268][T1995658]  __handle_changed_spte+0x63c/0xca0 [kvm]
> > [177245.018454][T1995658]  ? alloc_tdp_mmu_page+0x370/0x370 [kvm]
> > [177245.027556][T1995658]  zap_gfn_range+0x549/0x620 [kvm]
> > [177245.036055][T1995658]  ? zap_collapsible_spte_range+0x520/0x520 [kvm]
> > [177245.045907][T1995658]  ? smp_call_function_single+0x271/0x370
> > [177245.054992][T1995658]  ? _raw_spin_lock+0x81/0xd0
> > [177245.063045][T1995658]  ? _raw_spin_lock_bh+0xe0/0xe0
> > [177245.071332][T1995658]  ? vmx_vcpu_pi_load+0x14c/0x3b0 [kvm_intel]
> > [177245.080861][T1995658]  kvm_tdp_mmu_put_root+0x1b6/0x270 [kvm]
> > [177245.090147][T1995658]  mmu_free_root_page+0x219/0x2c0 [kvm]
> > [177245.099215][T1995658]  ? ept_invlpg+0x780/0x780 [kvm]
> > [177245.107719][T1995658]  ? _raw_spin_lock+0xd0/0xd0
> > [177245.115776][T1995658]  ? handle_pause+0x250/0x250 [kvm_intel]
> > [177245.124919][T1995658]  kvm_mmu_free_roots+0x1b4/0x4e0 [kvm]
> > [177245.133974][T1995658]  ? mmu_free_root_page+0x2c0/0x2c0 [kvm]
> > [177245.143143][T1995658]  ?
> > kvm_clear_async_pf_completion_queue+0x2f/0x510 [kvm]
> > [177245.153754][T1995658]  kvm_mmu_unload+0x1c/0xa0 [kvm]
> > [177245.162289][T1995658]  kvm_arch_destroy_vm+0x1f2/0x5c0 [kvm]
> > [177245.171426][T1995658]  ? mmu_notifier_unregister+0x26f/0x330
> > [177245.180637][T1995658]  kvm_put_kvm+0x3b1/0x8b0 [kvm]
> > [177245.189087][T1995658]  kvm_vcpu_release+0x4e/0x70 [kvm]
> > [177245.197744][T1995658]  __fput+0x1f7/0x8c0
> > [177245.205096][T1995658]  task_work_run+0xf8/0x1a0
> > [177245.212916][T1995658]  do_exit+0x97b/0x2230
> > [177245.220332][T1995658]  ? _raw_write_lock_irqsave+0xe0/0xe0
> > [177245.229016][T1995658]  ? mm_update_next_owner+0x750/0x750
> > [177245.237583][T1995658]  ? _raw_spin_lock_irq+0x82/0xd0
> > [177245.245731][T1995658]  do_group_exit+0xda/0x2a0
> > [177245.253265][T1995658]  get_signal+0x3be/0x1e50
> > [177245.260585][T1995658]  arch_do_signal_or_restart+0x244/0x17f0
> > [177245.269239][T1995658]  ? audit_log_exit+0x2690/0x2690
> > [177245.277218][T1995658]  ? shmem_evict_inode+0xad0/0xad0
> > [177245.285295][T1995658]  ? get_sigframe_size+0x10/0x10
> > [177245.293162][T1995658]  ? __seccomp_filter+0x117/0xd60
> > [177245.301157][T1995658]  ? audit_alloc_name+0x440/0x440
> > [177245.309224][T1995658]  ? get_nth_filter.part.0+0x220/0x220
> > [177245.317678][T1995658]  ? __audit_syscall_exit+0x794/0xa80
> > [177245.325968][T1995658]  exit_to_user_mode_prepare+0xcb/0x120
> > [177245.334463][T1995658]  syscall_exit_to_user_mode+0x1d/0x40
> > [177245.342837][T1995658]  do_syscall_64+0x4d/0x90
> > [177245.350176][T1995658]  entry_SYSCALL_64_after_hwframe+0x44/0xae
> > [177245.358991][T1995658] RIP: 0033:0x4890ca
> > [177245.365778][T1995658] Code: Unable to access opcode bytes at RIP 0x4890a0.
> > [177245.375596][T1995658] RSP: 002b:000000c000508608 EFLAGS: 00000216
> > ORIG_RAX: 000000000000011d
> > [177245.387085][T1995658] RAX: 0000000000000000 RBX: 000000c00003e000
> > RCX: 00000000004890ca
> > [177245.387092][T1995658] RDX: 0000000036466000 RSI: 0000000000000003
> > RDI: 0000000000000010
> > [177245.387096][T1995658] RBP: 000000c000508660 R08: 0000000000000000
> > R09: 0000000000000000
> > [177245.387100][T1995658] R10: 000000000676b000 R11: 0000000000000216
> > R12: 0000000000000009
> > [177245.387103][T1995658] R13: 000000c00129d8b8 R14: 0000000000000001
> > R15: 0000000000000000
> > [177245.441910][T1995658]  </TASK>
> > [177245.447826][T1995658]
> > [177245.447827][T1995658] Allocated by task 182586:
> > [177245.447830][T1995658]  kasan_save_stack+0x20/0x50
> > [177245.467811][T1995658]  __kasan_kmalloc+0xa4/0xd0
> > [177245.475233][T1995658]  fib6_info_alloc+0xa2/0x1d0
> > [177245.475240][T1995658]  ip6_route_info_create+0x29f/0x1a30
> > [177245.475243][T1995658]  ip6_route_add+0x18/0x100
> > [177245.475246][T1995658]  addrconf_add_mroute+0x157/0x1b0
> > [177245.506016][T1995658]  addrconf_notify+0x6a3/0x1510
> > [177245.506021][T1995658]  notifier_call_chain+0x9e/0x180
> > [177245.521427][T1995658]  __dev_notify_flags+0xda/0x230
> > [177245.529175][T1995658]  rtnl_configure_link+0x125/0x200
> > [177245.529181][T1995658]  __rtnl_newlink+0xd3d/0x13f0
> > [177245.529186][T1995658]  rtnl_newlink+0x5f/0x90
> > [177245.551464][T1995658]  rtnetlink_rcv_msg+0x378/0xa40
> > [177245.551469][T1995658]  netlink_rcv_skb+0x125/0x380
> > [177245.551472][T1995658]  netlink_unicast+0x4d0/0x7a0
> > [177245.573756][T1995658]  netlink_sendmsg+0x724/0xc00
> > [177245.573760][T1995658]  sock_sendmsg+0xe2/0x110
> > [177245.573764][T1995658]  __sys_sendto+0x1a8/0x270
> > [177245.595271][T1995658]  __x64_sys_sendto+0xdd/0x1b0
> > [177245.602600][T1995658]  do_syscall_64+0x40/0x90
> > [177245.609544][T1995658]  entry_SYSCALL_64_after_hwframe+0x44/0xae
> > [177245.609549][T1995658]
> > [177245.609550][T1995658] The buggy address belongs to the object at
> > ffff8881749ab200
> > [177245.609550][T1995658]  which belongs to the cache kmalloc-256 of size 256
> > [177245.609554][T1995658] The buggy address is located 184 bytes to the right of
> > [177245.609554][T1995658]  256-byte region [ffff8881749ab200, ffff8881749ab300)
> > [177245.609558][T1995658] The buggy address belongs to the page:
> > [177245.609559][T1995658] page:000000009030f8e1 refcount:1 mapcount:0
> > mapping:0000000000000000 index:0x0 pfn:0x1749a8
> > [177245.682919][T1995658] head:000000009030f8e1 order:2
> > compound_mapcount:0 compound_pincount:0
> > [177245.682924][T1995658] flags:
> > 0x2ffff800010200(slab|head|node=0|zone=2|lastcpupid=0x1ffff)
> > [177245.705180][T1995658] raw: 002ffff800010200 dead000000000100
> > dead000000000122 ffff88810004cb40
> > [177245.716738][T1995658] raw: 0000000000000000 0000000000200020
> > 00000001ffffffff 0000000000000000
> > [177245.716741][T1995658] page dumped because: kasan: bad access detected
> > [177245.716743][T1995658]
> > [177245.716744][T1995658] Memory state around the buggy address:
> > [177245.716747][T1995658]  ffff8881749ab280: 00 00 00 00 00 00 00 00
> > 00 00 00 00 00 00 00 00
> > [177245.716749][T1995658]  ffff8881749ab300: fc fc fc fc fc fc fc fc
> > fc fc fc fc fc fc fc fc
> > [177245.716752][T1995658] >ffff8881749ab380: fc fc fc fc fc fc fc fc
> > fc fc fc fc fc fc fc fc
> > [177245.785224][T1995658]                                         ^
> > [177245.785229][T1995658]  ffff8881749ab400: 00 00 00 00 00 00 00 00
> > 00 00 00 00 00 00 00 00
> > [177245.785231][T1995658]  ffff8881749ab480: 00 00 00 00 00 00 00 00
> > 00 00 00 00 00 00 fc fc
> > [177245.816146][T1995658]
> > ==================================================================
> >
> > And after that:
> >
> > [177245.816196][T1995658] general protection fault, probably for
> > non-canonical address 0xdffffc0000000011: 0000 [#1] SMP KASAN PTI
> > [177245.854054][T1995658] KASAN: null-ptr-deref in range
> > [0x0000000000000088-0x000000000000008f]
> > [177245.865836][T1995658] CPU: 7 PID: 1995658 Comm: exe Tainted: G
> > B   W  O      5.15.5-cloudflare-kasan-2021.11.11 #1
> > [177245.865842][T1995658] Hardware name: Quanta Cloud Technology Inc.
> > QuantaPlex T42S-2U/T42S-2U MB (Lewisburg-4), BIOS 3B16.Q102 02/19/2020
> > [177245.865845][T1995658] RIP: 0010:workingset_activation+0x175/0x2f0
> > [177245.909096][T1995658] Code: 80 3c 02 00 0f 85 58 01 00 00 4a 8b ac
> > ed b8 0f 00 00 48 8d bd 88 00 00 00 48 b8 00 00 00 00 00 fc ff df 48
> > 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 0d 01 00 00 4c 3b a5 88 00 00 00
> > 0f 85 c3 00 00
> > [177245.936452][T1995658] RSP: 0018:ffff8881ec51f3c0 EFLAGS: 00010206
> > [177245.936457][T1995658] RAX: dffffc0000000000 RBX: ffffea003237e480
> > RCX: ffffffffa25388a6
> > [177245.936460][T1995658] RDX: 0000000000000011 RSI: 0000000000000246
> > RDI: 0000000000000088
> > [177245.970834][T1995658] RBP: 0000000000000000 R08: 0000000000000001
> > R09: ffffffffa6ced907
> > [177245.970837][T1995658] R10: fffffbfff4d9db20 R11: 0000000000000010
> > R12: ffff88983ffde000
> > [177245.970839][T1995658] R13: 0000000000000000 R14: ffff8897a9bbbde0
> > R15: 0600000c8df93b77
> > [177246.007059][T1995658] FS:  0000000000000000(0000)
> > GS:ffff8897a9b80000(0000) knlGS:0000000000000000
> > [177246.020326][T1995658] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [177246.020330][T1995658] CR2: 000000c001503000 CR3: 0000001a1fb38006
> > CR4: 00000000007726e0
> > [177246.020332][T1995658] DR0: 0000000000000000 DR1: 0000000000000000
> > DR2: 0000000000000000
> > [177246.020334][T1995658] DR3: 0000000000000000 DR6: 00000000fffe0ff0
> > DR7: 0000000000000400
> > [177246.020337][T1995658] PKRU: 00000000
> > [177246.020338][T1995658] Call Trace:
> > [177246.020341][T1995658]  <TASK>
> > [177246.090639][T1995658]  mark_page_accessed+0x44a/0x6a0
> > [177246.099947][T1995658]  __handle_changed_spte+0x64c/0xca0 [kvm]
> > [177246.110177][T1995658]  ? alloc_tdp_mmu_page+0x370/0x370 [kvm]
> > [177246.110274][T1995658]  ? kvm_vcpu_release+0x4d/0x70 [kvm]
> > [177246.129880][T1995658]  ? deref_stack_reg+0xe6/0x160
> > [177246.138925][T1995658]  __handle_changed_spte+0x63c/0xca0 [kvm]
> > [177246.139020][T1995658]  ? alloc_tdp_mmu_page+0x370/0x370 [kvm]
> > [177246.158926][T1995658]  ? update_curr+0x18d/0x5f0
> > [177246.167666][T1995658]  __handle_changed_spte+0x63c/0xca0 [kvm]
> > [177246.167762][T1995658]  ? alloc_tdp_mmu_page+0x370/0x370 [kvm]
> > [177246.187453][T1995658]  zap_gfn_range+0x549/0x620 [kvm]
> > [177246.196673][T1995658]  ? zap_collapsible_spte_range+0x520/0x520 [kvm]
> > [177246.196781][T1995658]  ? smp_call_function_single+0x271/0x370
> > [177246.196789][T1995658]  ? _raw_spin_lock+0x81/0xd0
> > [177246.196795][T1995658]  ? _raw_spin_lock_bh+0xe0/0xe0
> > [177246.234516][T1995658]  ? vmx_vcpu_pi_load+0x14c/0x3b0 [kvm_intel]
> > [177246.244565][T1995658]  kvm_tdp_mmu_put_root+0x1b6/0x270 [kvm]
> > [177246.244680][T1995658]  mmu_free_root_page+0x219/0x2c0 [kvm]
> > [177246.263832][T1995658]  ? ept_invlpg+0x780/0x780 [kvm]
> > [177246.263923][T1995658]  ? _raw_spin_lock+0xd0/0xd0
> > [177246.281421][T1995658]  ? handle_pause+0x250/0x250 [kvm_intel]
> > [177246.291062][T1995658]  kvm_mmu_free_roots+0x1b4/0x4e0 [kvm]
> > [177246.291156][T1995658]  ? mmu_free_root_page+0x2c0/0x2c0 [kvm]
> > [177246.310225][T1995658]  ?
> > kvm_clear_async_pf_completion_queue+0x2f/0x510 [kvm]
> > [177246.310297][T1995658]  kvm_mmu_unload+0x1c/0xa0 [kvm]
> > [177246.330051][T1995658]  kvm_arch_destroy_vm+0x1f2/0x5c0 [kvm]
> > [177246.339422][T1995658]  ? mmu_notifier_unregister+0x26f/0x330
> > [177246.339434][T1995658]  kvm_put_kvm+0x3b1/0x8b0 [kvm]
> > [177246.357081][T1995658]  kvm_vcpu_release+0x4e/0x70 [kvm]
> > [177246.365745][T1995658]  __fput+0x1f7/0x8c0
> > [177246.365751][T1995658]  task_work_run+0xf8/0x1a0
> > [177246.380735][T1995658]  do_exit+0x97b/0x2230
> > [177246.388056][T1995658]  ? _raw_write_lock_irqsave+0xe0/0xe0
> > [177246.388062][T1995658]  ? mm_update_next_owner+0x750/0x750
> > [177246.388067][T1995658]  ? _raw_spin_lock_irq+0x82/0xd0
> > [177246.413105][T1995658]  do_group_exit+0xda/0x2a0
> > [177246.420604][T1995658]  get_signal+0x3be/0x1e50
> > [177246.427936][T1995658]  arch_do_signal_or_restart+0x244/0x17f0
> > [177246.427943][T1995658]  ? audit_log_exit+0x2690/0x2690
> > [177246.444593][T1995658]  ? shmem_evict_inode+0xad0/0xad0
> > [177246.444601][T1995658]  ? get_sigframe_size+0x10/0x10
> > [177246.460403][T1995658]  ? __seccomp_filter+0x117/0xd60
> > [177246.468231][T1995658]  ? audit_alloc_name+0x440/0x440
> > [177246.468236][T1995658]  ? get_nth_filter.part.0+0x220/0x220
> > [177246.468239][T1995658]  ? __audit_syscall_exit+0x794/0xa80
> > [177246.468244][T1995658]  exit_to_user_mode_prepare+0xcb/0x120
> > [177246.486130][T2343516] BUG: Bad page state in process dnsdist  pfn:3ac443
> > [177246.492473][T1995658]  syscall_exit_to_user_mode+0x1d/0x40
> > [177246.500835][T2343516] page:000000000df8ed4b refcount:0 mapcount:0
> > mapping:0000000000000000 index:0x1 pfn:0x3ac443
> > [177246.510416][T1995658]  do_syscall_64+0x4d/0x90
> > [177246.518772][T2343516] flags:
> > 0x2ffff80000000a(referenced|dirty|node=0|zone=2|lastcpupid=0x1ffff)
> > [177246.532071][T1995658]  entry_SYSCALL_64_after_hwframe+0x44/0xae
> > [177246.539574][T2343516] raw: 002ffff80000000a dead000000000100
> > dead000000000122 0000000000000000
> > [177246.551516][T1995658] RIP: 0033:0x4890ca
> > [177246.560640][T2343516] raw: 0000000000000001 0000000000000000
> > 00000000ffffffff 0000000000000000
> > [177246.572553][T1995658] Code: Unable to access opcode bytes at RIP 0x4890a0.
> > [177246.572556][T1995658] RSP: 002b:000000c000508608 EFLAGS: 00000216
> > [177246.579708][T2343516] page dumped because:
> > PAGE_FLAGS_CHECK_AT_PREP flag(s) set
> > [177246.591667][T1995658]  ORIG_RAX: 000000000000011d
> > [177246.601947][T2343516] Modules linked in: xt_hashlimit
> > [177246.611442][T1995658] RAX: 0000000000000000 RBX: 000000c00003e000
> > RCX: 00000000004890ca
> > [177246.622186][T2343516]  xt_connlimit nf_conncount
> > [177246.630505][T1995658] RDX: 0000000036466000 RSI: 0000000000000003
> > RDI: 0000000000000010
> > [177246.639127][T2343516]  ip_set_hash_netport xt_length
> > [177246.650831][T1995658] RBP: 000000c000508660 R08: 0000000000000000
> > R09: 0000000000000000
> > [177246.659146][T2343516]  esp4 sit
> > [177246.670801][T1995658] R10: 000000000676b000 R11: 0000000000000216
> > R12: 0000000000000009
> > [177246.679548][T2343516]  ipip tunnel4
> > [177246.691327][T1995658] R13: 000000c00129d8b8 R14: 0000000000000001
> > R15: 0000000000000000
> > [177246.691332][T1995658]  </TASK>
> > [177246.698177][T2343516]  nft_numgen
> > [177246.710054][T1995658] Modules linked in: xt_hashlimit
> > [177246.717376][T2343516]  nft_ct ip_gre
> > [177246.729337][T1995658]  xt_connlimit nf_conncount
> > [177246.736385][T2343516]  gre xfrm_user
> > [177246.743566][T1995658]  ip_set_hash_netport xt_length
> > [177246.752511][T2343516]  xfrm_algo tcp_diag
> > [177246.759923][T1995658]  esp4 sit
> > [177246.768447][T2343516]  udp_diag inet_diag
> > [177246.775861][T1995658]  ipip tunnel4
> > [177246.784702][T2343516]  fou6 fou
> > [177246.792590][T1995658]  nft_numgen nft_ct
> > [177246.799633][T2343516]  ip6_tunnel tunnel6
> > [177246.807521][T1995658]  ip_gre gre
> > [177246.814891][T2343516]  ip_tunnel
> > [177246.821906][T1995658]  xfrm_user xfrm_algo
> > [177246.829831][T2343516]  ip6_udp_tunnel udp_tunnel
> > [177246.837687][T1995658]  tcp_diag udp_diag
> > [177246.844899][T2343516]  cls_bpf tls
> > [177246.851988][T1995658]  inet_diag fou6
> > [177246.860054][T2343516]  xt_NFLOG
> > [177246.868476][T1995658]  fou ip6_tunnel
> > [177246.876205][T2343516]  xt_statistic
> > [177246.883435][T1995658]  tunnel6 ip_tunnel
> > [177246.890967][T2343516]  nft_compat
> > [177246.897928][T1995658]  ip6_udp_tunnel udp_tunnel
> > [177246.905366][T2343516]  veth
> > [177246.912707][T1995658]  cls_bpf tls
> > [177246.920424][T2343516]  tun overlay
> > [177246.927451][T1995658]  xt_NFLOG xt_statistic
> > [177246.935692][T2343516]  macvlan
> > [177246.941990][T1995658]  nft_compat veth
> > [177246.948848][T2343516]  sch_ingress raid0
> > [177246.955617][T1995658]  tun overlay
> > [177246.963226][T2343516]  md_mod essiv
> > [177246.969540][T1995658]  macvlan sch_ingress
> > [177246.976443][T2343516]  dm_crypt trusted
> > [177246.983523][T1995658]  raid0 md_mod
> > [177246.990026][T2343516]  asn1_encoder tee
> > [177246.996500][T1995658]  essiv dm_crypt
> > [177247.003663][T2343516]  dm_mod dax
> > [177247.010435][T1995658]  trusted asn1_encoder
> > [177247.016794][T2343516]  nfnetlink_log nft_log
> > [177247.023445][T1995658]  tee dm_mod
> > [177247.029892][T2343516]  nft_limit
> > [177247.035861][T1995658]  dax nfnetlink_log
> > [177247.042663][T2343516]  nft_counter nf_tables
> > [177247.049640][T1995658]  nft_log
> > [177247.055470][T2343516]  ip6table_nat
> > [177247.061129][T1995658]  nft_limit nft_counter
> > [177247.067401][T2343516]  ip6table_mangle
> > [177247.073987][T1995658]  nf_tables ip6table_nat
> > [177247.079498][T2343516]  ip6table_security ip6table_raw
> > [177247.085277][T1995658]  ip6table_mangle
> > [177247.091991][T2343516]  ip6table_filter ip6_tables
> > [177247.098034][T1995658]  ip6table_security ip6table_raw
> > [177247.104695][T2343516]  xt_nat iptable_nat
> > [177247.112034][T1995658]  ip6table_filter ip6_tables
> > [177247.118039][T2343516]  nf_nat xt_TCPMSS
> > [177247.125004][T1995658]  xt_nat iptable_nat
> > [177247.132376][T2343516]  xt_u32
> > [177247.138705][T1995658]  nf_nat xt_TCPMSS xt_u32
> > [177247.145715][T2343516]  xt_connmark
> > [177247.151843][T1995658]  xt_connmark iptable_mangle xt_owner
> > [177247.158146][T2343516]  iptable_mangle
> > [177247.163422][T1995658]  xt_CT iptable_raw
> > [177247.170185][T2343516]  xt_owner xt_CT
> > [177247.175921][T1995658]  xt_state xt_bpf
> > [177247.183751][T2343516]  iptable_raw xt_state
> > [177247.189765][T1995658]  xt_mark xt_conntrack
> > [177247.196014][T2343516]  xt_bpf xt_mark
> > [177247.202024][T1995658]  xt_multiport xt_comment xt_tcpudp
> > [177247.208091][T2343516]  xt_conntrack
> > [177247.214620][T1995658]  xt_set xt_tcpmss iptable_filter
> > [177247.221195][T2343516]  xt_multiport
> > [177247.227181][T1995658]  ip_set_hash_net
> > [177247.234862][T2343516]  xt_comment xt_tcpudp
> > [177247.240694][T1995658]  ip_set_hash_ip ip_set
> > [177247.248244][T2343516]  xt_set xt_tcpmss
> > [177247.254108][T1995658]  nfnetlink sch_fq
> > [177247.260257][T2343516]  iptable_filter ip_set_hash_net
> > [177247.266823][T1995658]  tcp_bbr nf_conntrack
> > [177247.273564][T2343516]  ip_set_hash_ip ip_set
> > [177247.279798][T1995658]  nf_defrag_ipv6 nf_defrag_ipv4
> > [177247.286054][T2343516]  nfnetlink sch_fq
> > [177247.293537][T1995658]  8021q garp
> > [177247.300146][T2343516]  tcp_bbr nf_conntrack
> > [177247.306851][T1995658]  stp mrp
> > [177247.314280][T2343516]  nf_defrag_ipv6
> > [177247.320726][T1995658]  llc skx_edac
> > [177247.326501][T2343516]  nf_defrag_ipv4
> > [177247.333200][T1995658]  x86_pkg_temp_thermal
> > [177247.338746][T2343516]  8021q garp
> > [177247.344900][T1995658]  kvm_intel kvm
> > [177247.350878][T2343516]  stp mrp
> > [177247.356981][T1995658]  irqbypass crc32_pclmul
> > [177247.363642][T2343516]  llc skx_edac
> > [177247.369407][T1995658]  crc32c_intel aesni_intel
> > [177247.375433][T2343516]  x86_pkg_temp_thermal kvm_intel
> > [177247.380949][T1995658]  rapl intel_cstate
> > [177247.387818][T2343516]  kvm irqbypass
> > [177247.393792][T1995658]  ipmi_ssif sfc
> > [177247.400823][T2343516]  crc32_pclmul crc32c_intel
> > [177247.408402][T1995658]  intel_uncore
> > [177247.414816][T2343516]  aesni_intel rapl
> > [177247.420895][T1995658]  i2c_i801 xhci_pci
> > [177247.426998][T2343516]  intel_cstate
> > [177247.434174][T1995658]  i2c_smbus acpi_ipmi
> > [177247.440133][T2343516]  ipmi_ssif
> > [177247.446447][T1995658]  i40e mdio
> > [177247.452828][T2343516]  sfc intel_uncore
> > [177247.458764][T1995658]  ioatdma i2c_core xhci_hcd tpm_crb
> > [177247.465315][T2343516]  i2c_i801
> > [177247.470986][T1995658]  dca ipmi_si
> > [177247.476630][T2343516]  xhci_pci i2c_smbus
> > [177247.482924][T1995658]  ipmi_devintf ipmi_msghandler
> > [177247.490884][T2343516]  acpi_ipmi
> > [177247.496448][T1995658]  tpm_tis tpm_tis_core
> > [177247.502266][T2343516]  i40e
> > [177247.508870][T1995658]  tpm tiny_power_button
> > [177247.516227][T2343516]  mdio
> > [177247.521960][T1995658]  button fuse
> > [177247.528752][T2343516]  ioatdma i2c_core
> > [177247.534017][T1995658]  efivarfs
> > [177247.540736][T2343516]  xhci_hcd tpm_crb
> > [177247.545945][T1995658]  ip_tables x_tables
> > [177247.551808][T2343516]  dca ipmi_si
> > [177247.558097][T1995658]  bcmcrypt(O)
> > [177247.563798][T2343516]  ipmi_devintf ipmi_msghandler
> > [177247.570085][T1995658]  crypto_simd
> > [177247.576506][T2343516]  tpm_tis tpm_tis_core
> > [177247.582311][T1995658]  cryptd [last unloaded: kheaders]
> > [177247.582349][T1995658] ---[ end trace bf693a2532f213e5 ]---
> > [177247.588090][T2343516]  tpm tiny_power_button button fuse efivarfs ip_tables
> > [177247.671277][T1995658] RIP: 0010:workingset_activation+0x175/0x2f0
> > [177247.673366][T2343516]  x_tables bcmcrypt(O) crypto_simd
> > [177247.682978][T1995658] Code: 80 3c 02 00 0f 85 58 01 00 00 4a 8b ac
> > ed b8 0f 00 00 48 8d bd 88 00 00 00 48 b8 00 00 00 00 00 fc ff df 48
> > 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 0d 01 00 00 4c 3b a5 88 00 00 00
> > 0f 85 c3 00 00
> > [177247.691625][T2343516]  cryptd [last unloaded: kheaders]
> > [177247.691633][T2343516] CPU: 30 PID: 2343516 Comm: dnsdist Tainted:
> > G    B D W  O      5.15.5-cloudflare-kasan-2021.11.11 #1
> > [177247.691639][T2343516] Hardware name: Quanta Cloud Technology Inc.
> > QuantaPlex T42S-2U/T42S-2U MB (Lewisburg-4), BIOS 3B16.Q102 02/19/2020
> > [177247.691642][T2343516] Call Trace:
> > [177247.691645][T2343516]  <TASK>
> > [177247.699575][T1995658] RSP: 0018:ffff8881ec51f3c0 EFLAGS: 00010206
> > [177247.724584][T2343516]  dump_stack_lvl+0x34/0x44
> > [177247.724595][T2343516]  bad_page.cold+0xc0/0xe1
> > [177247.732606][T1995658]
> > [177247.746616][T2343516]  rmqueue_bulk+0x8e5/0xe00
> > [177247.746627][T2343516]  ? find_suitable_fallback+0x470/0x470
> > [177247.764885][T1995658] RAX: dffffc0000000000 RBX: ffffea003237e480
> > RCX: ffffffffa25388a6
> > [177247.771268][T2343516]  get_page_from_freelist+0x18ff/0x2920
> > [177247.771279][T2343516]  ? __zone_watermark_ok+0x340/0x340
> > [177247.777285][T1995658] RDX: 0000000000000011 RSI: 0000000000000246
> > RDI: 0000000000000088
> > [177247.786537][T2343516]  __alloc_pages+0x2ac/0x5b0
> > [177247.786543][T2343516]  ? __alloc_pages_slowpath.constprop.0+0x1e20/0x1e20
> > [177247.786548][T2343516]  alloc_pages_vma+0xbc/0x570
> > [177247.794244][T1995658] RBP: 0000000000000000 R08: 0000000000000001
> > R09: ffffffffa6ced907
> > [177247.801816][T2343516]  __handle_mm_fault+0x14d6/0x3a00
> > [177247.801822][T2343516]  ? vm_iomap_memory+0x1d0/0x1d0
> > [177247.801827][T2343516]  ? down_read_trylock+0xeb/0x180
> > [177247.807247][T1995658] R10: fffffbfff4d9db20 R11: 0000000000000010
> > R12: ffff88983ffde000
> > [177247.814873][T2343516]  handle_mm_fault+0x1cc/0x650
> > [177247.814878][T2343516]  do_user_addr_fault+0x303/0xd40
> > [177247.814885][T2343516]  exc_page_fault+0x52/0xb0
> > [177247.823577][T1995658] R13: 0000000000000000 R14: ffff8897a9bbbde0
> > R15: 0600000c8df93b77
> > [177247.834904][T2343516]  ? asm_exc_page_fault+0x8/0x30
> > [177247.834911][T2343516]  asm_exc_page_fault+0x1e/0x30
> > [177247.834915][T2343516] RIP: 0033:0x557a8ea64f19
> > [177247.843738][T1995658] FS:  0000000000000000(0000)
> > GS:ffff8897a9b80000(0000) knlGS:0000000000000000
> > [177247.852290][T2343516] Code: c7 80 04 ff ff ff 00 00 00 00 c7 80 18
> > ff ff ff 00 00 00 00 48 c7 80 20 ff ff ff 00 00 00 00 48 c7 80 28 ff
> > ff ff 00 00 00 00 <c6> 80 30 ff ff ff 01 c6 80 38 ff ff ff 01 c6 80 39
> > ff ff ff 00 48
> > [177247.852295][T2343516] RSP: 002b:00007fff6f3f11c0 EFLAGS: 00010246
> > [177247.852299][T2343516] RAX: 0000557a9840d0d0 RBX: 0000557a92334cc0
> > RCX: 0000000000000000
> > [177247.852301][T2343516] RDX: ffffffffffffffff RSI: 0000000000000000
> > RDI: 0000000000000000
> > [177247.852304][T2343516] RBP: 00000000000099d4 R08: 0000000000000000
> > R09: 0000000000000000
> > [177247.863709][T1995658] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [177247.871673][T2343516] R10: 0000000000000000 R11: 0000000000000000
> > R12: 0000557a991452f0
> > [177247.871676][T2343516] R13: 00007fff6f3f1420 R14: 00007fff6f3f1440
> > R15: 0000000000000001
> > [177247.871680][T2343516]  </TASK>
> >
> > After this the machine starts spitting some traces starting with:
> >
> > [177247.871683][T2343516] BUG: Bad page state in process <some comm
> > name>  pfn:fe680a
> >
> > And eventually gradually locks up:
> >
> > NMI watchdog: Watchdog detected hard LOCKUP on cpu 81
> >
> > The comment in kvm_main.c before the code mentioned in the first
> > warning states that the warning is there to indicate incorrect usage
> > of the function - and probably it is, given the consequences.
> >
> > About our workload: this bug is most likely triggered by gvisor [1]
> > with the KVM backend as we don't have any other KVM users on these
> > systems.
> >
> > We suspect it was not triggered before as kernels before 5.15 did not
> > have TDP MMU enabled by default [2].
> >
> > It seems we even want to remove this warning as overaggressive [3],
> > however it is indicative in this case.
> >
> > Unfortunately, I couldn't easily reproduce the issue synthetically
> > (tried both running the KVM selftests as well as gvisor KVM tests).
> > Any help/pointers would be appreciated.
> >
> > [1]: https://github.com/google/gvisor
> > [2]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=71ba3f3189c78f756a659568fb473600fd78f207
> > [3]: https://lore.kernel.org/kvm/20211129034317.2964790-5-stevensd@xxxxxxxxxx/
> >
>



[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux