On Wed, Jan 22, 2025, Srikanth Aithal wrote:
> Hello all,
>
> While running kvm selftests on AMD EPYC platform with 6.13.0-next-20250121
> below general protection fault is being hit.
>
> Jan 22 00:45:35 kernel: Oops: general protection fault, probably for
> non-canonical address 0xe659260b3c31e5e0: 0000 [#1] PREEMPT SMP NOPTI
> Jan 22 00:45:35 kernel: CPU: 113 UID: 0 PID: 143333 Comm: memslot_perf_te
> Not tainted 6.13.0-next-20250121-f066b5a6c7-98baed10f3f #1
> Jan 22 00:45:35 kernel: Hardware name: Dell Inc. PowerEdge R6515/07PXPY,
> BIOS 2.14.1 12/17/2023
> Jan 22 00:45:35 kernel: RIP: 0010:__kmalloc_node_noprof+0xff/0x490
> Jan 22 00:45:35 kernel: Code: 0f 84 0b 01 00 00 84 c9 0f 85 03 01 00 00 41
> 83 fb ff 0f 85 e9 00 00 00 41 bb ff ff ff ff 41 8b 44 24 28 49 8b 34 24 48
> 01 f8 <48> 8b 18 48 89 c1 49 33 9c 24 b8 00 00 00 48 89 f8 48 0f c9 48 31
> Jan 22 00:45:35 kernel: RSP: 0018:ffffa77176403ab0 EFLAGS: 00010282
> Jan 22 00:45:35 kernel: RAX: e659260b3c31e5e0 RBX: ffffed7142251180 RCX:
> 0000000000000000
> Jan 22 00:45:35 kernel: RDX: 0000000003106071 RSI: 000000000003b080 RDI:
> e659260b3c31e5e0
> Jan 22 00:45:35 kernel: RBP: ffffa77176403b10 R08: 0000000000000000 R09:
> ffffa771c9605000
> Jan 22 00:45:35 kernel: R10: ffffa77176403b28 R11: 00000000ffffffff R12:
> ffff92a240044400
> Jan 22 00:45:35 kernel: R13: 0000000000000008 R14: 00000000ffffffff R15:
> 0000000000000dc0
> Jan 22 00:45:35 kernel: FS: 00007f91abd0d740(0000)
> GS:ffff92e13e880000(0000) knlGS:0000000000000000
> Jan 22 00:45:35 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> Jan 22 00:45:35 kernel: CR2: 000000002346c3c8 CR3: 000000223fbb6004 CR4:
> 0000000000770ef0
> Jan 22 00:45:35 kernel: PKRU: 55555554
> Jan 22 00:45:35 kernel: Call Trace:
> Jan 22 00:45:35 kernel: <TASK>
> Jan 22 00:45:35 kernel: ? show_regs+0x6d/0x80
> Jan 22 00:45:35 kernel: ? die_addr+0x3c/0xa0
> Jan 22 00:45:35 kernel: ? exc_general_protection+0x248/0x470
> Jan 22 00:45:35 kernel: ? asm_exc_general_protection+0x2b/0x30
> Jan 22 00:45:35 kernel: ? __kmalloc_node_noprof+0xff/0x490
> Jan 22 00:45:35 kernel: ? srso_alias_return_thunk+0x5/0xfbef5
> Jan 22 00:45:35 kernel: ? __get_vm_area_node+0xd2/0x140
> Jan 22 00:45:35 kernel: ? __vmalloc_node_range_noprof+0x2ec/0x7f0
> Jan 22 00:45:35 kernel: __vmalloc_node_range_noprof+0x2ec/0x7f0
> Jan 22 00:45:35 kernel: ? __vmalloc_node_range_noprof+0x2ec/0x7f0
> Jan 22 00:45:35 kernel: ? __vcalloc_noprof+0x26/0x40
> Jan 22 00:45:35 kernel: __vmalloc_noprof+0x4d/0x60
> Jan 22 00:45:35 kernel: ? __vcalloc_noprof+0x26/0x40
> Jan 22 00:45:35 kernel: __vcalloc_noprof+0x26/0x40
> Jan 22 00:45:35 kernel: kvm_arch_prepare_memory_region+0x13f/0x300 [kvm]

..

> _Recreate steps:_
>
> 1. Build and Install next-20250121 kernel with attached kernel_config.
>
> 2. Build and run selftests/kvm component from linux next-20250121 tree
>
> Issue currently seem to be hit intermittently, I am trying to find more
> reliable recreations steps, meantime wanted to post the issue here for
> awareness/getting any pointers.

I would be surprised if this has anything to do with KVM, KVM is simply
doing a vcalloc(), and it's not even a particularly large allocation for
this test.

What code does "__kmalloc_node_noprof+0xff/0x490" correspond to?  Without
that info, you're unlikely to get any help/ideas.
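
As a rough illustration of how to read the two values in question, here is
a minimal Python sketch. It splits a "symbol+0xoff/0xsize" token from the
splat into its parts, and checks whether an address is canonical under the
assumption of 48-bit (4-level paging) or 57-bit (5-level paging) virtual
addresses. The helper names parse_symbol_offset() and is_canonical() are
hypothetical, made up for this example; they are not part of any kernel
tooling.

```python
import re


def parse_symbol_offset(token):
    """Split 'symbol+0xoff/0xsize' into (symbol, offset, size).

    Hypothetical helper for illustration; offset and size are returned
    as integers.
    """
    m = re.fullmatch(r"([\w.]+)\+0x([0-9a-fA-F]+)/0x([0-9a-fA-F]+)", token)
    if m is None:
        raise ValueError(f"not a symbol+offset/size token: {token!r}")
    return m.group(1), int(m.group(2), 16), int(m.group(3), 16)


def is_canonical(addr, va_bits=48):
    """Return True if bits 63..(va_bits - 1) of addr are all equal.

    va_bits=48 assumes 4-level paging, va_bits=57 assumes 5-level paging.
    """
    top = addr >> (va_bits - 1)
    all_ones = (1 << (64 - va_bits + 1)) - 1
    return top == 0 or top == all_ones


if __name__ == "__main__":
    sym, off, size = parse_symbol_offset("__kmalloc_node_noprof+0xff/0x490")
    print(f"{sym}: offset {off:#x} of {size:#x} bytes")

    # RAX/RDI and RSP values taken from the register dump in the report.
    for addr in (0xE659260B3C31E5E0, 0xFFFFA77176403AB0):
        print(f"{addr:#018x} canonical: {is_canonical(addr)}")
```

The offset alone doesn't say which source line faulted. Assuming a vmlinux
built with debug info from the same tree and config, something like
"scripts/faddr2line vmlinux __kmalloc_node_noprof+0xff/0x490" from the
kernel source tree should be able to map it to a file and line.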