Hi folks on the list,

We hit the following soft lockup while testing an SGX VM with the 5.15.125 kernel on an Ice Lake server:

[ 969.410230] CPU: 64 PID: 35093 Comm: qemu-7.1 Kdump: loaded Tainted: G W O L 5.15.125-pserver #5.15.125-1+feature+linux+5.15.y+20230809.0556+bda73c35~deb11
[ 969.410233] Hardware name: Dell Inc. PowerEdge R650/0PYXKY, BIOS 1.9.2 11/17/2022
[ 969.410234] RIP: 0010:sgx_vepc_free_page+0x2e/0x80
[ 969.410242] Code: 00 41 54 8b 07 48 89 f9 48 c1 e0 05 48 2b 88 b0 a1 45 a0 48 c1 e9 05 48 c1 e1 0c 48 03 88 a8 a1 45 a0 b8 03 00 00 00 0f 01 cf <41> 89 c4 85 c0 74 0f 83 f8 0d 75 19 44 89 e0 41 5c c3 cc cc cc cc
[ 969.410244] RSP: 0018:ff6646e08ce77e20 EFLAGS: 00000202
[ 969.410247] RAX: 0000000000000000 RBX: 000000000008001f RCX: ff6646f9f0ee6000
[ 969.410248] RDX: 000000000000003d RSI: ff338d6614c00b68 RDI: ff6646f0d930fcc0
[ 969.410249] RBP: ff338d6563367700 R08: ff6646f0d930fcc0 R09: ffffffffffffffff
[ 969.410250] R10: 0000000002000000 R11: ff338d65c29f9f10 R12: ff6646e08ce77e38
[ 969.410252] R13: ff338d65891848e0 R14: ff338d64884a9c80 R15: 0000000000000000
[ 969.410253] FS: 0000000000000000(0000) GS:ff338da3ff600000(0000) knlGS:0000000000000000
[ 969.410255] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 969.410256] CR2: 00007f23a0002806 CR3: 00000024ce60a006 CR4: 0000000000773ee0
[ 969.410257] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 969.410258] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 969.410260] PKRU: 55555554
[ 969.410260] Call Trace:
[ 969.410262] <IRQ>
[ 969.410264] ? watchdog_timer_fn+0x1b4/0x210
[ 969.410269] ? lockup_detector_update_enable+0x50/0x50
[ 969.410272] ? __hrtimer_run_queues+0x127/0x280
[ 969.410276] ? hrtimer_interrupt+0xfc/0x210
[ 969.410279] ? __sysvec_apic_timer_interrupt+0x59/0xd0
[ 969.410281] ? sysvec_apic_timer_interrupt+0x6d/0x90
[ 969.410285] </IRQ>
[ 969.410286] <TASK>
[ 969.410287] ? asm_sysvec_apic_timer_interrupt+0x16/0x20
[ 969.410291] ? sgx_vepc_free_page+0x2e/0x80
[ 969.410293] sgx_vepc_release+0x65/0x220
[ 969.410296] __fput+0x89/0x250
[ 969.410301] task_work_run+0x59/0x90
[ 969.410306] do_exit+0x337/0x9a0
[ 969.410309] ? do_user_addr_fault+0x1cd/0x660
[ 969.410312] __x64_sys_exit+0x17/0x20
[ 969.410314] do_syscall_64+0x38/0x90
[ 969.410317] entry_SYSCALL_64_after_hwframe+0x67/0xd1
[ 969.410319] RIP: 0033:0x7f23afe93f56
[ 969.410322] Code: Unable to access opcode bytes at RIP 0x7f23afe93f2c.
[ 969.410323] RSP: 002b:00007f23a5dd6800 EFLAGS: 00000246 ORIG_RAX: 000000000000003c
[ 969.410325] RAX: ffffffffffffffda RBX: 00007f23a5dd7700 RCX: 00007f23afe93f56
[ 969.410326] RDX: 000000000000003c RSI: 00000000007fb000 RDI: 0000000000000000
[ 969.410327] RBP: 00007f23a55d7000 R08: 00007f23a5dd66b0 R09: 00007f23a5dd6630
[ 969.410328] R10: 0000000000000008 R11: 0000000000000246 R12: 00007ffe693cbeee
[ 969.410329] R13: 00007ffe693cbeef R14: 00007f23a5dd68c0 R15: 0000000000802000
[ 969.410331] </TASK>

Is this a known issue? I guess something like this is needed?

diff --git a/arch/x86/kernel/cpu/sgx/virt.c b/arch/x86/kernel/cpu/sgx/virt.c
index 6a77a14eee38..9f863250f08d 100644
--- a/arch/x86/kernel/cpu/sgx/virt.c
+++ b/arch/x86/kernel/cpu/sgx/virt.c
@@ -204,6 +204,7 @@ static int sgx_vepc_release(struct inode *inode, struct file *file)
 			continue;
 
 		xa_erase(&vepc->page_array, index);
+		cond_resched();
 	}
 
 	/*
@@ -222,6 +223,7 @@ static int sgx_vepc_release(struct inode *inode, struct file *file)
 		list_add_tail(&epc_page->list, &secs_pages);
 
 		xa_erase(&vepc->page_array, index);
+		cond_resched();
 	}
 
 	/*

QEMU commandline:

qemu-7.1 -name Serverwittchen46bd476e-a1f1-475e-8e65-4ef2c44f63f7 \
  -m 4096,slots=252,maxmem=256G \
  -M pc-i440fx-7.0 \
  -enable-kvm \
  -no-user-config \
  -nodefaults \
  -rtc base=utc \
  -device pvpanic \
  -object rng-random,filename=/dev/urandom,id=rng0 \
  -device virtio-rng-pci,rng=rng0,max-bytes=1024,period=1000,bus=pci.0,addr=0x3 \
  -netdev tap,ifname=n020166658e84,id=hostnet6,vhost=on,vhostforce=on,vnet_hdr=on,script=no,downscript=no \
  -device virtio-net-pci,netdev=hostnet6,id=net6,mac=02:01:66:65:8e:84,bus=pci.0,addr=0x6 \
  -chardev pty,id=charserial0 \
  -device isa-serial,chardev=charserial0,id=serial0 \
  -usb \
  -device usb-tablet,id=input0 \
  -vnc 0.0.0.0:122 \
  -vga qxl \
  -cpu Icelake-Server-v6,+vmx,+sgx,+sgx1,+sgx2,+sgx-kss,+sgxlc,+sgx-exinfo,+sgx-debug,+sgx-mode64,+sgx-provisionkey,+sgx-tokenkey,enforce,+hv-relaxed,+hv-vapic,+hv-time,+hv-runtime,hv-spinlocks=0x1fff,+hv-vpindex,+hv-synic,+hv-stimer,+hv-tlbflush,hv-no-nonarch-coresharing=on,hv-ipi \
  -smp 4 \
  -object iothread,id=iothread5 \
  -drive file=/dev/md0,if=none,id=drive-virtio-disk5,format=raw,snapshot=off,node-name=node-virtio-disk5,aio=native,discard=off,cache=none \
  -device virtio-blk-pci,serial=eaaddb33707b7b1d,bus=pci.0,addr=0x5,drive=drive-virtio-disk5,iothread=iothread5,num-queues=2,discard=off,id=virtio-disk5,bootindex=1 \
  -S \
  -msg timestamp=on \
  -object memory-backend-epc,id=mem1,size=65536M,prealloc=on \
  -M sgx-epc.0.memdev=mem1,sgx-epc.0.node=0 \
  -qmp unix:/opt/profitbricks/vcb/pbkvm/mon/46bd476e-a1f1-475e-8e65-4ef2c44f63f7.sock,server,nowait \
  -qmp unix:/opt/profitbricks/vcb/pbkvm/metrics/46bd476e-a1f1-475e-8e65-4ef2c44f63f7.sock,server,nowait \
  -uuid 46bd476e-a1f1-475e-8e65-4ef2c44f63f7

Regards!
Jinpu Wang