On Thu, Feb 9, 2023 at 11:25 AM Mohammed Gamal <mgamal@xxxxxxxxxx> wrote: > > > On Wed, Feb 8, 2023 at 9:03 PM Dexuan Cui <decui@xxxxxxxxxxxxx> wrote: > > > > > From: Mohammed Gamal <mgamal@xxxxxxxxxx> > > > Sent: Wednesday, February 8, 2023 3:34 AM > > > > > > relid2channel() assumes vmbus channel array to be allocated when called. > > > However, if the guest receives a vmbus interrupt during driver initialization > > > before vmbus_connect() is called or if vmbus_connect() fails, the vmbus > > > interrupt service routine is called which in turn calls relid2channel() > > > and can cause a null pointer dereference. > > > > Before vmbus_connect() is called or if vmbus_connect() fails, there should > > be no VMBus channel related interrupts at all, so relid2channel() can't be > > called. > > > > Can you please share the log or at least the crash call-stack? > > I'm curious how the crash can happen. > > > > Hi Dexuan, > We saw this when triggering a crash with kdump enabled with > echo 'c' > /proc/sysrq-trigger > > When the new kernel boots, we see this stack trace: > > [ 21.790653] BUG: kernel NULL pointer dereference, address: 0000000000000070 > [ 21.816550] #PF: supervisor read access in kernel mode > [ 21.835697] #PF: error_code(0x0000) - not-present page > [ 21.855499] PGD 0 P4D 0 > [ 21.865471] Oops: 0000 [#1] PREEMPT SMP NOPTI > [ 21.881150] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.14.0-247.el9.x86_64 # 1 > [ 21.906679] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090007 05/18/2018 > [ 21.939413] RIP: 0010:relid2channel+0x1a/0x30 [hv_vmbus] > [ 21.958240] Code: 00 00 00 e8 a8 01 db c0 e9 78 ff ff ff 0f 1f 00 0f 1f 44 00 00 81 ff ff 07 00 00 77 15 48 8b 05 ac 31 01 00 89 ff 48 8d 04 f8 <48> 8b 00 e9 de ef b2 c1 0f 0b 31 c0 e9 d5 ef b2 c1 0f 1f 44 00 00 > [ 22.022266] RSP: 0018:ffffc90000003f90 EFLAGS: 00010097 > [ 22.040588] RAX: 0000000000000070 RBX: ffff88807a4ef200 RCX: 0000000000000000 > [ 22.065670] RDX: ffffffff82a1a940 RSI: 0000000000000000 RDI: 000000000000000e > [ 22.090778] RBP: 000000000000000e R08: 0000000000000000 R09: 0000000000000000 > [ 22.115947] R10: 000000000000000e R11: ffffc90000003ff8 R12: 0000000000000001 > [ 22.140901] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 > [ 22.165344] FS: 0000000000000000(0000) GS:ffff88807e600000(0000) knlGS:00000 00000000000 > [ 22.192958] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [ 22.213376] CR2: 0000000000000070 CR3: 000000007ae40000 CR4: 00000000003506b0 > [ 22.238792] Call Trace: > [ 22.248268] <IRQ> > [ 22.256236] vmbus_chan_sched.isra.0+0x67/0x190 [hv_vmbus] > [ 22.275822] vmbus_isr+0x21/0xd0 [hv_vmbus] > [ 22.290906] __sysvec_hyperv_callback+0x2e/0x60 > [ 22.307021] sysvec_hyperv_callback+0x6d/0x90 > [ 22.322530] </IRQ> > [ 22.330559] <TASK> > [ 22.338573] asm_sysvec_hyperv_callback+0x16/0x20 > [ 22.355090] RIP: 0010:default_idle+0x10/0x20 > [ 22.370572] Code: 00 0f ae f0 0f ae 38 0f ae f0 eb b5 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 0f 1f 44 00 00 eb 07 0f 00 2d 7e 6e 4d 00 fb f4 <e9> 4b dc 2c 00 cc cc cc cc cc cc cc cc cc cc cc 0f 1f 44 00 00 65 > [ 22.435289] RSP: 0018:ffffffff82a03ea8 EFLAGS: 00000202 > [ 22.453695] RAX: ffffffff81b33ea0 RBX: ffffffff82a1a940 RCX: 0000000000000000 > [ 22.478759] RDX: 00000000000000cd RSI: 0000000000000087 RDI: 00000000000000ce > [ 22.503863] RBP: 0000000000000000 R08: 0138a8c77d17acc3 R09: 0000000000000001 > [ 22.528774] R10: 0000000000000400 R11: 00000000005d0eea R12: 0000000000000000 > [ 22.553857] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 > [ 22.578981] ? mwait_idle+0x80/0x80 > [ 22.591660] ? mwait_idle+0x80/0x80 > [ 22.604332] default_idle_call+0x33/0xe0 > [ 22.618343] cpuidle_idle_call+0x15d/0x1c0 > [ 22.632969] ? read_hv_sched_clock_tsc+0x5/0x20 > [ 22.649218] do_idle+0x7b/0xe0 > [ 22.660741] cpu_startup_entry+0x19/0x20 > [ 22.674964] rest_init+0xca/0xd0 > [ 22.686907] arch_call_rest_init+0xa/0x14 > [ 22.701272] start_kernel+0x49e/0x4c0 > [ 22.714418] secondary_startup_64_no_verify+0xe5/0xeb > [ 22.732484] </TASK> > [ 22.740973] Modules linked in: hv_vmbus(+) serio_raw dm_mirror dm_region_hash dm_log dm_mod fuse overlay squashfs loop > [ 22.779694] CR2: 0000000000000070 > [ 22.792006] ---[ end trace 56dd24038e89124f ]--- > [ 22.808686] RIP: 0010:relid2channel+0x1a/0x30 [hv_vmbus] > [ 22.827388] Code: 00 00 00 e8 a8 01 db c0 e9 78 ff ff ff 0f 1f 00 0f 1f 44 00 00 81 ff ff 07 00 00 77 15 48 8b 05 ac 31 01 00 89 ff 48 8d 04 f8 <48> 8b 00 e9 de ef b2 c1 0f 0b 31 c0 e9 d5 ef b2 c1 0f 1f 44 00 00 > [ 22.891669] RSP: 0018:ffffc90000003f90 EFLAGS: 00010097 > [ 22.910168] RAX: 0000000000000070 RBX: ffff88807a4ef200 RCX: 0000000000000000 > [ 22.935445] RDX: ffffffff82a1a940 RSI: 0000000000000000 RDI: 000000000000000e > [ 22.958556] RBP: 000000000000000e R08: 0000000000000000 R09: 0000000000000000 > [ 22.979539] R10: 000000000000000e R11: ffffc90000003ff8 R12: 0000000000000001 > [ 23.004127] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 > [ 23.027088] FS: 0000000000000000(0000) GS:ffff88807e600000(0000) knlGS:00000 00000000000 > [ 23.053341] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [ 23.074013] CR2: 0000000000000070 CR3: 000000007ae40000 CR4: 00000000003506b0 > [ 23.099322] Kernel panic - not syncing: Fatal exception in interrupt > [ 23.121729] Kernel Offset: disabled Ugh, my mail client messed up the reply and it was marked as spam. Resending to the lists in plain text. See the reply above.