From: Saurabh Singh Sengar <ssengar@xxxxxxxxxxxxx> Sent: Friday, February 7, 2025 6:06 AM
>
> Thanks, Michael, for the analysis.
>
> I have tried the kdump steps on Oracle 9.4 with the 6.13.0 kernel as well.
> Although I couldn't see the soft lockup issue, I do see some other VMBus
> failures. But I agree the bootup is extremely slow, which should be due to
> the same reason.

Yes, I would also think it is the same underlying reason.

>
> My system has a newer UEFI version, and I'm wondering if the latest UEFI
> version (UEFI Release v4.1 08/23/2024) is causing this difference in
> behaviour.

I've seen both the original behavior that Thomas Tai reported, as well as the
extremely slow behavior. In my experiments, it seems to depend on the Azure VM
size being used, though I didn't fully investigate. Originally I was using a
DS5_v2 VM (which is what Thomas was using) and saw the same "soft lockup" as
Thomas. Then I moved to a D8ds_v5 VM, which is somewhat cheaper, and was
seeing the very slow behavior.

See my separate email from this morning with a full explanation of the root
cause.

Michael

>
> Relevant part of the logs:
> ---------------------------------------------------------
> echo 1 > /proc/sys/kernel/sysrq
> echo c > /proc/sysrq-trigger
> [ 982.948352] sysrq: Trigger a crash
> [ 982.949553] Kernel panic - not syncing: sysrq triggered crash
> [ 982.951515] CPU: 31 UID: 0 PID: 6938 Comm: bash Kdump: loaded Not tainted 6.13.0 #1
> [ 982.954115] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v4.1 08/23/2024
> [ 982.957641] Call Trace:
> [ 982.958508] <TASK>
> [ 982.959251] panic+0x37e/0x3b0
> [ 982.960373] ? _printk+0x64/0x90
> [ 982.961452] sysrq_handle_crash+0x1a/0x20
> [ 982.962840] __handle_sysrq+0x9b/0x190
> [ 982.964145] write_sysrq_trigger+0x5f/0x80
> [ 982.965578] proc_reg_write+0x59/0xb0
> [ 982.966905] vfs_write+0x111/0x470
> [ 982.968004] ? __count_memcg_events+0xbf/0x150
> [ 982.969432] ? count_memcg_events.constprop.0+0x26/0x50
> [ 982.971190] ksys_write+0x6e/0xf0
> [ 982.972307] do_syscall_64+0x62/0x180
> [ 982.973438] entry_SYSCALL_64_after_hwframe+0x76/0x7e
> [ 982.975102] RIP: 0033:0x7f3d570fdbd7
> [ 982.976421] Code: 0f 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
> [ 982.982893] RSP: 002b:00007fff6d613c48 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
> [ 982.985424] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007f3d570fdbd7
> [ 982.987613] RDX: 0000000000000002 RSI: 000056362a928470 RDI: 0000000000000001
> [ 982.989774] RBP: 000056362a928470 R08: 0000000000000000 R09: 00007f3d571b0d40
> [ 982.992109] R10: 00007f3d571b0c40 R11: 0000000000000246 R12: 0000000000000002
> [ 982.994321] R13: 00007f3d571fa780 R14: 0000000000000002 R15: 00007f3d571f59e0
> [ 982.996461] </TASK>
> [ 982.998317] Kernel Offset: 0x10c00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
> [ 0.000000] Linux version 6.13.0 (lisatest@lisa--505-e0-n0) (gcc (GCC) 11.5.0 20240719 (Red Hat 11.5.0-2.0.1), GNU ld version 2.35.2-54.0.1.el9) #1 SMP PREEMPT_DYNAMIC Thu Feb 6 10:05:27 UTC 2025
> [ 0.000000] Command line: elfcorehdr=0xd000000 BOOT_IMAGE=(hd0,gpt1)/vmlinuz-6.13.0 ro console=tty0 console=ttyS0,115200n8 rd.lvm.vg=rootvg irqpoll nr_cpus=1 reset_devices cgroup_disable=memory mce=off numa=off udev.children-max=2 panic=10 acpi_no_memhotplug transparent_hugepage=never nokaslr hest_disable novmcoredd cma=0 hugetlb_cma=0 iommu=off disable_cpu_apicid=0
> [ 0.000000] BIOS-provided physical RAM map:
> [ 0.000000] BIOS-e820: [mem 0x0000000000000000-0x0000000000000fff] reserved
> [ 0.000000] BIOS-e820: [mem 0x0000000000001000-0x000000000009ffff] usable
> [ 0.000000] BIOS-e820: [mem 0x00000000000c0000-0x00000000000fffff] reserved
> [ 0.000000] BIOS-e820: [mem 0x000000000d0e00b0-0x000000002cffffff] usable
> [ 0.000000] BIOS-e820: [mem 0x000000003eead000-0x000000003eeb3fff] reserved
> [ 0.000000] BIOS-e820: [mem 0x000000003ff41000-0x000000003ffc8fff] reserved
> [ 0.000000] BIOS-e820: [mem 0x000000003ffc9000-0x000000003fffafff] ACPI data
> [ 0.000000] BIOS-e820: [mem 0x000000003fffb000-0x000000003fffefff] ACPI NVS
> [ 0.000000] random: crng init done
>
> <snip>
>
> [ 0.928063] Console: switching to colour frame buffer device 128x48
> [ 13.391297] fb0: EFI VGA frame buffer device
>
> <snip>
>
> [ 590.199511] hv_netvsc 7c1e527c-2980-7c1e-527c-29807c1e527c (unnamed net_device) (uninitialized): VF slot 1 added
> [ 595.120270] Console: switching to colour dummy device 80x25
> [ 605.203700] hyperv_fb: Time out on waiting vram location ack
> [ 605.206161] iounmap: bad address 0000000005f4dac5
> [ 605.207740] CPU: 0 UID: 0 PID: 30 Comm: kworker/u4:2 Not tainted 6.13.0 #1
> [ 605.209984] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v4.1 08/23/2024
> [ 605.213869] Workqueue: async async_run_entry_fn
> [ 605.215601] Call Trace:
> [ 605.216382] <TASK>
> [ 605.217123] dump_stack_lvl+0x66/0x90
> [ 605.218184] hvfb_putmem+0x32/0x110 [hyperv_fb]
> [ 605.219646] hvfb_probe+0x27f/0x360 [hyperv_fb]
> [ 605.221120] vmbus_probe+0x3d/0xa0 [hv_vmbus]
> [ 605.222623] really_probe+0xd9/0x390
> [ 605.223779] __driver_probe_device+0x78/0x160
> [ 605.225213] driver_probe_device+0x1e/0xa0
> [ 605.226591] __driver_attach_async_helper+0x5e/0xe0
> [ 605.228166] async_run_entry_fn+0x34/0x130
> [ 605.229681] process_one_work+0x187/0x3b0
> [ 605.231075] worker_thread+0x24e/0x360
> [ 605.232376] ? __pfx_worker_thread+0x10/0x10
> [ 605.233758] kthread+0xd3/0x100
> [ 605.234805] ? __pfx_kthread+0x10/0x10
> [ 605.236053] ret_from_fork+0x34/0x50
> [ 605.237251] ? __pfx_kthread+0x10/0x10
> [ 605.238519] ret_from_fork_asm+0x1a/0x30
> [ 605.239833] </TASK>
> [ 605.240855] hv_vmbus: probe failed for device 5620e0c7-8062-4dce-aeb7-520c7ef76171 (-110)
> [ 605.243404] hyperv_fb 5620e0c7-8062-4dce-aeb7-520c7ef76171: probe with driver hyperv_fb failed with error -110
> [ 605.254672] hv_vmbus: registering driver hv_pci
>
> - Saurabh
>
> > -----Original Message-----
> > From: Michael Kelley <mhklinux@xxxxxxxxxxx>
> > Sent: 07 February 2025 02:30
> > To: Michael Kelley <mhklinux@xxxxxxxxxxx>; Thomas Tai
> > <thomas.tai@xxxxxxxxxx>; mhkelley58@xxxxxxxxx; Haiyang Zhang
> > <haiyangz@xxxxxxxxxxxxx>; wei.liu@xxxxxxxxxx; Dexuan Cui
> > <decui@xxxxxxxxxxxxx>; drawat.floss@xxxxxxxxx; javierm@xxxxxxxxxx;
> > Helge Deller <deller@xxxxxx>; daniel@xxxxxxxx; airlied@xxxxxxxxx;
> > tzimmermann@xxxxxxx
> > Cc: dri-devel@xxxxxxxxxxxxxxxxxxxxx; linux-fbdev@xxxxxxxxxxxxxxx;
> > linux-kernel@xxxxxxxxxxxxxxx; linux-hyperv@xxxxxxxxxxxxxxx
> > Subject: [EXTERNAL] RE: hyper_bf soft lockup on Azure Gen2 VM when
> > taking kdump or executing kexec
> >
> > From: Michael Kelley <mhklinux@xxxxxxxxxxx>
> >
> > > From: Thomas Tai <thomas.tai@xxxxxxxxxx> Sent: Thursday, January 30,
> > > 2025 12:44 PM
> > >
> > > > > -----Original Message-----
> > > > > From: Michael Kelley <mhklinux@xxxxxxxxxxx> Sent: Thursday,
> > > > > January 30, 2025 3:20 PM
> > > > >
> > > > > From: Thomas Tai <thomas.tai@xxxxxxxxxx> Sent: Thursday, January
> > > > > 30, 2025 10:50 AM
> > > > > >
> > > > > > Sorry for the typo in the subject title. It should have been
> > > > > > 'hyperv_fb soft lockup on Azure Gen2 VM when taking kdump or
> > > > > > executing kexec'
> > > > > >
> > > > > > Thomas
> > > > > >
> > > > > > >
> > > > > > > Hi Michael,
> > > > > > >
> > > > > > > We see an issue with the mainline kernel on the Azure Gen 2 VM
> > > > > > > when trying to induce a kernel panic with sysrq commands. The
> > > > > > > VM would hang with soft lockup.
> > > > > > > A similar issue happens when executing kexec on the VM.
> > > > > > > This issue is seen only with Gen2 VMs (with UEFI boot). Gen1
> > > > > > > VMs with BIOS boot are fine.
> > > > > > >
> > > > > > > git bisect identifies that the issue is caused by commit
> > > > > > > 20ee2ae8c5899 ("fbdev/hyperv_fb: Fix logic error for Gen2 VMs
> > > > > > > in hvfb_getmem()"). However, reverting the commit would cause
> > > > > > > the frame buffer not to work on the Gen2 VM.
> > > > > > >
> > > > > > > Do you have any hints on what caused this issue?
> > > > > > >
> > > > > > > To reproduce the issue with kdump:
> > > > > > > - Install mainline kernel on an Azure Gen 2 VM and trigger a kdump
> > > > > > > - echo 1 > /proc/sys/kernel/sysrq
> > > > > > > - echo c > /proc/sysrq-trigger
> > > > > > >
> > > > > > > To reproduce the issue with executing kexec:
> > > > > > > - Install mainline kernel on an Azure Gen 2 VM and use kexec
> > > > > > > - sudo kexec -l /boot/vmlinuz --initrd=/boot/initramfs.img --command-line="$( cat /proc/cmdline )"
> > > > > > > - sudo kexec -e
> > > > > > >
> > > > > > > Thank you,
> > > > > > > Thomas
> > > > >
> > > > > I will take a look, but it might be early next week before I can do so.
> > > > >
> > > >
> > > > Thank you, Michael, for your help!
> > > >
> > > > > It looks like your soft lockup log below is from the kdump kernel
> > > > > (or the newly kexec'ed kernel). Can you confirm? Also, this looks
> > > > > like a subset of the full log.
> > > >
> > > > Yes, the soft lockup log below is from the kdump kernel.
> > > >
> > > > > Do you have the full serial console log that you could email to
> > > > > me? Seeing everything might be helpful. Of course, I'll try to
> > > > > repro the problem myself as well.
> > > >
> > > > I have attached the complete bootup and kdump kernel log.
> > > >
> > > > File: bootup_and_kdump.log
> > > > Line 1 ... 984 (bootup log)
> > > > Line 990 (kdump kernel booting up)
> > > > Line 1351 (soft lockup)
> > > >
> > > > Thank you,
> > > > Thomas
> > > >
> > >
> > > I have reproduced the problem in an Azure VM running Oracle Linux 9.4
> > > with the 6.13.0 kernel. Interestingly, the problem does not occur in
> > > a VM running on a locally installed Hyper-V with Ubuntu 20.04 and the
> > > 6.13.0 kernel. There are several differences between the two
> > > environments: the version of Hyper-V, the VM configuration, the Linux
> > > distro, and the .config file used to build the 6.13.0 kernel. I'll
> > > try to figure out what makes the difference, and then the root cause.
> > >
> >
> > This has been a real bear to investigate. :-( The key observation is
> > that with older kernel versions, the efifb driver does *not* try to
> > load when running in the kdump kernel, and everything works. In newer
> > kernels, the efifb driver *does* try to load, and it appears to hang.
> > (Actually, it is causing the VM to run very slowly. More on that in a
> > minute.)
> >
> > I've bisected the kernel again, compensating for the fact that commit
> > 20ee2ae8c5899 is needed to make the Hyper-V frame buffer work. With
> > that compensation, the actual problematic commit is 2bebc3cd4870
> > (Revert "firmware/sysfb: Clear screen_info state after consuming it").
> > Doing the revert causes screen_info.orig_video_isVGA to retain its
> > value of 0x70 (VIDEO_TYPE_EFI), which the kdump kernel picks up,
> > causing it to load the efifb driver.
> >
> > Then the question is why the efifb driver doesn't work in the kdump
> > kernel. Actually, it *does* work in many cases. I built the 6.13.0
> > kernel on the Oracle Linux 9.4 system, and transferred the kernel
> > image binary and module binaries to an Ubuntu 20.04 VM in Azure. In
> > that VM, the efifb driver is loaded as part of the kdump kernel, and
> > it doesn't cause any problems. But there's an interesting difference.
> > In the Oracle Linux 9.4 VM, the efifb driver finds the framebuffer at
> > 0x40000000, while on the Ubuntu 20.04 VM, it finds the framebuffer at
> > 0x40900000. This difference is due to differences in how the
> > screen_info variable gets set up in the two VMs.
> >
> > When the normal kernel starts in a freshly booted VM, Hyper-V provides
> > the EFI framebuffer at 0x40000000, and it works. But after the Hyper-V
> > FB driver or Hyper-V DRM driver has initialized, Linux has picked a
> > different MMIO address range and told Hyper-V to use the new address
> > range (which often starts at 0x40900000). A kexec does *not* reset
> > Hyper-V's transition to the new range, so when the efifb driver tries
> > to use the framebuffer at 0x40000000, the accesses trap to Hyper-V and
> > probably fail or time out (I'm not sure of the details). After the
> > guest does some number of these bad references, Hyper-V considers
> > itself to be under attack from an ill-behaved guest, and throttles the
> > guest so that it doesn't run for a few seconds. The throttling
> > repeats, and results in the kdump kernel running extremely slowly.
> >
> > Somehow in the Ubuntu 20.04 VM, the location of the frame buffer as
> > stored in screen_info.lfb_base gets updated to be 0x40900000. I
> > haven't fully debugged how that happens. But with that update, the
> > efifb driver is using the updated framebuffer address and it works. On
> > the Oracle Linux 9.4 system, that update doesn't appear to happen, and
> > the problem occurs.
> >
> > This is an interim update on the problem. I'm still investigating how
> > screen_info.lfb_base is set in the kdump kernel, and why it is
> > different in the Ubuntu 20.04 VM vs. the Oracle Linux 9.4 VM. Once
> > that is well understood, we can contemplate how to fix the problem.
> > Undoing the revert that is commit 2bebc3cd4870 doesn't seem like the
> > solution, since the original code there was reported to cause many
> > other issues.
> > The solution focus will likely be on how to ensure the kdump kernel
> > gets the correct framebuffer address so the efifb driver works, since
> > the framebuffer address changing is a quirk of Hyper-V behavior.
> >
> > If anyone else has insight into what's going on here, please chime in.
> > What I've learned so far is still somewhat tentative.
> >
> > Michael