On Thu, Aug 15, 2024 at 1:48 PM Christian Heusel <christian@xxxxxxxxx> wrote:
>
> Hello Zack,
>
> the user rdkehn (in CC) on the Arch Linux Forums reports that after
> updating to the 6.10.4 stable kernel inside of their VM Workstation the
> driver crashes with the error attached below. This error is also present
> on the latest mainline release 6.11-rc3.
>
> We have bisected the issue together down to the following commit:
>
> d6667f0ddf46 ("drm/vmwgfx: Fix handling of dumb buffers")
>
> Reverting this commit on top of 6.11-rc3 fixes the issue.
>
> While we were still debugging the issue Brad (also CC'ed) messaged me
> that they were seeing similar failures in their ESXi based test
> pipelines except for one box that was running on legacy BIOS (so maybe
> that is relevant). They noticed this because they had set panic_on_warn.
>
> Cheers,
> Chris
>
> ---
>
> #regzbot introduced: d6667f0ddf46
> #regzbot title: drm/vmwgfx: driver crashes due to command buffer error
> #regzbot link: https://bbs.archlinux.org/viewtopic.php?id=298491
>
> ---
>
> dmesg snippet:
> [ 13.297084] ------------[ cut here ]------------
> [ 13.297086] Command buffer error.
> [ 13.297139] WARNING: CPU: 0 PID: 186 at drivers/gpu/drm/vmwgfx/vmwgfx_cmdbuf.c:399 vmw_cmdbuf_ctx_process+0x268/0x270 [vmwgfx]
> [ 13.297160] Modules linked in: uas usb_storage hid_generic usbhid mptspi sr_mod cdrom scsi_transport_spi vmwgfx serio_raw mptscsih ata_generic atkbd drm_ttm_helper libps2 pata_acpi vivaldi_fmap ttm mptbase crc32c_intel xhci_pci intel_agp xhci_pci_renesas ata_piix intel_gtt i8042 serio
> [ 13.297172] CPU: 0 PID: 186 Comm: irq/16-vmwgfx Not tainted 6.10.4-arch2-1 #1 517ed45cc9c4492ee5d5bfc2d2fe6ef1f2e7a8eb
> [ 13.297174] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
> [ 13.297175] RIP: 0010:vmw_cmdbuf_ctx_process+0x268/0x270 [vmwgfx]
> [ 13.297186] Code: 01 00 01 e8 ba 8c 4f f9 0f 0b 4c 89 ff e8 40 fb ff ff e9 9d fe ff ff 48 c7 c7 99 d9 3f c0 c6 05 52 2f 01 00 01 e8 98 8c 4f f9 <0f> 0b e9 1f fe ff ff 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90
> [ 13.297187] RSP: 0018:ffffb9c1805e3d78 EFLAGS: 00010282
> [ 13.297188] RAX: 0000000000000000 RBX: 0000000000000003 RCX: 0000000000000003
> [ 13.297189] RDX: 0000000000000000 RSI: 0000000000000003 RDI: 0000000000000001
> [ 13.297190] RBP: ffff907fc8274c98 R08: 0000000000000000 R09: ffffb9c1805e3bf8
> [ 13.297191] R10: ffff9086dbdfffa8 R11: 0000000000000003 R12: ffff907fc4db5b00
> [ 13.297192] R13: ffff907fc83fd318 R14: ffff907fc8274c88 R15: ffff907fc83fd300
> [ 13.297193] FS: 0000000000000000(0000) GS:ffff9086dbe00000(0000) knlGS:0000000000000000
> [ 13.297194] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 13.297194] CR2: 0000774dc57671ca CR3: 00000006b9e20005 CR4: 00000000003706f0
> [ 13.297196] Call Trace:
> [ 13.297198] <TASK>
> [ 13.297199] ? vmw_cmdbuf_ctx_process+0x268/0x270 [vmwgfx a4fe13044bca4eda782d964fb8c4ca15afb325e9]
> [ 13.297209] ? __warn.cold+0x8e/0xe8
> [ 13.297211] ? vmw_cmdbuf_ctx_process+0x268/0x270 [vmwgfx a4fe13044bca4eda782d964fb8c4ca15afb325e9]
> [ 13.297221] ? report_bug+0xff/0x140
> [ 13.297222] ? console_unlock+0x84/0x130
> [ 13.297225] ? handle_bug+0x3c/0x80
> [ 13.297226] ? exc_invalid_op+0x17/0x70
> [ 13.297227] ? asm_exc_invalid_op+0x1a/0x20
> [ 13.297230] ? vmw_cmdbuf_ctx_process+0x268/0x270 [vmwgfx a4fe13044bca4eda782d964fb8c4ca15afb325e9]
> [ 13.297238] ? vmw_cmdbuf_ctx_process+0x268/0x270 [vmwgfx a4fe13044bca4eda782d964fb8c4ca15afb325e9]
> [ 13.297245] vmw_cmdbuf_man_process+0x5d/0x100 [vmwgfx a4fe13044bca4eda782d964fb8c4ca15afb325e9]
> [ 13.297253] vmw_cmdbuf_irqthread+0x25/0x30 [vmwgfx a4fe13044bca4eda782d964fb8c4ca15afb325e9]
> [ 13.297261] vmw_thread_fn+0x3a/0x70 [vmwgfx a4fe13044bca4eda782d964fb8c4ca15afb325e9]
> [ 13.297271] irq_thread_fn+0x20/0x60
> [ 13.297273] irq_thread+0x18a/0x270
> [ 13.297274] ? __pfx_irq_thread_fn+0x10/0x10
> [ 13.297276] ? __pfx_irq_thread_dtor+0x10/0x10
> [ 13.297277] ? __pfx_irq_thread+0x10/0x10
> [ 13.297278] kthread+0xcf/0x100
> [ 13.297281] ? __pfx_kthread+0x10/0x10
> [ 13.297282] ret_from_fork+0x31/0x50
> [ 13.297285] ? __pfx_kthread+0x10/0x10
> [ 13.297286] ret_from_fork_asm+0x1a/0x30
> [ 13.297288] </TASK>
> [ 13.297289] ---[ end trace 0000000000000000 ]---

Hi, Christian.

Thanks for the report! Just to be clear: vmwgfx itself doesn't crash. It
emits a warning, and it's the kernel, running with panic_on_warn set, that
actually panics, right? I haven't seen this on any of our systems, so I'm
guessing the affected systems aren't running GNOME/KDE?

Is there any chance I could see the full "journalctl -b" log and the
vmware.log file associated with those warnings? They could give me some
clues on how to reproduce this.

z