Re: oops in xe_bo_evict on 6.12.8

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, 10 Jan 2025, "Emil J Tywoniak" <emil@xxxxxxxxxxx> wrote:
> What's up gamers,
>
> hope this is the right place to report this oops which possibly is due
> to amdgpu interaction. The community guidelines link for this list
> (https://01.org/linuxgraphics/community) doesn't work. Feel free to
> redirect me if not, even to /dev/null. The Video(DRI - Intel) section
> on kernel bugzilla doesn't seem to get much life.

For the longest time, the fdo gitlab has been the place to report i915
[1] (and lately xe [2]) driver bugs, with a bug filing guide at [3].

However, the backtrace is all amdgpu? You only mention xe_bo_evict in
the subject.

Cc: amd-gfx@xxxxxxxxxxxxxxxxxxxxx


BR,
Jani.



[1] https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/
[2] https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/
[3] https://drm.pages.freedesktop.org/intel-docs/how-to-file-i915-bugs.html


>
> I see there have been recent changes to things around bo eviction on
> xe and today I caught the following oops when spawning a second VS
> Code window in sway with the New Window command (Ctrl+Shift+N). VS
> Code was not running on XWayland. So far I haven't been able to
> reproduce this. I have amdgpu loaded as a fall back for my ryzen 7900X
> builtin graphics since I installed the funny GPU (Intel Arc B580 / BMG
> G21). I'm on Mesa 24.3.3.
>
> ------------[ cut here ]------------
> workqueue: WQ_MEM_RECLAIM sdma0:drm_sched_run_job_work [gpu_sched] is flushing !WQ_MEM_RECLAIM events:amdgpu_device_delay_enable_gfx_off [amdgpu]
> WARNING: CPU: 5 PID: 29199 at kernel/workqueue.c:3704 check_flush_dependency+0x10f/0x130
> Modules linked in: rfcomm snd_seq_dummy snd_hrtimer snd_seq cmac algif_hash algif_skcipher af_alg nft_chain_nat xt_MASQUERADE nf_conntrack_netlink xfrm_user xfrm_algo xt_addrtype overlay af_packet bnep btusb btrtl btintel btbcm btmtk bluetooth mousedev cdc_acm joydev nls_iso8859_1 nls_cp437 vfat fat mei_gsc_proxy mei_gsc mei_me mei xt_conntrack ip6t_rpfilter mt7921e ipt_rpfilter mt7921_common mt792x_lib snd_hda_codec_hdmi mt76_connac_lib edac_mce_amd edac_core mt76 snd_hda_intel amd_atl intel_rapl_msr snd_intel_dspcfg xt_pkttype intel_rapl_common snd_intel_sdw_acpi crct10dif_pclmul xt_LOG mac80211 snd_usb_audio uvcvideo nf_log_syslog snd_usbmidi_lib crc32_pclmul snd_hda_codec xt_tcpudp polyval_clmulni videobuf2_vmalloc xe snd_ump polyval_generic uvc snd_hda_core ghash_clmulni_intel cfg80211 nft_compat snd_rawmidi sha512_ssse3 videobuf2_memops spd5118 sha256_ssse3 snd_seq_device videobuf2_v4l2 snd_hwdep r8169 sha1_ssse3 battery sp5100_tco videobuf2_common aesni_intel snd_pcm watchdog realtek gf128mul
>  crypto_simd mdio_devres videodev snd_timer cryptd libphy rfkill snd i2c_piix4 drm_gpuvm wmi_bmof rapl libarc4 led_class mc nf_tables i2c_smbus k10temp soundcore sch_fq_codel tpm_crb rtc_cmos evdev mac_hid tpm_tis gpio_amdpt tiny_power_button tpm_tis_core gpio_generic button uinput hid_xpadneo(O) ff_memless atkbd libps2 serio vivaldi_fmap loop xt_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c br_netfilter veth tun tap macvlan bridge stp llc kvm_amd ccp kvm fuse efi_pstore configfs nfnetlink efivarfs tpm libaescfb ecdh_generic ecc rng_core dmi_sysfs ip_tables x_tables autofs4 ext4 crc32c_generic mbcache jbd2 hid_generic usbhid hid ahci libahci xhci_pci libata nvme xhci_hcd scsi_mod nvme_core crc32c_intel scsi_common nvme_auth dm_mod dax amdgpu video wmi amdxcp i2c_algo_bit drm_ttm_helper ttm drm_exec gpu_sched drm_suballoc_helper drm_buddy drm_display_helper cec crc16
> CPU: 5 UID: 0 PID: 29199 Comm: kworker/u96:0 Tainted: G        W  O       6.12.8 #1-NixOS
> Tainted: [W]=WARN, [O]=OOT_MODULE
> Hardware name: Micro-Star International Co., Ltd. MS-7D75/MAG B650 TOMAHAWK WIFI (MS-7D75), BIOS 1.60 05/30/2023
> Workqueue: sdma0 drm_sched_run_job_work [gpu_sched]
> RIP: 0010:check_flush_dependency+0x10f/0x130
> Code: c0 f3 01 01 90 49 8b 45 18 48 8d b2 c0 00 00 00 48 8d 8b c0 00 00 00 49 89 e8 48 c7 c7 a0 c7 df b4 48 89 c2 e8 82 7e fd ff 90 <0f> 0b 90 90 e9 0a ff ff ff 80 3d 99 c0 f3 01 00 75 8f e9 42 ff ff
> RSP: 0018:ffff95dd9ef97c60 EFLAGS: 00010046
> RAX: 0000000000000000 RBX: ffff9265c01b8e00 RCX: 0000000000000000
> RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
> RBP: ffffffffc0438c00 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000000 R12: ffff92681a13b200
> R13: ffff9265c94338c0 R14: 0000000000000001 R15: ffff9265c01bce00
> FS:  0000000000000000(0000) GS:ffff926cb7e80000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 000000000050d6d0 CR3: 00000002a38d6000 CR4: 0000000000f50ef0
> PKRU: 55555554
> Call Trace:
>  <TASK>
>  ? check_flush_dependency+0x10f/0x130
>  ? __warn.cold+0x93/0xf6
>  ? check_flush_dependency+0x10f/0x130
>  ? report_bug+0x10d/0x150
>  ? srso_alias_return_thunk+0x5/0xfbef5
>  ? handle_bug+0x61/0xb0
>  ? exc_invalid_op+0x17/0x80
>  ? asm_exc_invalid_op+0x1a/0x20
>  ? __pfx_amdgpu_device_delay_enable_gfx_off+0x10/0x10 [amdgpu]
>  ? check_flush_dependency+0x10f/0x130
>  __flush_work+0x10c/0x320
>  cancel_delayed_work_sync+0x62/0x80
>  amdgpu_gfx_off_ctrl+0xb7/0x150 [amdgpu]
>  amdgpu_ring_alloc+0x40/0x70 [amdgpu]
>  amdgpu_ib_schedule+0xf0/0x750 [amdgpu]
>  amdgpu_job_run+0x8e/0x200 [amdgpu]
>  drm_sched_run_job_work+0x283/0x420 [gpu_sched]
>  process_one_work+0x18a/0x350
>  worker_thread+0x235/0x370
>  ? __pfx_worker_thread+0x10/0x10
>  ? __pfx_worker_thread+0x10/0x10
>  kthread+0xcd/0x100
>  ? __pfx_kthread+0x10/0x10
>  ret_from_fork+0x31/0x50
>  ? __pfx_kthread+0x10/0x10
>  ret_from_fork_asm+0x1a/0x30
>  </TASK>
> ---[ end trace 0000000000000000 ]---
>
> I hope this tells you something. I'm willing to switch to some cutting
> edge kernel commit and report back if I get an oops again, so feel
> free which remote and commit I should go get, or any other
> troubleshooting steps I could follow.
>
> Thanks for all your hard work,
>
> Emil J. Tywoniak (widlarizer)

-- 
Jani Nikula, Intel




[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux