Hi,

Over the weekend I caught another problem. I noticed that my computer starts to hang after launching Steam and Google Chrome. In the kernel log I saw this backtrace:

[ 90.002283] general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] PREEMPT SMP KASAN NOPTI
[ 90.002292] KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
[ 90.002296] CPU: 12 PID: 3499 Comm: chrome:cs0 Tainted: G B W L 6.4.0-rc7-07-a2848d08742c8e8494675892c02c0d22acbe3cf8+ #124
[ 90.002299] Hardware name: ASUSTeK COMPUTER INC. ROG Strix G513QY_G513QY/G513QY, BIOS G513QY.331 02/24/2023
[ 90.002301] RIP: 0010:ttm_bo_evict_swapout_allowable+0x322/0x5e0 [ttm]
[ 90.002313] Code: b6 04 02 48 89 ea 83 e2 07 38 d0 7f 08 84 c0 0f 85 e8 01 00 00 4c 89 e2 c6 45 00 00 48 b8 00 00 00 00 00 fc ff df 48 c1 ea 03 <0f> b6 04 02 4c 89 e2 83 e2 07 38 d0 7f 08 84 c0 0f 85 ca 01 00 00
[ 90.002316] RSP: 0018:ffffc9000703ee08 EFLAGS: 00010256
[ 90.002319] RAX: dffffc0000000000 RBX: ffff888180ac1858 RCX: ffffc9000703ee90
[ 90.002321] RDX: 0000000000000000 RSI: ffffc9000703f228 RDI: ffff888180ac1ab4
[ 90.002323] RBP: ffffc9000703ee90 R08: 0000000000000000 R09: ffffc9000703eed0
[ 90.002324] R10: fffff52000e07db3 R11: ffffffffb17dde80 R12: 0000000000000000
[ 90.002326] R13: ffffc9000703f228 R14: ffffc9000703eed0 R15: ffff888180ac1858
[ 90.002328] FS: 00007f77461fe6c0(0000) GS:ffff888f9c800000(0000) knlGS:0000000000000000
[ 90.002330] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 90.002332] CR2: 00007f773549c000 CR3: 000000024213e000 CR4: 0000000000750ee0
[ 90.002334] PKRU: 55555554
[ 90.002335] Call Trace:
[ 90.002337] <TASK>
[ 90.002339] ? die_addr+0x40/0xa0
[ 90.002346] ? exc_general_protection+0x159/0x240
[ 90.002352] ? asm_exc_general_protection+0x26/0x30
[ 90.002357] ? ttm_bo_evict_swapout_allowable+0x322/0x5e0 [ttm]
[ 90.002365] ? ttm_bo_evict_swapout_allowable+0x42e/0x5e0 [ttm]
[ 90.002373] ttm_bo_swapout+0x134/0x7f0 [ttm]
[ 90.002383] ? __pfx_ttm_bo_swapout+0x10/0x10 [ttm]
[ 90.002391] ? lock_acquire+0x44d/0x4f0
[ 90.002398] ? ttm_device_swapout+0xa5/0x260 [ttm]
[ 90.002412] ? lock_acquired+0x355/0xa00
[ 90.002416] ? do_raw_spin_trylock+0xb6/0x190
[ 90.002421] ? __pfx_lock_acquired+0x10/0x10
[ 90.002426] ? ttm_global_swapout+0x25/0x210 [ttm]
[ 90.002442] ttm_device_swapout+0x198/0x260 [ttm]
[ 90.002456] ? __pfx_ttm_device_swapout+0x10/0x10 [ttm]
[ 90.002472] ttm_global_swapout+0x75/0x210 [ttm]
[ 90.002486] ttm_tt_populate+0x187/0x3f0 [ttm]
[ 90.002501] ttm_bo_handle_move_mem+0x437/0x590 [ttm]
[ 90.002517] ttm_bo_validate+0x275/0x430 [ttm]
[ 90.002530] ? __pfx_ttm_bo_validate+0x10/0x10 [ttm]
[ 90.002544] ? kasan_save_stack+0x33/0x60
[ 90.002550] ? kasan_set_track+0x25/0x30
[ 90.002554] ? __kasan_kmalloc+0x8f/0xa0
[ 90.002558] ? amdgpu_gtt_mgr_new+0x81/0x420 [amdgpu]
[ 90.003023] ? ttm_resource_alloc+0xf6/0x220 [ttm]
[ 90.003038] amdgpu_bo_pin_restricted+0x2dd/0x8b0 [amdgpu]
[ 90.003210] ? __x64_sys_ioctl+0x131/0x1a0
[ 90.003210] ? do_syscall_64+0x60/0x90
[ 90.003210] ? __pfx_amdgpu_bo_pin_restricted+0x10/0x10 [amdgpu]
[ 90.003210] ? unmap_mapping_range+0xb6/0x250
[ 90.003210] ? __pfx___might_resched+0x10/0x10
[ 90.003210] ? lock_acquired+0x355/0xa00
[ 90.003210] ? __down_read_trylock+0x1be/0x3a0
[ 90.003210] dma_buf_map_attachment+0x1dd/0x560
[ 90.003210] ? rcu_is_watching+0x15/0xb0
[ 90.003210] amdgpu_bo_move+0x1227/0x1830 [amdgpu]
[ 90.003210] ? lock_release+0x4ec/0xba0
[ 90.003210] ? ttm_bo_add_move_fence.isra.0+0x22/0x290 [ttm]
[ 90.003210] ? rcu_is_watching+0x15/0xb0
[ 90.003210] ? __pfx_amdgpu_bo_move+0x10/0x10 [amdgpu]
[ 90.003210] ? dma_resv_reserve_fences+0xe8/0x7f0
[ 90.003210] ? unmap_mapping_range+0xe3/0x250
[ 90.003210] ? __pfx_dma_resv_reserve_fences+0x10/0x10
[ 90.003210] ? _raw_spin_unlock+0x2d/0x50
[ 90.003210] ? ttm_bo_add_move_fence.isra.0+0x12b/0x290 [ttm]
[ 90.003210] ttm_bo_handle_move_mem+0x244/0x590 [ttm]
[ 90.003210] ttm_bo_validate+0x275/0x430 [ttm]
[ 90.003210] ? __pfx_ttm_bo_validate+0x10/0x10 [ttm]
[ 90.003210] ? lock_acquired+0x355/0xa00
[ 90.003210] amdgpu_cs_bo_validate+0x25a/0xcb0 [amdgpu]
[ 90.003210] ? __kmalloc+0xe1/0x160
[ 90.003210] ? amdgpu_vm_validate_pt_bos+0x372/0x670 [amdgpu]
[ 90.003210] ? __pfx_amdgpu_cs_bo_validate+0x10/0x10 [amdgpu]
[ 90.003210] ? __pfx___mutex_unlock_slowpath+0x10/0x10
[ 90.003210] amdgpu_cs_list_validate+0x26c/0x390 [amdgpu]
[ 90.003210] ? __pfx_amdgpu_cs_list_validate+0x10/0x10 [amdgpu]
[ 90.003210] ? __pfx_amdgpu_cs_bo_validate+0x10/0x10 [amdgpu]
[ 90.003210] ? seqcount_lockdep_reader_access.constprop.0+0xa5/0xb0
[ 90.003210] ? trace_hardirqs_on+0x16/0x100
[ 90.003210] amdgpu_cs_ioctl+0x2207/0x55e0 [amdgpu]
[ 90.003210] ? rcu_is_watching+0x15/0xb0
[ 90.003210] ? __pfx_amdgpu_cs_ioctl+0x10/0x10 [amdgpu]
[ 90.003210] ? finish_task_switch.isra.0+0x22b/0xc20
[ 90.003210] ? rcu_is_watching+0x15/0xb0
[ 90.003210] ? __switch_to+0x413/0xde0
[ 90.003210] ? __pfx_lock_release+0x10/0x10
[ 90.003210] ? rcu_is_watching+0x15/0xb0
[ 90.003210] ? lock_acquire+0x44d/0x4f0
[ 90.003210] ? rcu_is_watching+0x15/0xb0
[ 90.003210] ? __pfx_amdgpu_cs_ioctl+0x10/0x10 [amdgpu]
[ 90.003210] drm_ioctl_kernel+0x1fc/0x3d0
[ 90.003210] ? __pfx___might_resched+0x10/0x10
[ 90.003210] ? __pfx_drm_ioctl_kernel+0x10/0x10
[ 90.003210] drm_ioctl+0x4c5/0xaa0
[ 90.003210] ? __pfx_amdgpu_cs_ioctl+0x10/0x10 [amdgpu]
[ 90.003210] ? __pfx_drm_ioctl+0x10/0x10
[ 90.003210] ? lock_release+0x4ec/0xba0
[ 90.003210] ? rcu_is_watching+0x15/0xb0
[ 90.003210] ? _raw_spin_unlock_irqrestore+0x66/0x80
[ 90.003210] ? trace_hardirqs_on+0x16/0x100
[ 90.003210] ? _raw_spin_unlock_irqrestore+0x4f/0x80
[ 90.003210] amdgpu_drm_ioctl+0xd2/0x1b0 [amdgpu]
[ 90.003210] __x64_sys_ioctl+0x131/0x1a0
[ 90.003210] do_syscall_64+0x60/0x90
[ 90.003210] ? do_syscall_64+0x6c/0x90
[ 90.003210] ? do_syscall_64+0x6c/0x90
[ 90.003210] ? trace_hardirqs_on_prepare+0xe3/0x100
[ 90.003210] entry_SYSCALL_64_after_hwframe+0x72/0xdc
[ 90.003210] RIP: 0033:0x7f775a9113ad
[ 90.003210] Code: 04 25 28 00 00 00 48 89 45 c8 31 c0 48 8d 45 10 c7 45 b0 10 00 00 00 48 89 45 b8 48 8d 45 d0 48 89 45 c0 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 1a 48 8b 45 c8 64 48 2b 04 25 28 00 00 00
[ 90.003210] RSP: 002b:00007f77461fd1d0 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 90.003210] RAX: ffffffffffffffda RBX: 00007f77461fd368 RCX: 00007f775a9113ad
[ 90.003210] RDX: 00007f77461fd2a0 RSI: 00000000c0186444 RDI: 000000000000001c
[ 90.003210] RBP: 00007f77461fd220 R08: 00007f77461fd3d0 R09: 00007f77461fd270
[ 90.003210] R10: 00001a94000b2900 R11: 0000000000000246 R12: 00007f77461fd2a0
[ 90.003210] R13: 00000000c0186444 R14: 000000000000001c R15: 00007f77461fd368
[ 90.003210] </TASK>
[ 90.003210] Modules linked in: uinput rfcomm snd_seq_dummy snd_hrtimer nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables nfnetlink qrtr bnep sunrpc snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec_hdmi binfmt_misc snd_sof_amd_rembrandt snd_sof_amd_renoir snd_sof_amd_acp snd_sof_pci snd_sof_xtensa_dsp snd_sof snd_sof_utils mt7921e mt7921_common intel_rapl_msr intel_rapl_common vfat mt76_connac_lib fat btusb edac_mce_amd mt76 btrtl btbcm snd_hda_intel btintel snd_intel_dspcfg snd_soc_core snd_intel_sdw_acpi kvm_amd btmtk snd_hda_codec mac80211 snd_compress snd_hda_core bluetooth ac97_bus snd_pcm_dmaengine kvm snd_hwdep snd_seq snd_pci_ps snd_rpl_pci_acp6x snd_seq_device libarc4 snd_pci_acp6x irqbypass snd_pci_acp5x snd_pcm cfg80211 rapl asus_nb_wmi snd_rn_pci_acp3x snd_timer wmi_bmof snd_acp_config snd_soc_acpi snd pcspkr
[ 90.003210] acpi_cpufreq snd_pci_acp3x i2c_piix4 k10temp soundcore amd_pmc asus_wireless joydev loop zram amdgpu i2c_algo_bit hid_asus drm_ttm_helper crct10dif_pclmul asus_wmi ttm ledtrig_audio crc32_pclmul drm_suballoc_helper sparse_keymap crc32c_intel iommu_v2 polyval_clmulni platform_profile drm_buddy polyval_generic gpu_sched nvme rfkill ucsi_acpi hid_multitouch drm_display_helper ghash_clmulni_intel nvme_core serio_raw typec_ucsi sha512_ssse3 r8169 ccp cec nvme_common sp5100_tco typec video i2c_hid_acpi i2c_hid wmi ip6_tables ip_tables fuse
[ 90.008560] ---[ end trace 0000000000000000 ]---
[ 90.008565] RIP: 0010:ttm_bo_evict_swapout_allowable+0x322/0x5e0 [ttm]
[ 90.008580] Code: b6 04 02 48 89 ea 83 e2 07 38 d0 7f 08 84 c0 0f 85 e8 01 00 00 4c 89 e2 c6 45 00 00 48 b8 00 00 00 00 00 fc ff df 48 c1 ea 03 <0f> b6 04 02 4c 89 e2 83 e2 07 38 d0 7f 08 84 c0 0f 85 ca 01 00 00
[ 90.008583] RSP: 0018:ffffc9000703ee08 EFLAGS: 00010256
[ 90.008587] RAX: dffffc0000000000 RBX: ffff888180ac1858 RCX: ffffc9000703ee90
[ 90.008591] RDX: 0000000000000000 RSI: ffffc9000703f228 RDI: ffff888180ac1ab4
[ 90.008594] RBP: ffffc9000703ee90 R08: 0000000000000000 R09: ffffc9000703eed0
[ 90.008597] R10: fffff52000e07db3 R11: ffffffffb17dde80 R12: 0000000000000000
[ 90.008600] R13: ffffc9000703f228 R14: ffffc9000703eed0 R15: ffff888180ac1858
[ 90.008603] FS: 00007f77461fe6c0(0000) GS:ffff888f9c800000(0000) knlGS:0000000000000000
[ 90.008606] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 90.008609] CR2: 00007f773549c000 CR3: 000000024213e000 CR4: 0000000000750ee0
[ 90.008613] PKRU: 55555554
[ 90.008616] note: chrome:cs0[3499] exited with preempt_count 1

Bisecting points to commit a2848d08742c8e8494675892c02c0d22acbe3cf8 as the culprit:

commit a2848d08742c8e8494675892c02c0d22acbe3cf8
Author: Christian König <christian.koenig@xxxxxxx>
Date:   Fri Jul 7 11:25:00 2023 +0200

    drm/ttm: never consider pinned BOs for eviction&swap

    There is a small window where we have already incremented the pin count
    but not yet moved the bo from the lru to the pinned list.

    Signed-off-by: Christian König <christian.koenig@xxxxxxx>
    Reported-by: Pelloux-Prayer, Pierre-Eric <Pierre-eric.Pelloux-prayer@xxxxxxx>
    Tested-by: Pelloux-Prayer, Pierre-Eric <Pierre-eric.Pelloux-prayer@xxxxxxx>
    Acked-by: Alex Deucher <alexander.deucher@xxxxxxx>
    Cc: stable@xxxxxxxxxxxxxxx
    Link: https://patchwork.freedesktop.org/patch/msgid/20230707120826.3701-1-christian.koenig@xxxxxxx

 drivers/gpu/drm/ttm/ttm_bo.c | 6 ++++++
 1 file changed, 6 insertions(+)

I have attached the full bisect log, the kernel logs from all bisect steps, and my kernel build config. Is there anything else I can do to help?

--
Best Regards,
Mike Gavrilov.
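
As a side note on reading the backtrace: the "null-ptr-deref in range" line can be recovered from the faulting address by inverting the KASAN shadow mapping. The sketch below is only an illustration, assuming the usual x86_64 shadow offset of 0xdffffc0000000000 (the value that also appears in RAX) and a shadow scale shift of 3; the function name shadow_to_range is made up for this example.

```python
# Assumed x86_64 KASAN parameters; other configs or architectures may differ.
KASAN_SHADOW_OFFSET = 0xDFFFFC0000000000
KASAN_SHADOW_SCALE_SHIFT = 3  # one shadow byte covers 8 bytes of memory
MASK_64 = (1 << 64) - 1


def shadow_to_range(shadow_addr):
    """Map a shadow-memory address back to the 8-byte range it describes."""
    base = ((shadow_addr - KASAN_SHADOW_OFFSET) << KASAN_SHADOW_SCALE_SHIFT) & MASK_64
    return base, base + (1 << KASAN_SHADOW_SCALE_SHIFT) - 1


# Faulting (non-canonical) address from the first line of the oops.
low, high = shadow_to_range(0xDFFFFC0000000000)
print(f"[{low:#018x}-{high:#018x}]")
# prints [0x0000000000000000-0x0000000000000007]
```

Under those assumptions the fault is the KASAN check itself touching the shadow byte for address 0, which is consistent with R12 and RDX being zero in the register dump.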
git bisect start
# status: waiting for both good and bad commits
# bad: [831fe284d8275987596b7d640518dddba5735f61] Merge tag 'spi-fix-v6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
git bisect bad 831fe284d8275987596b7d640518dddba5735f61
# status: waiting for good commit(s), bad commit known
# good: [3f01e9fed8454dcd89727016c3e5b2fbb8f8e50c] Merge tag 'linux-watchdog-6.5-rc2' of git://www.linux-watchdog.org/linux-watchdog
git bisect good 3f01e9fed8454dcd89727016c3e5b2fbb8f8e50c
# 01 - good: [b1983d427a53911ea71ba621d4bf994ae22b1536] Merge tag 'net-6.5-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
git bisect good b1983d427a53911ea71ba621d4bf994ae22b1536
# 02 - bad: [ec17f16432058e1406c763a81acfc1394578bc8c] Merge tag 'io_uring-6.5-2023-07-14' of git://git.kernel.dk/linux
git bisect bad ec17f16432058e1406c763a81acfc1394578bc8c
# 03 - bad: [864e029fea2b8e6583e026a6f93e8933ba626d42] Merge tag 'drm-intel-fixes-2023-07-13' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
git bisect bad 864e029fea2b8e6583e026a6f93e8933ba626d42
# 04 - good: [06c2afb862f9da8dc5efa4b6076a0e48c3fbaaa5] Linux 6.5-rc1
git bisect good 06c2afb862f9da8dc5efa4b6076a0e48c3fbaaa5
# 05 - good: [7f34e01f77f811ecb2ef83e60301b38cf89af466] accel/ivpu: Clear specific interrupt status bits on C0
git bisect good 7f34e01f77f811ecb2ef83e60301b38cf89af466
# 06 - bad: [c177872cb056e0b499af4717d8d1977017fd53df] drm/nouveau/disp/g94: enable HDMI
git bisect bad c177872cb056e0b499af4717d8d1977017fd53df
# 07 - bad: [a2848d08742c8e8494675892c02c0d22acbe3cf8] drm/ttm: never consider pinned BOs for eviction&swap
git bisect bad a2848d08742c8e8494675892c02c0d22acbe3cf8
# 08 - good: [15008052b34efaa86c1d56190ac73c4bf8c462f9] drm/fbdev-dma: Fix documented default preferred_bpp value
git bisect good 15008052b34efaa86c1d56190ac73c4bf8c462f9
# first bad commit: [a2848d08742c8e8494675892c02c0d22acbe3cf8] drm/ttm: never consider pinned BOs for eviction&swap
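
A log in this shape can normally be fed back to git with `git bisect replay <logfile>` to reproduce the same bisection state. For a quick sanity check without a checkout, the verdicts can also be summarized with a few lines of Python. This is only a sketch: the function name summarize_bisect_log and the embedded excerpt are illustrative, and the excerpt keeps just a handful of lines from the log above.

```python
import re

# Short excerpt of the bisect log above; in practice the text would be
# read from the saved log file instead.
BISECT_LOG = """\
git bisect start
git bisect bad 831fe284d8275987596b7d640518dddba5735f61
git bisect good 3f01e9fed8454dcd89727016c3e5b2fbb8f8e50c
git bisect bad a2848d08742c8e8494675892c02c0d22acbe3cf8
git bisect good 15008052b34efaa86c1d56190ac73c4bf8c462f9
# first bad commit: [a2848d08742c8e8494675892c02c0d22acbe3cf8] drm/ttm: never consider pinned BOs for eviction&swap
"""

VERDICT_RE = re.compile(r"^git bisect (good|bad) ([0-9a-f]{40})$")
FIRST_BAD_RE = re.compile(r"^# first bad commit: \[([0-9a-f]{40})\] (.*)$")


def summarize_bisect_log(text):
    """Return ([(verdict, sha), ...], (sha, subject) or None) for a bisect log."""
    verdicts = []
    first_bad = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        match = VERDICT_RE.match(line)
        if match:
            verdicts.append((match.group(1), match.group(2)))
            continue
        match = FIRST_BAD_RE.match(line)
        if match:
            first_bad = (match.group(1), match.group(2))
    return verdicts, first_bad


verdicts, first_bad = summarize_bisect_log(BISECT_LOG)
for verdict, sha in verdicts:
    print(f"{verdict:4} {sha}")
print("first bad commit:", first_bad)
```

Comment lines such as "# 01 - good: ..." are skipped here, so the step annotations in the log do not affect the summary.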
<<attachment: dmesg-all-bisect-steps.zip>>
<<attachment: .config.zip>>