Re: Running ttm_device_test leads to list_add corruption. prev->next should be next (ffffffffc05cd428), but was 6b6b6b6b6b6b6b6b. (prev=ffffa0b1a5c034f0) (kernel 6.7.5)

On Tue, 20 Feb 2024 16:12:44 +0700
Bagas Sanjaya <bagasdotme@xxxxxxxxx> wrote:

> > [    0.000000] Linux version 6.7.5-Zen3 (root@supah) (gcc (Gentoo 13.2.1_p20240113-r1 p12) 13.2.1 20240113, GNU ld (Gentoo 2.41 p5) 2.41.0) #1 SMP Mon Feb 19 12:44:46 -00 2024  
> 
> Is it vanilla kernel (i.e. no patches applied)? Can you also check current
> mainline (v6.8-rc5)?
> 
> Confused...

Yes, this kernel was built from the upstream stable git sources, with no additional patches.

The only difference is that I use my own custom kernel .config, which is why I attached it. The kernel should run in QEMU too.

Also the issue is reproducible on v6.8-rc5 (dmesg attached).

Additionally, I tried 'modprobe -v ttm-device-test' on v6.8-rc5 with KASAN enabled instead of KFENCE, with the kernel .config otherwise unchanged. With KASAN I get a different dmesg, and the test run completes with one failure. I also don't seem to get memory corruption afterwards:

[...]
KTAP version 1
1..1
    KTAP version 1
    # Subtest: ttm_device
    # module: ttm_device_test
    1..5
    ok 1 ttm_device_init_basic
    # ttm_device_init_multiple: ASSERTION FAILED at drivers/gpu/drm/ttm/tests/ttm_device_test.c:68
    Expected list_count_nodes(&ttm_devs[0].device_list) == num_dev, but
        list_count_nodes(&ttm_devs[0].device_list) == 4 (0x4)
        num_dev == 3 (0x3)
    not ok 2 ttm_device_init_multiple
    ok 3 ttm_device_fini_basic
------------[ cut here ]------------
WARNING: CPU: 5 PID: 2146 at drivers/gpu/drm/ttm/ttm_device.c:206 ttm_device_init+0x23/0x281 [ttm]
Modules linked in: ttm_device_test ttm_kunit_helpers drm_kunit_helpers kunit rfkill dm_crypt nhpoly1305_avx2 nhpoly1305 chacha_generic chacha_x86_64 libchacha adiantum libpoly1305 algif_skcipher amdgpu wmi_bmof amd64_edac edac_mce_amd snd_hda_codec_hdmi input_leds snd_hda_intel amdxcp snd_intel_dspcfg kvm_amd snd_hda_codec snd_hwdep snd_hda_core mfd_core snd_pcm gpu_sched snd_timer video drm_suballoc_helper snd i2c_algo_bit drm_ttm_helper gpio_amdpt soundcore ttm drm_exec button drm_display_helper rapl gpio_generic wmi drm_buddy k10temp evdev joydev lz4 lz4_compress lz4_decompress sg zram nct6775 nct6775_core hwmon_vid hwmon loop configfs hid_generic usbhid hid sha512_ssse3 sha512_generic sha256_ssse3 sha1_ssse3 sha1_generic aesni_intel xhci_pci libaes xhci_hcd crypto_simd ccp cryptd usbcore usb_common sunrpc dm_mod pkcs8_key_parser efivarfs
CPU: 5 PID: 2146 Comm: kunit_try_catch Tainted: G    B            N 6.8.0-rc5-Zen3 #3
Hardware name: To Be Filled By O.E.M. B550M Pro4/B550M Pro4, BIOS P3.40 01/18/2024
RIP: 0010:ttm_device_init+0x23/0x281 [ttm]
Code: 31 ff e9 fa e4 d5 e6 f3 0f 1e fa 41 57 41 56 41 55 41 54 55 53 48 83 ec 18 8b 44 24 50 48 89 14 24 89 44 24 0c 4d 85 c0 75 0c <0f> 0b bd ea ff ff ff e9 2f 02 00 00 48 89 fb 49 89 f7 49 89 ce 4d
RSP: 0018:ffffc9000611fcf8 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff888190184000 RCX: ffff888100651b18
RDX: ffff88817d4a6400 RSI: ffffffffc2033d40 RDI: ffff888106abc000
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: ffff888106abc000 R14: 0000000000000000 R15: ffff888100651b18
FS:  0000000000000000(0000) GS:ffff8887de880000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007feb67e03b20 CR3: 00000001608ac000 CR4: 0000000000b50ef0
Call Trace:
 <TASK>
 ? __warn+0x113/0x14c
 ? ttm_device_init+0x23/0x281 [ttm]
 ? report_bug+0x1b3/0x229
 ? ttm_device_init+0x23/0x281 [ttm]
 ? handle_bug+0x3c/0x7c
 ? exc_invalid_op+0x17/0x46
 ? asm_exc_invalid_op+0x1a/0x20
 ? ttm_device_init+0x23/0x281 [ttm]
 ? local_clock_noinstr+0xc/0xa8
 ttm_device_kunit_init+0xf1/0x10f [ttm_kunit_helpers]
 ttm_device_init_no_vma_man+0x145/0x1e7 [ttm_device_test]
 ? ttm_device_init_pools+0x61e/0x61e [ttm_device_test]
 ? srso_alias_return_thunk+0x5/0xfbef5
 ? srso_alias_return_thunk+0x5/0xfbef5
 ? timekeeping_get_ns+0x60/0xf8
 ? srso_alias_return_thunk+0x5/0xfbef5
 ? ktime_get_ts64+0x68/0x109
 kunit_try_run_case+0x269/0x3cc [kunit]
 ? kunit_try_run_case_cleanup+0xc2/0xc2 [kunit]
 ? srso_alias_return_thunk+0x5/0xfbef5
 ? do_raw_spin_unlock+0x5d/0x1b6
 ? srso_alias_return_thunk+0x5/0xfbef5
 ? kunit_try_catch_throw+0x6a/0x6a [kunit]
 ? kunit_try_run_case_cleanup+0xc2/0xc2 [kunit]
 kunit_generic_run_threadfn_adapter+0x54/0x86 [kunit]
 kthread+0x25e/0x26d
 ? kthread_complete_and_exit+0x1f/0x1f
 ret_from_fork+0x23/0x54
 ? kthread_complete_and_exit+0x1f/0x1f
 ret_from_fork_asm+0x11/0x20
 </TASK>
---[ end trace 0000000000000000 ]---
    ok 4 ttm_device_init_no_vma_man
        KTAP version 1
        # Subtest: ttm_device_init_pools
        ok 1 No DMA allocations, no DMA32 required
        ok 2 DMA allocations, DMA32 required
        ok 3 No DMA allocations, DMA32 required
        ok 4 DMA allocations, no DMA32 required
    # ttm_device_init_pools: pass:4 fail:0 skip:0 total:4
    ok 5 ttm_device_init_pools
# ttm_device: pass:4 fail:1 skip:0 total:5
# Totals: pass:7 fail:1 skip:0 total:8
not ok 1 ttm_device
[...]
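
For anyone comparing runs, the failing cases can be pulled out of KTAP output like the above with a few lines of Python. This is only a minimal sketch: the function name failed_tests and the sample string are made up for illustration, it assumes dmesg timestamps have already been stripped (as in the excerpt above), and it assumes nested subtests are indented by four spaces per level.

```python
import re

# Matches KTAP result lines such as "    not ok 2 ttm_device_init_multiple".
RESULT_RE = re.compile(r"^(\s*)(not ok|ok) (\d+) (.*)$")


def failed_tests(ktap_text):
    """Return (nesting_depth, name) for every 'not ok' line in the text.

    Assumption: each nesting level is indented by four spaces.
    """
    failures = []
    for line in ktap_text.splitlines():
        m = RESULT_RE.match(line)
        if m and m.group(2) == "not ok":
            depth = len(m.group(1)) // 4
            failures.append((depth, m.group(4).strip()))
    return failures


# Hypothetical sample, abbreviated from the result lines quoted above.
sample = """\
    ok 1 ttm_device_init_basic
    not ok 2 ttm_device_init_multiple
    ok 3 ttm_device_fini_basic
    ok 4 ttm_device_init_no_vma_man
    ok 5 ttm_device_init_pools
not ok 1 ttm_device
"""

for depth, name in failed_tests(sample):
    print(depth, name)
```

On the sample this reports ttm_device_init_multiple at depth 1 and the enclosing ttm_device suite at depth 0, matching the single failure in the totals line.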


Regards,
Erhard

Attachment: dmesg_68-rc5_zen3
Description: Binary data

