Re: Running ttm_device_test leads to list_add corruption. prev->next should be next (ffffffffc05cd428), but was 6b6b6b6b6b6b6b6b. (prev=ffffa0b1a5c034f0) (kernel 6.7.5)

Hi Erhard,

On 20.02.24 at 13:45, Erhard Furtner wrote:
On Tue, 20 Feb 2024 16:12:44 +0700
Bagas Sanjaya <bagasdotme@xxxxxxxxx> wrote:

[    0.000000] Linux version 6.7.5-Zen3 (root@supah) (gcc (Gentoo 13.2.1_p20240113-r1 p12) 13.2.1 20240113, GNU ld (Gentoo 2.41 p5) 2.41.0) #1 SMP Mon Feb 19 12:44:46 -00 2024
Is it vanilla kernel (i.e. no patches applied)? Can you also check current
mainline (v6.8-rc5)?

Confused...
Yes, this kernel was built from upstream git stable sources, no additional patches.

It's just that I use my own custom kernel .config, which is why I attached it. But the kernel should run in QEMU too.

Yeah, and that's probably the problem. The test is not supposed to be compiled and executed on bare metal, but rather only as a unit test through User Mode Linux.

We probably don't check that correctly in the Kconfig for some reason. Can you provide your .config file?

Thanks,
Christian.
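
For anyone who wants to check their own .config for the situation described above (TTM KUnit test enabled in a kernel that is not a User Mode Linux build), here is a rough Python sketch. It assumes the relevant symbols are CONFIG_DRM_TTM_KUNIT_TEST and CONFIG_UML; the helper names and the sample fragment are made up for illustration and are not taken from the attached .config.

```python
def parse_kconfig(text):
    """Return {symbol: value} for every CONFIG_FOO=value line.

    Lines of the form '# CONFIG_FOO is not set' are skipped, so unset
    symbols are simply absent from the result.
    """
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value
    return options


def ttm_test_outside_uml(options):
    """True if the TTM KUnit test is built (y or m) in a non-UML kernel."""
    # Assumption: these are the two symbols that matter for this check.
    test_enabled = options.get("CONFIG_DRM_TTM_KUNIT_TEST") in ("y", "m")
    return test_enabled and options.get("CONFIG_UML") != "y"


# Hypothetical fragment, only to exercise the helpers.
sample = "\n".join([
    "CONFIG_KUNIT=m",
    "CONFIG_DRM_TTM_KUNIT_TEST=m",
    "# CONFIG_UML is not set",
])

print(ttm_test_outside_uml(parse_kconfig(sample)))
```

To run it against a real file, pass the result of open(".config").read() to parse_kconfig() instead of the sample string.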


Also the issue is reproducible on v6.8-rc5 (dmesg attached).

Additionally I tried 'modprobe -v ttm-device-test' on v6.8-rc5 with KASAN enabled instead of KFENCE, same kernel .config otherwise. With KASAN I get a different dmesg and the test completes with a failure. And I don't seem to get memory corruption afterwards:

[...]
KTAP version 1
1..1
     KTAP version 1
     # Subtest: ttm_device
     # module: ttm_device_test
     1..5
     ok 1 ttm_device_init_basic
     # ttm_device_init_multiple: ASSERTION FAILED at drivers/gpu/drm/ttm/tests/ttm_device_test.c:68
     Expected list_count_nodes(&ttm_devs[0].device_list) == num_dev, but
         list_count_nodes(&ttm_devs[0].device_list) == 4 (0x4)
         num_dev == 3 (0x3)
     not ok 2 ttm_device_init_multiple
     ok 3 ttm_device_fini_basic
------------[ cut here ]------------
WARNING: CPU: 5 PID: 2146 at drivers/gpu/drm/ttm/ttm_device.c:206 ttm_device_init+0x23/0x281 [ttm]
Modules linked in: ttm_device_test ttm_kunit_helpers drm_kunit_helpers kunit rfkill dm_crypt nhpoly1305_avx2 nhpoly1305 chacha_generic chacha_x86_64 libchacha adiantum libpoly1305 algif_skcipher amdgpu wmi_bmof amd64_edac edac_mce_amd snd_hda_codec_hdmi input_leds snd_hda_intel amdxcp snd_intel_dspcfg kvm_amd snd_hda_codec snd_hwdep snd_hda_core mfd_core snd_pcm gpu_sched snd_timer video drm_suballoc_helper snd i2c_algo_bit drm_ttm_helper gpio_amdpt soundcore ttm drm_exec button drm_display_helper rapl gpio_generic wmi drm_buddy k10temp evdev joydev lz4 lz4_compress lz4_decompress sg zram nct6775 nct6775_core hwmon_vid hwmon loop configfs hid_generic usbhid hid sha512_ssse3 sha512_generic sha256_ssse3 sha1_ssse3 sha1_generic aesni_intel xhci_pci libaes xhci_hcd crypto_simd ccp cryptd usbcore usb_common sunrpc dm_mod pkcs8_key_parser efivarfs
CPU: 5 PID: 2146 Comm: kunit_try_catch Tainted: G    B            N 6.8.0-rc5-Zen3 #3
Hardware name: To Be Filled By O.E.M. B550M Pro4/B550M Pro4, BIOS P3.40 01/18/2024
RIP: 0010:ttm_device_init+0x23/0x281 [ttm]
Code: 31 ff e9 fa e4 d5 e6 f3 0f 1e fa 41 57 41 56 41 55 41 54 55 53 48 83 ec 18 8b 44 24 50 48 89 14 24 89 44 24 0c 4d 85 c0 75 0c <0f> 0b bd ea ff ff ff e9 2f 02 00 00 48 89 fb 49 89 f7 49 89 ce 4d
RSP: 0018:ffffc9000611fcf8 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff888190184000 RCX: ffff888100651b18
RDX: ffff88817d4a6400 RSI: ffffffffc2033d40 RDI: ffff888106abc000
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: ffff888106abc000 R14: 0000000000000000 R15: ffff888100651b18
FS:  0000000000000000(0000) GS:ffff8887de880000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007feb67e03b20 CR3: 00000001608ac000 CR4: 0000000000b50ef0
Call Trace:
  <TASK>
  ? __warn+0x113/0x14c
  ? ttm_device_init+0x23/0x281 [ttm]
  ? report_bug+0x1b3/0x229
  ? ttm_device_init+0x23/0x281 [ttm]
  ? handle_bug+0x3c/0x7c
  ? exc_invalid_op+0x17/0x46
  ? asm_exc_invalid_op+0x1a/0x20
  ? ttm_device_init+0x23/0x281 [ttm]
  ? local_clock_noinstr+0xc/0xa8
  ttm_device_kunit_init+0xf1/0x10f [ttm_kunit_helpers]
  ttm_device_init_no_vma_man+0x145/0x1e7 [ttm_device_test]
  ? ttm_device_init_pools+0x61e/0x61e [ttm_device_test]
  ? srso_alias_return_thunk+0x5/0xfbef5
  ? srso_alias_return_thunk+0x5/0xfbef5
  ? timekeeping_get_ns+0x60/0xf8
  ? srso_alias_return_thunk+0x5/0xfbef5
  ? ktime_get_ts64+0x68/0x109
  kunit_try_run_case+0x269/0x3cc [kunit]
  ? kunit_try_run_case_cleanup+0xc2/0xc2 [kunit]
  ? srso_alias_return_thunk+0x5/0xfbef5
  ? do_raw_spin_unlock+0x5d/0x1b6
  ? srso_alias_return_thunk+0x5/0xfbef5
  ? kunit_try_catch_throw+0x6a/0x6a [kunit]
  ? kunit_try_run_case_cleanup+0xc2/0xc2 [kunit]
  kunit_generic_run_threadfn_adapter+0x54/0x86 [kunit]
  kthread+0x25e/0x26d
  ? kthread_complete_and_exit+0x1f/0x1f
  ret_from_fork+0x23/0x54
  ? kthread_complete_and_exit+0x1f/0x1f
  ret_from_fork_asm+0x11/0x20
  </TASK>
---[ end trace 0000000000000000 ]---
     ok 4 ttm_device_init_no_vma_man
         KTAP version 1
         # Subtest: ttm_device_init_pools
         ok 1 No DMA allocations, no DMA32 required
         ok 2 DMA allocations, DMA32 required
         ok 3 No DMA allocations, DMA32 required
         ok 4 DMA allocations, no DMA32 required
     # ttm_device_init_pools: pass:4 fail:0 skip:0 total:4
     ok 5 ttm_device_init_pools
# ttm_device: pass:4 fail:1 skip:0 total:5
# Totals: pass:7 fail:1 skip:0 total:8
not ok 1 ttm_device
[...]
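
The "Totals" line above counts 8 tests although the suite announces only 5 cases. The numbers are consistent with counting leaf results only: the four parameterized ttm_device_init_pools cases are counted individually, and the "ok 5" and "not ok 1" lines that merely summarize them are not. A small Python sketch of that counting, with the result lines copied from the log above (the function name and the indentation heuristic are my own assumptions, not KUnit's actual implementation):

```python
import re

# Matches KTAP result lines such as "     ok 1 name" or "not ok 2 name".
RESULT_LINE = re.compile(r"^(\s*)(not ok|ok) \d+\b")


def tally_leaf_results(ktap_text):
    """Count passed/failed leaf results; suite-level results are skipped."""
    results = []  # (indent, passed) for every result line, in order
    for line in ktap_text.splitlines():
        match = RESULT_LINE.match(line)
        if match:
            results.append((len(match.group(1)), match.group(2) == "ok"))

    passed = failed = 0
    previous_indent = -1
    for indent, ok in results:
        # Subtest results precede their parent's result line, so a result
        # that directly follows a deeper-indented one summarizes a suite.
        is_suite = previous_indent > indent
        previous_indent = indent
        if is_suite:
            continue
        if ok:
            passed += 1
        else:
            failed += 1
    return passed, failed


log = "\n".join([
    "KTAP version 1",
    "1..1",
    "     ok 1 ttm_device_init_basic",
    "     not ok 2 ttm_device_init_multiple",
    "     ok 3 ttm_device_fini_basic",
    "     ok 4 ttm_device_init_no_vma_man",
    "         ok 1 No DMA allocations, no DMA32 required",
    "         ok 2 DMA allocations, DMA32 required",
    "         ok 3 No DMA allocations, DMA32 required",
    "         ok 4 DMA allocations, no DMA32 required",
    "     ok 5 ttm_device_init_pools",
    "not ok 1 ttm_device",
])

print(tally_leaf_results(log))  # (7, 1), matching "pass:7 fail:1"
```

Non-result lines (the WARNING splat, "#" diagnostics, plan lines like "1..5") do not match the pattern and are ignored, so the full dmesg excerpt can be fed in unchanged.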


Regards,
Erhard



