On Mon, Jul 29, 2019 at 11:32 AM Christian König
<ckoenig.leichtzumerken@xxxxxxxxx> wrote:
>
> > Is this a known issue?
> No, that looks like a new one to me.
>
> Is that somehow reproducible?

I tried to find a reliable reproducer, but could not find anything
better than Vulkan CTS runs, which only occasionally trigger it.

However, this issue seems to be fixed by one of the following patches
from drm-misc-fixes:

"drm/ttm: fix handling in ttm_bo_add_mem_to_lru"
"drm/ttm: fix busy reference in ttm_mem_evict_first"

I haven't seen the issue in 100 CTS runs with those patches.

Thanks,
Bas

>
> Christian.
>
> On 29.07.19 at 10:14, Bas Nieuwenhuizen wrote:
> > Hi all,
> >
> > I have a TTM refcount issue:
> >
> > [173774.309968] ------------[ cut here ]------------
> > [173774.309970] kernel BUG at drivers/gpu/drm/ttm/ttm_bo.c:202!
> > [173774.309982] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
> > [173774.309985] CPU: 13 PID: 128214 Comm: kworker/13:2 Not tainted
> > 5.2.0-rc1-g3f2e519b0974 #10
> > [173774.309986] Hardware name: To Be Filled By O.E.M. To Be Filled By
> > O.E.M./X399 Taichi, BIOS P1.50 09/05/2017
> > [173774.309995] Workqueue: events ttm_bo_delayed_workqueue [ttm]
> > [173774.310000] RIP: 0010:ttm_bo_ref_bug+0x5/0x10 [ttm]
> > [173774.310002] Code: c0 c3 b8 01 00 00 00 c3 66 66 2e 0f 1f 84 00 00
> > 00 00 00 66 90 0f 1f 44 00 00 f0 ff 8f a4 00 00 00 c3 0f 1f 00 0f 1f
> > 44 00 00 <0f> 0b 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 53 48 8b 07
> > 48 89
> > [173774.310003] RSP: 0018:ffffb42e5589bde8 EFLAGS: 00010246
> > [173774.310005] RAX: ffffb42e5589be40 RBX: ffff9395fd0cd908 RCX:
> > ffff9395fd0cd8f8
> > [173774.310006] RDX: ffffb42e5589be40 RSI: ffff939b59b64f18 RDI:
> > ffff9395fd0cd87c
> > [173774.310007] RBP: ffffffffc0930f40 R08: 0000000000140000 R09:
> > ffffffffc091f100
> > [173774.310008] R10: ffff9399f69b0800 R11: 0000000000000001 R12:
> > 0000000000000000
> > [173774.310009] R13: ffff9395fd0cd850 R14: 0000000000000001 R15:
> > 0000000000000001
> > [173774.310010] FS: 0000000000000000(0000) GS:ffff939b7d340000(0000)
> > knlGS:0000000000000000
> > [173774.310011] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [173774.310012] CR2: 00007f4f64008838 CR3: 0000000643baa000 CR4:
> > 00000000003406e0
> > [173774.310013] Call Trace:
> > [173774.310019] ttm_bo_cleanup_refs+0x160/0x1e0 [ttm]
> > [173774.310025] ttm_bo_delayed_delete+0xa8/0x1e0 [ttm]
> > [173774.310029] ttm_bo_delayed_workqueue+0x17/0x40 [ttm]
> > [173774.310033] process_one_work+0x1fd/0x430
> > [173774.310036] worker_thread+0x2d/0x3d0
> > [173774.310038] ? process_one_work+0x430/0x430
> > [173774.310040] kthread+0x112/0x130
> > [173774.310042] ? kthread_create_on_node+0x60/0x60
> > [173774.310045] ret_from_fork+0x22/0x40
> > [173774.310048] Modules linked in: fuse nct6775 hwmon_vid
> > nls_iso8859_1 nls_cp437 vfat fat edac_mce_amd kvm_amd kvm irqbypass
> > amdgpu arc4 iwlmvm mac80211 snd_usb_audio uvcvideo snd_usbmidi_lib
> > videobuf2_vmalloc crct10dif_pclmul videobuf2_memops
> > snd_hda_codec_realtek videobuf2_v4l2 btusb gpu_sched snd_rawmidi
> > videobuf2_common snd_hda_codec_generic btrtl videodev crc32_pclmul
> > btbcm snd_seq_device ledtrig_audio ttm btintel ghash_clmulni_intel
> > wmi_bmof mxm_wmi snd_hda_codec_hdmi media bluetooth drm_kms_helper
> > iwlwifi snd_hda_intel drm aesni_intel snd_hda_codec joydev input_leds
> > aes_x86_64 snd_hda_core mousedev evdev crypto_simd cryptd ecdh_generic
> > led_class agpgart snd_hwdep mac_hid cdc_acm glue_helper ecc snd_pcm
> > igb syscopyarea pcspkr cfg80211 sysfillrect snd_timer sysimgblt snd
> > fb_sys_fops ccp ptp soundcore pps_core rng_core k10temp i2c_algo_bit
> > sp5100_tco dca i2c_piix4 rfkill wmi pcc_cpufreq button acpi_cpufreq
> > sch_fq_codel ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2
> > sd_mod
> > [173774.310085] hid_generic usbhid hid crc32c_intel ahci xhci_pci
> > libahci xhci_hcd libata usbcore scsi_mod usb_common
> > [173774.310094] ---[ end trace 1f8d21980c0b3fd5 ]---
> > [173774.310097] RIP: 0010:ttm_bo_ref_bug+0x5/0x10 [ttm]
> > [173774.310099] Code: c0 c3 b8 01 00 00 00 c3 66 66 2e 0f 1f 84 00 00
> > 00 00 00 66 90 0f 1f 44 00 00 f0 ff 8f a4 00 00 00 c3 0f 1f 00 0f 1f
> > 44 00 00 <0f> 0b 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 53 48 8b 07
> > 48 89
> > [173774.310100] RSP: 0018:ffffb42e5589bde8 EFLAGS: 00010246
> > [173774.310101] RAX: ffffb42e5589be40 RBX: ffff9395fd0cd908 RCX:
> > ffff9395fd0cd8f8
> > [173774.310102] RDX: ffffb42e5589be40 RSI: ffff939b59b64f18 RDI:
> > ffff9395fd0cd87c
> > [173774.310103] RBP: ffffffffc0930f40 R08: 0000000000140000 R09:
> > ffffffffc091f100
> > [173774.310104] R10: ffff9399f69b0800 R11: 0000000000000001 R12:
> > 0000000000000000
> > [173774.310104] R13: ffff9395fd0cd850 R14: 0000000000000001 R15:
> > 0000000000000001
> > [173774.310106] FS: 0000000000000000(0000) GS:ffff939b7d340000(0000)
> > knlGS:0000000000000000
> > [173774.310107] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [173774.310107] CR2: 00007f4f64008838 CR3: 0000000643baa000 CR4:
> > 00000000003406e0
> > [173774.310110] note: kworker/13:2[128214] exited with preempt_count 1
> >
> >
> > With amd-staging-drm-next:
> >
> > commit 20d6b9c3b7f40ec427af912d140f2be0de098d2d (origin/amd-staging-drm-next)
> > Author: Gustavo A. R. Silva <gustavo@xxxxxxxxxxxxxx>
> > Date:   Mon Jul 22 12:47:16 2019 -0500
> >
> >     drm/amdkfd/kfd_mqd_manager_v10: Avoid fall-through warning
> >
> > with a Vega10.
> >
> > Is this a known issue?
> >
> > Thanks,
> > Bas
> > _______________________________________________
> > amd-gfx mailing list
> > amd-gfx@xxxxxxxxxxxxxxxxxxxxx
> > https://lists.freedesktop.org/mailman/listinfo/amd-gfx