Hi Jerry,

Just got up and going (6am ... ugh early).

I see the confusion. Yes there is a patch on drm-next but the problem is
there is a table for both decode and encode. The patch that is already on
drm-next only adds the callback for encode. My patch adds the callback
for decode as well. :-)

Cheers,
Tom

On 05/01/2018 09:44 PM, Zhang, Jerry (Junwei) wrote:
> Hi Tom,
>
> Ha, got your meaning.
> Please check it with the latest drm-next from gerrit tomorrow.
>
> Jerry
>
> On 05/02/2018 09:41 AM, StDenis, Tom wrote:
>> Hi Jerry,
>>
>> Like I said it's (now well) past EOD (meaning my workstation is
>> powered off) so I'll have to check tomorrow. But I do pull from
>> gerrit daily and build from that.
>>
>> I'll take a look in the morning.
>>
>> Cheers,
>> Tom
>> ________________________________________
>> From: Zhang, Jerry
>> Sent: Tuesday, May 1, 2018 21:39
>> To: StDenis, Tom; Deucher, Alexander
>> Cc: Koenig, Christian; amd-gfx at lists.freedesktop.org
>> Subject: Re: vcn regression on raven1
>>
>> Hi Tom,
>>
>> Do you mean you cannot find the patch from
>> gerrit/amd-staging-dkms-next either?
>>
>> I do find it.
>>
>> the tip of gerrit/amd-staging-drm-next is
>>    * bb54e82 2018-04-30 12:17:07 -0400 drm/amdgpu: Switch to interruptable wait to recover from ring hang. <Andrey Grodzovsky>
>>
>> while the tip of freedesktop is
>>    * a11008c 2018-04-25 20:32:05 -0500 drm/powerplay: Add powertune table for VEGAM <Eric Huang>
>>
>> Jerry
>>
>> On 05/02/2018 09:29 AM, StDenis, Tom wrote:
>>> I pull from gerrit. I'm just pointing out that it's not on drm-next
>>> upstream either.
>>>
>>> It may have been missed in a rebase or something.
>>>
>>> Tom
>>> ________________________________________
>>> From: Zhang, Jerry
>>> Sent: Tuesday, May 1, 2018 21:07
>>> To: StDenis, Tom; Deucher, Alexander
>>> Cc: Koenig, Christian; amd-gfx at lists.freedesktop.org
>>> Subject: Re: vcn regression on raven1
>>>
>>> Hi Tom,
>>>
>>> Sounds like you got the code from freedesktop rather than the internal
>>> drm-next.
>>> Unfortunately freedesktop looks delayed in syncing the code from the
>>> internal drm-next.
>>> That gap is what showed up as the issue in the test.
>>>
>>> Hi Alex,
>>>
>>> Is that an issue with code syncing between freedesktop and internal
>>> drm-next?
>>> Or is it a known issue of delayed code syncing?
>>>
>>> Jerry
>>>
>>> On 05/02/2018 08:57 AM, StDenis, Tom wrote:
>>>> Hi Jerry,
>>>>
>>>> It's well past EOD for me; I'll pick this up in the morning.
>>>>
>>>> I'm fairly certain I wrote my patch against the tip of
>>>> amd-staging-drm-next as of my pull this morning though.
>>>>
>>>> If it's in there and I missed it somehow I apologize; otherwise it'd
>>>> be nice to make sure it's in there.
>>>>
>>>> Based on the public copy of the tree it's not there:
>>>>
>>>> https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c?h=amd-staging-drm-next#n1110
>>>>
>>>>
>>>> Cheers,
>>>> Tom
>>>> ________________________________________
>>>> From: Zhang, Jerry
>>>> Sent: Tuesday, May 1, 2018 20:52
>>>> To: StDenis, Tom; Deucher, Alexander
>>>> Cc: Koenig, Christian; amd-gfx at lists.freedesktop.org
>>>> Subject: Re: vcn regression on raven1
>>>>
>>>> Hi Tom,
>>>>
>>>> It landed in the latest drm-next, as
>>>>      * 964933a 2018-04-27 10:26:09 +0800 drm/amdgpu/uvd7: add emit_reg_write_reg_wait ring callback <Xiaojie Yuan>
>>>>
>>>> Did you test with that included?
>>>> Please try to get the latest drm-next, if not.
>>>> They look like the same issue from the log.
>>>>
>>>> Jerry
>>>>
>>>> On 05/02/2018 08:47 AM, StDenis, Tom wrote:
>>>>> Hi Jerry,
>>>>>
>>>>> So far as I know this wasn't included on the tip of drm-next. I
>>>>> hit this this morning in my semi-regular pull/build/test cycle.
>>>>>
>>>>> Was this missed in a recent rebase?
>>>>>
>>>>> Tom
>>>>> ________________________________________
>>>>> From: Zhang, Jerry
>>>>> Sent: Tuesday, May 1, 2018 20:43
>>>>> To: StDenis, Tom; Deucher, Alexander
>>>>> Cc: Koenig, Christian; amd-gfx at lists.freedesktop.org
>>>>> Subject: Re: vcn regression on raven1
>>>>>
>>>>> On 05/01/2018 09:34 PM, Tom St Denis wrote:
>>>>>> Hi all,
>>>>>>
>>>>>> I've noticed that on the tip of drm-next vcn playback of video is
>>>>>> broken (see dmesg below). I've bisected it to this commit
>>>>>
>>>>> It may be fixed here as a common issue.
>>>>>
>>>>>       * https://patchwork.freedesktop.org/patch/218909/
>>>>>
>>>>> Jerry
>>>>>
>>>>>>
>>>>>> [root@raven linux]# git bisect good
>>>>>> 701372349fd55b5396b335580e979ac4dde3dd02 is the first bad commit
>>>>>> commit 701372349fd55b5396b335580e979ac4dde3dd02
>>>>>> Author: Alex Deucher <alexander.deucher at amd.com>
>>>>>> Date:  Tue Mar 27 17:10:56 2018 -0500
>>>>>>
>>>>>>         drm/amdgpu/gmc9: use amdgpu_ring_emit_reg_write_reg_wait in gpu tlb flush
>>>>>>
>>>>>>         Use amdgpu_ring_emit_reg_write_reg_wait. On engines that support it,
>>>>>>         it provides a write and wait in a single packet which avoids a missed
>>>>>>         ack if a world switch happens between the request and waiting for the
>>>>>>         ack.
>>>>>>
>>>>>>         Reviewed-by: Huang Rui <ray.huang at amd.com>
>>>>>>         Reviewed-by: Christian König <christian.koenig at amd.com>
>>>>>>         Signed-off-by: Alex Deucher <alexander.deucher at amd.com>
>>>>>>
>>>>>> :040000 040000 4e4312de03f4b34abd65f4bb12dba4c7093055ba ccc4abc78c0b6f24328fd998f998fa06bf0618b1 M     drivers
>>>>>>
>>>>>> Which is odd because the commit before this is the vcn change and
>>>>>> it works fine (playing BBB right now).
>>>>>>
>>>>>> Here's the dmesg:
>>>>>>
>>>>>> [ 2925.640102] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
>>>>>> [ 2925.640113] IP:          (null)
>>>>>> [ 2925.640116] PGD 0 P4D 0
>>>>>> [ 2925.640122] Oops: 0010 [#1] SMP KASAN NOPTI
>>>>>> [ 2925.640126] Modules linked in: tun fuse amdkfd amdgpu mfd_core chash gpu_sched ttm ax88179_178a usbnet
>>>>>> [ 2925.640139] CPU: 4 PID: 3791 Comm: vcn_dec Not tainted 4.16.0-rc7+ #20
>>>>>> [ 2925.640142] Hardware name: System manufacturer System Product Name/TUF B350M-PLUS GAMING, BIOS 3803 01/22/2018
>>>>>> [ 2925.640146] RIP: 0010:         (null)
>>>>>> [ 2925.640148] RSP: 0018:ffff8801d54f7790 EFLAGS: 00010206
>>>>>> [ 2925.640153] RAX: 0000000000000000 RBX: ffff8801d8b38420 RCX: 00000000007c0080
>>>>>> [ 2925.640156] RDX: 000000000001a6fa RSI: 000000000001a6e8 RDI: ffff8801d8b38420
>>>>>> [ 2925.640159] RBP: 000000000001a6fa R08: 0000000000000080 R09: ffffed003aa9eef9
>>>>>> [ 2925.640162] R10: 0000000009c74f08 R11: fffffbfff0f5d1e7 R12: ffff8801d8b3277c
>>>>>> [ 2925.640164] R13: ffff8801d8b3001c R14: 0000000000000005 R15: 0000000000000000
>>>>>> [ 2925.640168] FS: 0000000000000000(0000) GS:ffff8801dcf00000(0000) knlGS:0000000000000000
>>>>>> [ 2925.640171] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>>> [ 2925.640174] CR2: 0000000000000000 CR3: 00000001d9712000 CR4: 00000000003406e0
>>>>>> [ 2925.640176] Call Trace:
>>>>>> [ 2925.640272] ? gmc_v9_0_emit_flush_gpu_tlb+0x260/0x2a0 [amdgpu]
>>>>>> [ 2925.640368] ? vcn_v1_0_dec_ring_insert_start+0x360/0x360 [amdgpu]
>>>>>> [ 2925.640459] ? mmhub_v1_0_get_clockgating+0xc0/0xc0 [amdgpu]
>>>>>> [ 2925.640545] ? amdgpu_vmid_had_gpu_reset+0x89/0xc0 [amdgpu]
>>>>>> [ 2925.640640] ? vcn_v1_0_dec_ring_emit_vm_flush+0x64/0xb0 [amdgpu]
>>>>>> [ 2925.640725] ? amdgpu_vm_flush+0xb43/0xcc0 [amdgpu]
>>>>>> [ 2925.640810] ? amdgpu_vm_need_pipeline_sync+0x260/0x260 [amdgpu]
>>>>>> [ 2925.640897] ? amdgpu_vmid_had_gpu_reset+0xc0/0xc0 [amdgpu]
>>>>>> [ 2925.641003] ? vcn_v1_0_dec_ring_insert_start+0x2d7/0x360 [amdgpu]
>>>>>> [ 2925.641095] ? amdgpu_ib_schedule+0x1b5/0x800 [amdgpu]
>>>>>> [ 2925.641102] ? dma_fence_add_callback+0x15f/0x360
>>>>>> [ 2925.641201] ? amdgpu_job_run+0x32f/0x370 [amdgpu]
>>>>>> [ 2925.641297] ? amdgpu_job_free_resources+0xd0/0xd0 [amdgpu]
>>>>>> [ 2925.641302] ? __queue_delayed_work+0x144/0x1d0
>>>>>> [ 2925.641306] ? delayed_work_timer_fn+0x40/0x40
>>>>>> [ 2925.641312] ? prepare_to_wait_exclusive+0x1d0/0x1d0
>>>>>> [ 2925.641318] ? drm_sched_main+0x68c/0x940 [gpu_sched]
>>>>>> [ 2925.641323] ? drm_sched_entity_fini+0x60/0x60 [gpu_sched]
>>>>>> [ 2925.641328] ? save_stack+0x89/0xb0
>>>>>> [ 2925.641332] ? wait_woken+0x110/0x110
>>>>>> [ 2925.641337] ? ret_from_fork+0x22/0x40
>>>>>> [ 2925.641343] ? __schedule+0xd30/0xd30
>>>>>> [ 2925.641346] ? remove_wait_queue+0x150/0x150
>>>>>> [ 2925.641353] ? rcu_note_context_switch+0x2a0/0x2a0
>>>>>> [ 2925.641359] ? __lock_text_start+0x8/0x8
>>>>>> [ 2925.641367] ? drm_sched_entity_fini+0x60/0x60 [gpu_sched]
>>>>>> [ 2925.641371] ? kthread+0x19b/0x1c0
>>>>>> [ 2925.641376] ? kthread_create_worker_on_cpu+0xc0/0xc0
>>>>>> [ 2925.641382] ? ret_from_fork+0x22/0x40
>>>>>> [ 2925.641387] Code: Bad RIP value.
>>>>>> [ 2925.641397] RIP:          (null) RSP: ffff8801d54f7790
>>>>>> [ 2925.641400] CR2: 0000000000000000
>>>>>> [ 2925.641405] ---[ end trace 0684cc0468f60fb1 ]---
>>>>>>
>>>>>>
>>>>>> Note that regular compute/gfx workflows work fine on the tip of
>>>>>> drm-next; only vcn playback triggers this (haven't tried encode yet...).
>>>>>>
>>>>>> Cheers,
>>>>>> Tom
>>>>>> _______________________________________________
>>>>>> amd-gfx mailing list
>>>>>> amd-gfx at lists.freedesktop.org
>>>>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
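
For anyone reading the thread later, here is a rough userspace sketch of the failure mode described in the top reply: each ring type has its own callback table, and a caller that jumps through an entry one table never filled in ends up at address 0, which would match the "RIP: (null)" in the oops. Everything in it (struct ring_funcs, model_flush_tlb, the enc/dec table names) is a hypothetical stand-in made up for illustration, not the amdgpu definitions, and the NULL check is only there so the model can run without crashing. It is not a claim about how the actual patch fixes things.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace model; none of these types are the amdgpu ones. */
struct ring;

struct ring_funcs {
    /* Combined "write then wait" packet. NULL if the table never set it. */
    void (*emit_reg_write_reg_wait)(struct ring *ring, uint32_t reg0,
                                    uint32_t reg1, uint32_t ref,
                                    uint32_t mask);
};

struct ring {
    const struct ring_funcs *funcs;
    unsigned int packets_emitted; /* stands in for writing to the ring */
};

static void model_emit_reg_write_reg_wait(struct ring *ring, uint32_t reg0,
                                          uint32_t reg1, uint32_t ref,
                                          uint32_t mask)
{
    (void)reg0; (void)reg1; (void)ref; (void)mask;
    ring->packets_emitted++;
}

/* Encode table: callback present (the state after the first patch). */
static const struct ring_funcs enc_ring_funcs = {
    .emit_reg_write_reg_wait = model_emit_reg_write_reg_wait,
};

/* Decode table: callback left unset, as in the reported regression. */
static const struct ring_funcs dec_ring_funcs = {
    .emit_reg_write_reg_wait = NULL,
};

/*
 * Model of a TLB flush that relies on the combined packet.
 * Returns 0 on success, -1 if the ring's table has no callback.
 * Without the check, the call below would jump to address 0.
 */
static int model_flush_tlb(struct ring *ring, uint32_t req_reg,
                           uint32_t ack_reg, uint32_t ref, uint32_t mask)
{
    if (ring->funcs->emit_reg_write_reg_wait == NULL)
        return -1;
    ring->funcs->emit_reg_write_reg_wait(ring, req_reg, ack_reg, ref, mask);
    return 0;
}
```

In this model, a flush on a ring using enc_ring_funcs emits one packet, while the same flush on a ring using dec_ring_funcs reports the missing callback instead of dereferencing it.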