On Tue, Jun 26, 2018 at 8:29 AM, Leo Liu <leo.liu at amd.com> wrote: > > > On 06/25/2018 04:42 PM, Alex Deucher wrote: >> >> On Mon, Jun 25, 2018 at 4:37 PM, James Zhu <jamesz at amd.com> wrote: >>> >>> >>> On 2018-06-25 04:32 PM, Alex Deucher wrote: >>>> >>>> On Mon, Jun 25, 2018 at 4:26 PM, James Zhu <jamesz at amd.com> wrote: >>>>> >>>>> >>>>> On 2018-06-25 04:02 PM, Alex Deucher wrote: >>>>>> >>>>>> On Mon, Jun 25, 2018 at 3:17 PM, Leo Liu <leo.liu at amd.com> wrote: >>>>>>> >>>>>>> [ 3.866656] index 2 is out of range for type 'amdgpu_uvd_inst [2]' >>>>>>> [ 3.866667] CPU: 0 PID: 59 Comm: kworker/0:1 Not tainted >>>>>>> 4.16.0-rc7+ >>>>>>> #3 >>>>>>> [ 3.866677] Hardware name: Gigabyte Technology Co., Ltd. >>>>>>> GA-990FXA-UD7/GA-990FXA-UD7, BIOS F9 06/08/2012 >>>>>>> [ 3.866693] Workqueue: events work_for_cpu_fn >>>>>>> [ 3.866702] Call Trace: >>>>>>> [ 3.866710] dump_stack+0x85/0xc5 >>>>>>> [ 3.866719] ubsan_epilogue+0x9/0x40 >>>>>>> [ 3.866727] __ubsan_handle_out_of_bounds+0x89/0x90 >>>>>>> [ 3.866737] ? rcu_read_lock_sched_held+0x58/0x60 >>>>>>> [ 3.866746] ? __kmalloc+0x26c/0x2d0 >>>>>>> [ 3.866846] amdgpu_fence_driver_start_ring+0x259/0x280 [amdgpu] >>>>>>> [ 3.866896] amdgpu_ring_init+0x12c/0x710 [amdgpu] >>>>>>> [ 3.866906] ? sprintf+0x42/0x50 >>>>>>> [ 3.866956] amdgpu_gfx_kiq_init_ring+0x1bc/0x3a0 [amdgpu] >>>>>>> [ 3.867009] gfx_v8_0_sw_init+0x1ad3/0x2360 [amdgpu] >>>>>>> [ 3.867062] ? smu7_init+0xec/0x160 [amdgpu] >>>>>>> [ 3.867109] amdgpu_device_init+0x112c/0x1dc0 [amdgpu] >>>>>>> >>>>>>> ring->me might be set as 2 with amdgpu_gfx_kiq_init_ring. >>>>>>> >>>>>>> Signed-off-by: Leo Liu <leo.liu at amd.com> >>>>>>> --- >>>>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 11 ++++++++++- >>>>>>> 1 file changed, 10 insertions(+), 1 deletion(-) >>>>>>> >>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c >>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c >>>>>>> index 39ec6b8890a1..6067dae1552f 100644 >>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c >>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c >>>>>>> @@ -375,8 +375,17 @@ int amdgpu_fence_driver_start_ring(struct >>>>>>> amdgpu_ring *ring, >>>>>>> { >>>>>>> struct amdgpu_device *adev = ring->adev; >>>>>>> uint64_t index; >>>>>>> + bool uvd_ring = false; >>>>>>> + int i; >>>>>>> + >>>>>>> + for (i = 0; i < AMDGPU_MAX_UVD_INSTANCES; ++i) { >>>>> >>>>> >>>>>>> + if (ring == &adev->uvd.inst[i].ring) { >>>>>>> + uvd_ring = true; >>>>>>> + break; >>>>>>> + } >>>>>>> + } >>>>>>> >>>>>>> - if (ring != &adev->uvd.inst[ring->me].ring) { >>>>>>> + if (!uvd_ring) { >>>>>> >>>>>> I think we can just simplify this to: >>>>>> if (ring->funcs->type != AMDGPU_RING_TYPE_UVD) >>>>>> >>>>>> Alex >>>>> >>>>> Need add UVD_ENC also: >>>>> if ( ring->funcs->type != AMDGPU_RING_TYPE_UVD && >>>>> ring->funcs->type != AMDGPU_RING_TYPE_UVD_ENC ) { >>>>> >>>> The old code never checked against the enc rings. Are you sure they >>>> have the same limitation? >>> >>> I am wrong. thanks! James >> >> Actually does the uvd decode ring still have this limitation or is >> this just some leftover from a hw limitation on an older asic? > > The code here inherited from old ASICs, and not be revisited since there is > no problem in functional wise. > > >> If >> it's still required, the current code in >> amdgpu_fence_driver_start_ring() won't work for multiple instances. >> The fences for the rings will overwrite each other. > > IIRC Vega20 with 2 instances have separated runtime environment, that should > have fence separated as well. Ok. For some reason I was thinking the instances shared the same copy of the firmware. Anyway, the patch as is is: Reviewed-by: Alex Deucher <alexander.deucher at amd.com> We can optimize for newer asics as a later improvement. Alex