On 06/25/2018 03:15 PM, James Zhu wrote: > > > > On 2018-06-25 03:02 PM, Alex Deucher wrote: >> On Mon, Jun 25, 2018 at 2:59 PM, James Zhu<jamesz at amd.com> wrote: >>> On 2018-06-25 02:53 PM, Alex Deucher wrote: >>> >>> On Mon, Jun 25, 2018 at 2:37 PM, James Zhu<jamesz at amd.com> wrote: >>> >>> For one UVD instance case,: >>> >>> >>> In function amdgpu_driver_load_kms, all ring->me should be set to zero. >>> adev = kzalloc(sizeof(struct amdgpu_device), GFP_KERNEL); >>> >>> >>> For two UVD instances cases: >>> >>> static void uvd_v7_0_set_ring_funcs(struct amdgpu_device *adev) >>> .. >>> for (i = 0; i < adev->uvd.num_uvd_inst; i++) { >>> adev->uvd.inst[i].ring.me = i; >>> >>> static void uvd_v7_0_set_enc_ring_funcs(struct amdgpu_device *adev) >>> >>> for (j = 0; j < adev->uvd.num_uvd_inst; j++) { >>> adev->uvd.inst[j].ring_enc[i].me = j; >>> >>> uvd_v4_2_early_init in uvd_v4_2.c adev->uvd.num_uvd_inst = 1; >>> uvd_v5_0_early_init in uvd_v5_0.c adev->uvd.num_uvd_inst = 1; >>> uvd_v6_0_early_init in uvd_v6_0.c adev->uvd.num_uvd_inst = 1; >>> uvd_v7_0_early_init in uvd_v7_0.c >>> if (adev->asic_type == CHIP_VEGA20) >>> adev->uvd.num_uvd_inst = UVD7_MAX_HW_INSTANCES_VEGA20;/*2*/ >>> else >>> adev->uvd.num_uvd_inst = 1; >>> >>> >>> I didn't know when ring->me is set to 2. Maybe there is some leakage >>> somewhere. >>> >>> What about older uvd (4.2, 5.0, 6.0) blocks? >>> >>> I think the below code will reset >>> adev->uvd.inst[AMDGPU_MAX_UVD_INSTANCES].ring->me and >>> adev->uvd.inst[AMDGPU_MAX_UVD_INSTANCES].ring_enc[AMDGPU_MAX_UVD_ENC_RINGS]->me >>> to zero. >>> for older uvd IP UVD block. >>> >>> adev = kzalloc(sizeof(struct amdgpu_device), GFP_KERNEL); >>> >>> Do I understand correctly? >> Yes, it should. That's why it doesn't make sense that it would be >> getting another value. >> >> Alex >    ring point is refer in struct amdgpu_device. Miss using it may > cause simple leakage. >   struct amdgpu_device { >   .............. >     struct amdgpu_ring      **/rings/*[AMDGPU_MAX_RINGS]; > > The simple possible leakage could happen at > ./gfx_v8_0.c:1995:   ring->me = mec + 1; > ./amdgpu_gfx.c:190:      ring->me = mec + 1; > ./gfx_v9_0.c:1406:   ring->me = mec + 1; > ./gfx_v7_0.c:4471:   ring->me = mec + 1; > > Timothy, > > It is not easy to find root cause based on current information. > What asic are you using on this test. IS this UBSAN test open source? > Is it easy for you guide me to reproduce it on my bench? https://people.freedesktop.org/~narmstrong/meson_drm_doc/dev-tools/ubsan.html Leo > > James >>> James >>> >>> Alex >>> >>> Best regards! >>> >>> James zhu >>> >>> >>> On 2018-06-25 01:29 PM, Deucher, Alexander wrote: >>> >>> Odd. The structure should be 0 initialized. Does this patch help? >>> >>> >>> Alex >>> >>> ________________________________ >>> From: Timothy Pearson<tpearson at raptorengineering.com> >>> Sent: Monday, June 25, 2018 11:53:12 AM >>> To: Zhu, James >>> Cc:amd-gfx at lists.freedesktop.org; Deucher, Alexander; Zhou, >>> David(ChunMing); Koenig, Christian >>> Subject: Re: [PATCH] Increase AMDGPU_MAX_UVD_INSTANCES to 3 >>> >>> n 06/25/2018 09:46 AM, James Zhu wrote: >>> >>> On 2018-06-23 08:02 PM, Timothy Pearson wrote: >>> >>> amdgpu_fence_driver_start_ring() attempts to access >>> UVD instance 2 during setup, while the existing UVD >>> instance count only allows instances 0 and 1. >>> >>> Increase AMDGPU_MAX_UVD_INSTANCES by one to avoid the >>> invalid array access. >>> >>> Caught by UBSAN. >>> >>> Hi Timothy, >>> >>> From design of view, it is not right to just change >>> AMDGPU_MAX_UVD_INSTANCES to 3. >>> >>> Could you tell me some detail of UBSAN test and attach the dmesg also? >>> >>> Definitely, was looking for some feedback from anyone knowing more about >>> the internals of the UVD system. >>> >>> What's happening is that "ring->me" in amdgpu_fence_driver_start_ring() >>> (drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c:379) is set to a value of >>> "2". The overall dmesg is otherwise uninteresting, but I can try to >>> grab the UBSAN output if needed. >>> >>> -- >>> Timothy Pearson >>> Raptor Engineering >>> +1 (415) 727-8645 (direct line) >>> +1 (512) 690-0200 (switchboard) >>> https://www.raptorengineering.com >>> >>> >>> >>> _______________________________________________ >>> amd-gfx mailing list >>> amd-gfx at lists.freedesktop.org >>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx >>> >>> > > > > _______________________________________________ > amd-gfx mailing list > amd-gfx at lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/amd-gfx -------------- next part -------------- An HTML attachment was scrubbed... URL: <https://lists.freedesktop.org/archives/amd-gfx/attachments/20180625/44e97042/attachment-0001.html>