On 2018-06-25 03:35 PM, Leo Liu wrote: > > > > On 06/25/2018 03:15 PM, James Zhu wrote: >> >> >> >> On 2018-06-25 03:02 PM, Alex Deucher wrote: >>> On Mon, Jun 25, 2018 at 2:59 PM, James Zhu<jamesz at amd.com> wrote: >>>> On 2018-06-25 02:53 PM, Alex Deucher wrote: >>>> >>>> On Mon, Jun 25, 2018 at 2:37 PM, James Zhu<jamesz at amd.com> wrote: >>>> >>>> For one UVD instance case,: >>>> >>>> >>>> In function amdgpu_driver_load_kms, all ring->me should be set to zero. >>>> adev = kzalloc(sizeof(struct amdgpu_device), GFP_KERNEL); >>>> >>>> >>>> For two UVD instances cases: >>>> >>>> static void uvd_v7_0_set_ring_funcs(struct amdgpu_device *adev) >>>> .. >>>> for (i = 0; i < adev->uvd.num_uvd_inst; i++) { >>>> adev->uvd.inst[i].ring.me = i; >>>> >>>> static void uvd_v7_0_set_enc_ring_funcs(struct amdgpu_device *adev) >>>> >>>> for (j = 0; j < adev->uvd.num_uvd_inst; j++) { >>>> adev->uvd.inst[j].ring_enc[i].me = j; >>>> >>>> uvd_v4_2_early_init in uvd_v4_2.c adev->uvd.num_uvd_inst = 1; >>>> uvd_v5_0_early_init in uvd_v5_0.c adev->uvd.num_uvd_inst = 1; >>>> uvd_v6_0_early_init in uvd_v6_0.c adev->uvd.num_uvd_inst = 1; >>>> uvd_v7_0_early_init in uvd_v7_0.c >>>> if (adev->asic_type == CHIP_VEGA20) >>>> adev->uvd.num_uvd_inst = UVD7_MAX_HW_INSTANCES_VEGA20;/*2*/ >>>> else >>>> adev->uvd.num_uvd_inst = 1; >>>> >>>> >>>> I didn't know when ring->me is set to 2. Maybe there is some leakage >>>> somewhere. >>>> >>>> What about older uvd (4.2, 5.0, 6.0) blocks? >>>> >>>> I think the below code will reset >>>> adev->uvd.inst[AMDGPU_MAX_UVD_INSTANCES].ring->me and >>>> adev->uvd.inst[AMDGPU_MAX_UVD_INSTANCES].ring_enc[AMDGPU_MAX_UVD_ENC_RINGS]->me >>>> to zero. >>>> for older uvd IP UVD block. >>>> >>>> adev = kzalloc(sizeof(struct amdgpu_device), GFP_KERNEL); >>>> >>>> Do I understand correctly? >>> Yes, it should. That's why it doesn't make sense that it would be >>> getting another value. >>> >>> Alex >>    ring point is refer in struct amdgpu_device. Miss using it may >> cause simple leakage. >>   struct amdgpu_device { >>   .............. >>     struct amdgpu_ring      **/rings/*[AMDGPU_MAX_RINGS]; >> >> The simple possible leakage could happen at >> ./gfx_v8_0.c:1995:   ring->me = mec + 1; >> ./amdgpu_gfx.c:190:      ring->me = mec + 1; >> ./gfx_v9_0.c:1406:   ring->me = mec + 1; >> ./gfx_v7_0.c:4471:   ring->me = mec + 1; >> >> Timothy, >> >> It is not easy to find root cause based on current information. >> What asic are you using on this test. IS this UBSAN test open source? >> Is it easy for you guide me to reproduce it on my bench? > https://people.freedesktop.org/~narmstrong/meson_drm_doc/dev-tools/ubsan.html > > Leo > Leo, Thanks! I find the problem.. James > >> James >>>> James >>>> >>>> Alex >>>> >>>> Best regards! >>>> >>>> James zhu >>>> >>>> >>>> On 2018-06-25 01:29 PM, Deucher, Alexander wrote: >>>> >>>> Odd. The structure should be 0 initialized. Does this patch help? >>>> >>>> >>>> Alex >>>> >>>> ________________________________ >>>> From: Timothy Pearson<tpearson at raptorengineering.com> >>>> Sent: Monday, June 25, 2018 11:53:12 AM >>>> To: Zhu, James >>>> Cc:amd-gfx at lists.freedesktop.org; Deucher, Alexander; Zhou, >>>> David(ChunMing); Koenig, Christian >>>> Subject: Re: [PATCH] Increase AMDGPU_MAX_UVD_INSTANCES to 3 >>>> >>>> n 06/25/2018 09:46 AM, James Zhu wrote: >>>> >>>> On 2018-06-23 08:02 PM, Timothy Pearson wrote: >>>> >>>> amdgpu_fence_driver_start_ring() attempts to access >>>> UVD instance 2 during setup, while the existing UVD >>>> instance count only allows instances 0 and 1. >>>> >>>> Increase AMDGPU_MAX_UVD_INSTANCES by one to avoid the >>>> invalid array access. >>>> >>>> Caught by UBSAN. >>>> >>>> Hi Timothy, >>>> >>>> From design of view, it is not right to just change >>>> AMDGPU_MAX_UVD_INSTANCES to 3. >>>> >>>> Could you tell me some detail of UBSAN test and attach the dmesg also? >>>> >>>> Definitely, was looking for some feedback from anyone knowing more about >>>> the internals of the UVD system. >>>> >>>> What's happening is that "ring->me" in amdgpu_fence_driver_start_ring() >>>> (drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c:379) is set to a value of >>>> "2". The overall dmesg is otherwise uninteresting, but I can try to >>>> grab the UBSAN output if needed. >>>> >>>> -- >>>> Timothy Pearson >>>> Raptor Engineering >>>> +1 (415) 727-8645 (direct line) >>>> +1 (512) 690-0200 (switchboard) >>>> https://www.raptorengineering.com >>>> >>>> >>>> >>>> _______________________________________________ >>>> amd-gfx mailing list >>>> amd-gfx at lists.freedesktop.org >>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx >>>> >>>> >> >> >> >> _______________________________________________ >> amd-gfx mailing list >> amd-gfx at lists.freedesktop.org >> https://lists.freedesktop.org/mailman/listinfo/amd-gfx > -------------- next part -------------- An HTML attachment was scrubbed... URL: <https://lists.freedesktop.org/archives/amd-gfx/attachments/20180625/1a927474/attachment.html>