Re: [PATCH] drm/radeon: fix TOPDOWN handling for bo_create

Alex Deucher <alexdeucher@xxxxxxxxx> · Tue, 17 Mar 2015 09:49:31 -0400

On Tue, Mar 17, 2015 at 7:40 AM, Christian König
<deathsimple@xxxxxxxxxxx> wrote:
> On 16.03.2015 23:32, Alex Deucher wrote:
>>
>> On Thu, Mar 12, 2015 at 10:55 PM, Michel Dänzer <michel@xxxxxxxxxxx>
>> wrote:
>>>
>>> On 12.03.2015 22:09, Alex Deucher wrote:
>>>>
>>>> On Thu, Mar 12, 2015 at 5:23 AM, Christian König
>>>> <deathsimple@xxxxxxxxxxx> wrote:
>>>>>
>>>>> On 12.03.2015 10:02, Michel Dänzer wrote:
>>>>>>
>>>>>> On 12.03.2015 06:14, Alex Deucher wrote:
>>>>>>>
>>>>>>> On Wed, Mar 11, 2015 at 4:51 PM, Alex Deucher <alexdeucher@xxxxxxxxx>
>>>>>>> wrote:
>>>>>>>>
>>>>>>>> On Wed, Mar 11, 2015 at 2:21 PM, Christian König
>>>>>>>> <deathsimple@xxxxxxxxxxx> wrote:
>>>>>>>>>
>>>>>>>>> On 11.03.2015 16:44, Alex Deucher wrote:
>>>>>>>>>>
>>>>>>>>>> radeon_bo_create() calls radeon_ttm_placement_from_domain()
>>>>>>>>>> before ttm_bo_init() is called.
>>>>>>>>>> radeon_ttm_placement_from_domain()
>>>>>>>>>> uses the ttm bo size to determine when to select top down
>>>>>>>>>> allocation but since the ttm bo is not initialized yet the
>>>>>>>>>> check is always false.
>>>>>>>>>>
>>>>>>>>>> Noticed-by: Oded Gabbay <oded.gabbay@xxxxxxx>
>>>>>>>>>> Signed-off-by: Alex Deucher <alexander.deucher@xxxxxxx>
>>>>>>>>>> Cc: stable@xxxxxxxxxxxxxxx
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> And I was already wondering why the heck the BOs always made this
>>>>>>>>> ping/pong
>>>>>>>>> in memory after creation.
>>>>>>>>>
>>>>>>>>> Patch is Reviewed-by: Christian König <christian.koenig@xxxxxxx>
>>>>>>>>
>>>>>>>> And fixing that promptly broke VCE due to vram location
>>>>>>>> requirements.
>>>>>>>> Updated patch attached.  Thoughts?
>>>>>>>
>>>>>>> And one more take to make things a bit more explicit for static
>>>>>>> kernel
>>>>>>> driver allocations.
>>>>>>
>>>>>> struct ttm_place::lpfn is honoured even with TTM_PL_FLAG_TOPDOWN, so
>>>>>> latter should work with RADEON_GEM_CPU_ACCESS. It sounds like the
>>>>>> problem is really that some BOs are expected to be within a certain
>>>>>> range from the beginning of VRAM, but lpfn isn't set accordingly. It
>>>>>> would be better to fix that by setting lpfn directly than indirectly
>>>>>> via
>>>>>> RADEON_GEM_CPU_ACCESS.
>>>>>
>>>>>
>>>>> Yeah, agree. We should probably try to find the root cause of this
>>>>> instead.
>>>>>
>>>>> As far as I know VCE has no documented limitation on where buffers are
>>>>> placed (unlike UVD). So this is a bit strange. Are you sure that it
>>>>> isn't
>>>>> UVD which breaks here?
>>>>
>>>> It's definitely VCE, I don't know why UVD didn't have a problem.  I
>>>> considered using pin_restricted to make sure it got pinned in the CPU
>>>> visible region, but that had two problems: 1. it would end up getting
>>>> migrated when pinned,
>>>
>>> Maybe something like radeon_uvd_force_into_uvd_segment() is needed for
>>> VCE as well?
>>>
>>>
>>>> 2. it would end up at the top of the restricted
>>>> region since the top down flag is set which would end up fragmenting
>>>> vram.
>>>
>>> If that's an issue (which outweighs the supposed benefit of
>>> TTM_PL_FLAG_TOPDOWN), then again the proper solution would be not to set
>>> TTM_PL_FLAG_TOPDOWN when rbo->placements[i].lpfn != 0 and smaller than
>>> the whole available region, instead of checking for VRAM and
>>> RADEON_GEM_CPU_ACCESS.
>>>
>> How about something like the attached patch?  I'm not really sure
>> about the restrictions for the UVD and VCE fw and stack/heap buffers,
>> but this seems to work.  It seems like the current UVD/VCE code works
>> by accident since the check for TOPDOWN fails.
>
>
> Well, I would still like to rather find the bug in the VCE code cause there
> shouldn't be a hardware limitation like this and that really hurt us already
> with UVD.
>
> What system do you use to test this? How much address space do you have for
> VRAM/GART? Is it possible that we get over a 4GB boundary with that?

I was testing on a KV with 15GB of ram, 1 GB of vram carve out, 1 GB
of gart.  If I just apply the original patch to fix TOPDOWN, VCE
breaks.  ecpu failures.  If I limit the vce fw/stack/heap buffer to
the first 256 MB, all is well.  UVD seems to work fine either way.

Alex

>
> Regards,
> Christian.
>
>>
>> Alex
>
>
_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/dri-devel