[PATCH libdrm 0/4] Dynamicly disable suites and tets.

Andrey.Grodzovsky@xxxxxxx (Andrey Grodzovsky) · Mon, 13 Nov 2017 10:36:53 -0500

On 11/13/2017 10:27 AM, Christian KÃ¶nig wrote:
> Am 13.11.2017 um 15:57 schrieb Andrey Grodzovsky:
>> On 11/13/2017 07:39 AM, Christian KÃ¶nig wrote:
>>
>>> Am 13.11.2017 um 12:32 schrieb Michel DÃ¤nzer:
>>>> On 12/11/17 10:35 AM, Christian KÃ¶nig wrote:
>>>>> A few comments on the code:
>>>>>
>>>>>> +/* Validate bo size is bit bigger then the request domain */
>>>>>> +static inline bool amdgpu_bo_validate_bo_size(struct amdgpu_device
>>>>>> *adev,
>>>>>> +                      unsigned long size, u32 domain)
>>>>> Drop the inline keyword and the second _bo_ in the name here.
>>>>>
>>>>>> +{
>>>>>> +    struct ttm_mem_type_manager *man = NULL;
>>>>>> +
>>>>>> +    if (domain & AMDGPU_GEM_DOMAIN_VRAM) {
>>>>>> +        man = &adev->mman.bdev.man[TTM_PL_VRAM];
>>>>>> +
>>>>>> +        if (man && size < (man->size << PAGE_SHIFT))
>>>>> Drop the extra check that man is not NULL. We get the pointer to an
>>>>> array element, that can't be NULL.
>>>>>
>>>>>> +            return true;
>>>>> Mhm, domain is a bitmask of allowed domains.
>>>>>
>>>>> So we should check all valid domains if the size fit, not just the 
>>>>> first
>>>>> one.
>>>> Assuming VRAM <-> system migration of BOs larger than the GTT domain
>>>> works, I'd say we should only require that the BO can fit in any of 
>>>> the
>>>> allowed domains. Otherwise it must also always fit in GTT.
>>> Good point, and yes VRAM <-> system migration of BOs larger than the 
>>> GTT domain works now.
>>>
>>> I can agree on that VRAM should probably be optional, otherwise we 
>>> can't allocate anything large when the driver uses only very low 
>>> amounts of stolen VRAM on APUs.
>>>
>>> But I think when userspace requests VRAM and GTT at the same time we 
>>> still should be able to fall back to GTT.
>>
>> Attached V2 patch, I still don't understand why I experience the 
>> SIGSEV in the tester when the check fails and the IOCTLs will return 
>> ENOMEM
>>
>
> Reviewed-by: Christian KÃ¶nig <christian.koenig at amd.com> for this one, 
> but please use git send-email to send out patches.
>
>> I will update the libdrm test to correctly handle mem failure, it 
>> segfaults at the moment.
>
> Sounds like it just tries to use the BO for VM or CPU mapping while 
> the underlying function has failed (or we have another bug somewhere).

Yes, the segfault is because I am using gpu_mem_alloc which continues 
executing after amdgpu_bo_alloc failed, the segfault is in amdgpu_bo_va_op.

>
> Please commit the kernel patch and leave me a note so that I can push 
> the libdrm patches. 

Areyou gonna push patches 1-3 from the original series and then I need 
to resend patch 4 to fix the segfault ?

> BTW: Do you have the link where you request an account at hand? I want 
> to ping the admins once more.

https://bugs.freedesktop.org/show_bug.cgi?id=103566

Thanks,
Andrey

>
> Regards,
> Christian.
>
>>
>>
>> Thanks,
>> Andey
>>>
>>> Regards,
>>> Christian.
>>
>