On 11/13/2017 10:27 AM, Christian König wrote: > Am 13.11.2017 um 15:57 schrieb Andrey Grodzovsky: >> On 11/13/2017 07:39 AM, Christian König wrote: >> >>> Am 13.11.2017 um 12:32 schrieb Michel Dänzer: >>>> On 12/11/17 10:35 AM, Christian König wrote: >>>>> A few comments on the code: >>>>> >>>>>> +/* Validate bo size is bit bigger then the request domain */ >>>>>> +static inline bool amdgpu_bo_validate_bo_size(struct amdgpu_device >>>>>> *adev, >>>>>> + unsigned long size, u32 domain) >>>>> Drop the inline keyword and the second _bo_ in the name here. >>>>> >>>>>> +{ >>>>>> + struct ttm_mem_type_manager *man = NULL; >>>>>> + >>>>>> + if (domain & AMDGPU_GEM_DOMAIN_VRAM) { >>>>>> + man = &adev->mman.bdev.man[TTM_PL_VRAM]; >>>>>> + >>>>>> + if (man && size < (man->size << PAGE_SHIFT)) >>>>> Drop the extra check that man is not NULL. We get the pointer to an >>>>> array element, that can't be NULL. >>>>> >>>>>> + return true; >>>>> Mhm, domain is a bitmask of allowed domains. >>>>> >>>>> So we should check all valid domains if the size fit, not just the >>>>> first >>>>> one. >>>> Assuming VRAM <-> system migration of BOs larger than the GTT domain >>>> works, I'd say we should only require that the BO can fit in any of >>>> the >>>> allowed domains. Otherwise it must also always fit in GTT. >>> Good point, and yes VRAM <-> system migration of BOs larger than the >>> GTT domain works now. >>> >>> I can agree on that VRAM should probably be optional, otherwise we >>> can't allocate anything large when the driver uses only very low >>> amounts of stolen VRAM on APUs. >>> >>> But I think when userspace requests VRAM and GTT at the same time we >>> still should be able to fall back to GTT. >> >> Attached V2 patch, I still don't understand why I experience the >> SIGSEV in the tester when the check fails and the IOCTLs will return >> ENOMEM >> > > Reviewed-by: Christian König <christian.koenig at amd.com> for this one, > but please use git send-email to send out patches. > >> I will update the libdrm test to correctly handle mem failure, it >> segfaults at the moment. > > Sounds like it just tries to use the BO for VM or CPU mapping while > the underlying function has failed (or we have another bug somewhere). Yes, the segfault is because I am using gpu_mem_alloc which continues executing after amdgpu_bo_alloc failed, the segfault is in amdgpu_bo_va_op. > > Please commit the kernel patch and leave me a note so that I can push > the libdrm patches. Areyou gonna push patches 1-3 from the original series and then I need to resend patch 4 to fix the segfault ? > BTW: Do you have the link where you request an account at hand? I want > to ping the admins once more. https://bugs.freedesktop.org/show_bug.cgi?id=103566 Thanks, Andrey > > Regards, > Christian. > >> >> >> Thanks, >> Andey >>> >>> Regards, >>> Christian. >> >