On 2018-09-17 1:55 p.m., Christian König wrote:
> On 17.09.2018 at 19:50, Tom St Denis wrote:
>> On 2018-09-17 1:45 p.m., Christian König wrote:
>>> Mhm, not the slightest idea.
>>>
>>> That nearly looks like adev->stolen_vga_memory already contains
>>> something.
>>
>> Nope,
>>
>> [  51.564605] >>>adev->stolen_vga_memory == (null)
>> [  51.564619] kasan: CONFIG_KASAN_INLINE enabled
>> [  51.564877] kasan: GPF could be caused by NULL-ptr deref or user
>> memory access
>> [  51.565071] general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC
>> KASAN NOPTI
>> [  51.565254] CPU: 6 PID: 3863 Comm: modprobe Not tainted 4.19.0-rc1+
>> #30
>> [  51.565425] Hardware name: System manufacturer System Product
>> Name/TUF B350M-PLUS GAMING, BIOS 4011 04/19/2018
>> [  51.565714] RIP: 0010:amdgpu_bo_create_kernel+0x59/0x1a0 [amdgpu]
>>
>> That's me printing out the value of the value for stolen_vga_memory
>> before the call to allocate it.
>
> What does amdgpu_bo_create_kernel+0x59 points to?

I've never really got line numbers to work with the kernel, but if I had
to guess I'd say right here:

int amdgpu_bo_create_kernel(struct amdgpu_device *adev,
                            unsigned long size, int align,
                            u32 domain, struct amdgpu_bo **bo_ptr,
                            u64 *gpu_addr, void **cpu_addr)
{
        int r;

        r = amdgpu_bo_create_reserved(adev, size, align, domain, bo_ptr,
                                      gpu_addr, cpu_addr);

        if (r)
                return r;

*bo_ptr is NULL ===>    amdgpu_bo_unreserve(*bo_ptr);

        return 0;
}

Which then results in:

static inline void amdgpu_bo_unreserve(struct amdgpu_bo *bo)
{
        ttm_bo_unreserve(&bo->tbo);
}

Which then passes the address NULL + offsetof(struct amdgpu_bo, tbo) to
ttm_bo_unreserve:

static inline void ttm_bo_unreserve(struct ttm_buffer_object *bo)
{
        if (!(bo->mem.placement & TTM_PL_FLAG_NO_EVICT)) {
                spin_lock(&bo->bdev->glob->lru_lock);
                ttm_bo_add_to_lru(bo);
                spin_unlock(&bo->bdev->glob->lru_lock);
        }
        reservation_object_unlock(bo->resv);
}

Which likely faults on reading bo->mem.placement, since the address is
bogus.
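
To make the "NULL + offsetof(tbo)" step concrete, here is a minimal
userspace sketch. The struct layouts are hypothetical stand-ins, not the
real amdgpu/TTM definitions (the pad size is made up), and the address is
computed with integer arithmetic so that nothing is actually
dereferenced. It only illustrates why the pointer handed to
ttm_bo_unreserve is a small non-NULL value rather than NULL itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct ttm_buffer_object (assumption). */
struct fake_ttm_bo {
        uint32_t placement;
};

/* Hypothetical stand-in for struct amdgpu_bo; tbo is deliberately
 * not the first member, so its offset is non-zero (assumption). */
struct fake_amdgpu_bo {
        uint64_t pad[4];
        struct fake_ttm_bo tbo;
};

/* The address that &bo->tbo evaluates to, computed without touching
 * memory: base address plus the member offset. */
static uintptr_t tbo_address(const struct fake_amdgpu_bo *bo)
{
        return (uintptr_t)bo + offsetof(struct fake_amdgpu_bo, tbo);
}

/* With bo == NULL the result is just the member offset: a small,
 * non-NULL, unmapped address. */
static void check_null_case(void)
{
        assert(tbo_address(NULL) == offsetof(struct fake_amdgpu_bo, tbo));
}
```

Calling check_null_case() passes on the usual platforms where NULL
converts to 0. In this model any read through the resulting pointer,
such as the placement check, would hit that low address, which fits the
"NULL-ptr deref" hint in the KASAN output.
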

The report is from amdgpu_bo_create_kernel because everything is a macro
or inlined... :-)

Tom

>
> Christian.
>
>>
>> Tom
>>
>>
>>>
>>> Christian.
>>>
>>> On 17.09.2018 at 18:47, Tom St Denis wrote:
>>>> On 2018-09-17 12:21 p.m., Tom St Denis wrote:
>>>>> (attached). I'll try to bisect in a second. Is anyone aware of this?
>>>>>
>>>>> Tom
>>>>
>>>> Bisection led to:
>>>>
>>>> a327772a5655ff4fb104c8aae6515faa461df466 is the first bad commit
>>>> commit a327772a5655ff4fb104c8aae6515faa461df466
>>>> Author: Christian König <christian.koenig at amd.com>
>>>> Date:  Fri Sep 14 21:06:50 2018 +0200
>>>>
>>>>    drm/amdgpu: drop size check
>>>>
>>>>    We no don't allocate zero sized kernel BOs any longer.
>>>>
>>>>    Signed-off-by: Christian König <christian.koenig at amd.com>
>>>>    Reviewed-by: Alex Deucher <alexander.deucher at amd.com>
>>>>
>>>> :040000 040000 265e4fa231d367d354e4c66600b8f98a4d2f04c4
>>>> 3702baaeb2423361dcd7eac8c533edace760ae3e M     drivers
>>>>
>>>>
>>>> As the culprit.
>>>>
>>>> Cheers,
>>>> Tom