On 17.09.2018 at 20:01, Tom St Denis wrote:
> On 2018-09-17 1:55 p.m., Christian König wrote:
>> On 17.09.2018 at 19:50, Tom St Denis wrote:
>>> On 2018-09-17 1:45 p.m., Christian König wrote:
>>>> Mhm, not the slightest idea.
>>>>
>>>> That nearly looks like adev->stolen_vga_memory already contains
>>>> something.
>>>
>>> Nope,
>>>
>>> [  51.564605] >>>adev->stolen_vga_memory == (null)
>>> [  51.564619] kasan: CONFIG_KASAN_INLINE enabled
>>> [  51.564877] kasan: GPF could be caused by NULL-ptr deref or user memory access
>>> [  51.565071] general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN NOPTI
>>> [  51.565254] CPU: 6 PID: 3863 Comm: modprobe Not tainted 4.19.0-rc1+ #30
>>> [  51.565425] Hardware name: System manufacturer System Product Name/TUF B350M-PLUS GAMING, BIOS 4011 04/19/2018
>>> [  51.565714] RIP: 0010:amdgpu_bo_create_kernel+0x59/0x1a0 [amdgpu]
>>>
>>> That's me printing out the value of stolen_vga_memory
>>> before the call to allocate it.
>>
>> What does amdgpu_bo_create_kernel+0x59 point to?
>
> I've never really got line numbers to work with the kernel but if I
> had to guess I'd say right here
>
> int amdgpu_bo_create_kernel(struct amdgpu_device *adev,
>                 unsigned long size, int align,
>                 u32 domain, struct amdgpu_bo **bo_ptr,
>                 u64 *gpu_addr, void **cpu_addr)
> {
>     int r;
>
>     r = amdgpu_bo_create_reserved(adev, size, align, domain, bo_ptr,
>                       gpu_addr, cpu_addr);
>
>     if (r)
>         return r;
>
> *bo_ptr is NULL ===>   amdgpu_bo_unreserve(*bo_ptr);

Ah, of course! Thanks for pointing out the obvious, I totally forgot that
there is still another function in the call chain.

A patch to fix this is on the list,
Christian.

>
>     return 0;
> }
>
> Which then results in
>
> static inline void amdgpu_bo_unreserve(struct amdgpu_bo *bo)
> {
>     ttm_bo_unreserve(&bo->tbo);
> }
>
> Which then passes the address NULL + offsetof(tbo) to ttm_bo_unreserve:
>
> static inline void ttm_bo_unreserve(struct ttm_buffer_object *bo)
> {
>     if (!(bo->mem.placement & TTM_PL_FLAG_NO_EVICT)) {
>         spin_lock(&bo->bdev->glob->lru_lock);
>         ttm_bo_add_to_lru(bo);
>         spin_unlock(&bo->bdev->glob->lru_lock);
>     }
>     reservation_object_unlock(bo->resv);
> }
>
>
> Which likely faults on reading bo->mem.placement since the address is
> bogus.
>
> The report is from amdgpu_bo_create_kernel because everything is a
> macro or inlined... :-)
>
> Tom
>
>>
>> Christian.
>>
>>>
>>> Tom
>>>
>>>
>>>>
>>>> Christian.
>>>>
>>>> On 17.09.2018 at 18:47, Tom St Denis wrote:
>>>>> On 2018-09-17 12:21 p.m., Tom St Denis wrote:
>>>>>> (attached). I'll try to bisect in a second. Is anyone aware of
>>>>>> this?
>>>>>>
>>>>>> Tom
>>>>>
>>>>> Bisection led to:
>>>>>
>>>>> a327772a5655ff4fb104c8aae6515faa461df466 is the first bad commit
>>>>> commit a327772a5655ff4fb104c8aae6515faa461df466
>>>>> Author: Christian König <christian.koenig at amd.com>
>>>>> Date:   Fri Sep 14 21:06:50 2018 +0200
>>>>>
>>>>>     drm/amdgpu: drop size check
>>>>>
>>>>>     We no don't allocate zero sized kernel BOs any longer.
>>>>>
>>>>>     Signed-off-by: Christian König <christian.koenig at amd.com>
>>>>>     Reviewed-by: Alex Deucher <alexander.deucher at amd.com>
>>>>>
>>>>> :040000 040000 265e4fa231d367d354e4c66600b8f98a4d2f04c4 3702baaeb2423361dcd7eac8c533edace760ae3e M     drivers
>>>>>
>>>>>
>>>>> As the culprit.
>>>>>
>>>>> Cheers,
>>>>> Tom
>>>>
>>>
>>> _______________________________________________
>>> amd-gfx mailing list
>>> amd-gfx at lists.freedesktop.org
>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>>
>
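
For anyone reading this thread later, the call chain Tom describes can be mimicked in user space. The following is only a sketch under stated assumptions: every `mock_*` name and the struct layout are hypothetical stand-ins, not the amdgpu or TTM definitions, and the `if (*bo_ptr)` guard is one plausible way to avoid the fault, not necessarily what the patch on the list does. It also assumes (going by the bisected commit message) that the create step returns 0 and leaves `*bo_ptr` NULL when asked for a zero-sized BO.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structs; layout is illustrative only. */
struct mock_tbo {
    unsigned int placement;
};

struct mock_bo {
    char other_fields[64];   /* padding so tbo sits at a non-zero offset */
    struct mock_tbo tbo;
    int reserved;
};

/*
 * Offset the inlined helpers would read from if handed a NULL bo:
 * NULL + offsetof(tbo) + offsetof(placement). Computed with offsetof,
 * so no NULL pointer is dereferenced here.
 */
static size_t faulting_offset_for_null_bo(void)
{
    return offsetof(struct mock_bo, tbo) + offsetof(struct mock_tbo, placement);
}

/* Assumption: a zero size is not an error, it just leaves *bo_ptr NULL. */
static int mock_create_reserved(size_t size, struct mock_bo **bo_ptr,
                                struct mock_bo *storage)
{
    if (size == 0) {
        *bo_ptr = NULL;
        return 0;
    }
    storage->reserved = 1;
    *bo_ptr = storage;
    return 0;
}

/* Models the unreserve step; the assert stands in for the kernel fault. */
static void mock_unreserve(struct mock_bo *bo)
{
    assert(bo != NULL);
    bo->reserved = 0;
}

/* Same shape as the function quoted in the thread, plus a NULL guard. */
static int mock_create_kernel(size_t size, struct mock_bo **bo_ptr,
                              struct mock_bo *storage)
{
    int r = mock_create_reserved(size, bo_ptr, storage);

    if (r)
        return r;

    if (*bo_ptr)             /* skip unreserve when nothing was created */
        mock_unreserve(*bo_ptr);

    return 0;
}
```

Calling `mock_create_kernel(0, &bo, &storage)` returns 0 with `bo` set to NULL and never reaches `mock_unreserve`; without the guard the assert would fire, which is the user-space analogue of the general protection fault in the log.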