On 13.10.2017 at 16:34, Michel Dänzer wrote:
> On 12/10/17 07:11 PM, Christian König wrote:
>> On 12.10.2017 at 18:49, Michel Dänzer wrote:
>>> On 12/10/17 01:00 PM, Michel Dänzer wrote:
>>>> [0] I also got this, but I don't know yet if it's related:
>>> No, that seems to be a separate issue; I can still reproduce it with the
>>> huge page related changes reverted. Unfortunately, it doesn't seem to
>>> happen reliably on every piglit run.
>> Can you enable KASAN in your kernel,
> KASAN caught something else at the beginning of piglit, see the attached
> dmesg excerpt. Not sure it's related though.
>
> amdgpu_job_free_cb+0x13d/0x160 decodes to:
>
> amd_sched_get_job_priority at .../drivers/gpu/drm/amd/amdgpu/../scheduler/gpu_scheduler.h:182
>
> static inline enum amd_sched_priority
> amd_sched_get_job_priority(struct amd_sched_job *job)
> {
> 	return (job->s_entity->rq - job->sched->sched_rq); <===
> }
>
> (inlined by) amdgpu_job_free_cb at .../drivers/gpu/drm/amd/amdgpu/amdgpu_job.c:107
>
> amdgpu_ring_priority_put(job->ring, amd_sched_get_job_priority(s_job));

Sounds a lot like the code Andres added is buggy somehow. Going to take
a look as well.

>> and please look up at which line number amdgpu_vm_bo_invalidate+0x88
>> is.
> Looks like it's this line:
>
> if (evicted && bo->tbo.resv == vm->root.base.bo->tbo.resv) {
>
> Maybe vm or vm->root.base.bo is NULL?

Ah, of course! We need to reserve the page directory root when we
release it, otherwise we can run into a race with somebody else trying
to evict it.

Going to send a patch in a minute,
Christian.