Hi all,

I just sent out two patches that hopefully make the kernel module more
robust in the face of page table shadows being swapped out.

However, even with those patches, I can still fairly reliably reproduce
crashes with a backtrace of the shape

    amdgpu_cs_ioctl
     -> amdgpu_vm_update_page_directory
     -> amdgpu_ttm_bind
     -> amdgpu_gtt_mgr_alloc

The plausible reason for these crashes is that nothing seems to prevent
the shadow BOs from being moved between the calls to amdgpu_cs_validate
in amdgpu_cs_parser_bos and the calls to amdgpu_ttm_bind.

The attached patch has fixed these crashes for me so far, but it's very
heavy-handed: it collects all page table shadows and the page directory
shadow and adds them all to the reservations for the callers of
amdgpu_vm_update_page_directory.

I feel like there should be a better way. In part, I wonder why the
shadows are needed in the first place. I vaguely recall the discussions
about GPU reset and such, but I don't remember why the page tables can't
just be rebuilt in some other way.

Cheers,
Nicolai
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-drm-amd-amdgpu-reserve-shadows-of-page-directory-and.patch
Type: text/x-patch
Size: 9341 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/amd-gfx/attachments/20161212/c830099c/attachment.bin>