Am 12.12.2016 um 16:20 schrieb Nicolai Hähnle: > Hi all, > > I just sent out two patches that hopefully make the kernel module more > robust in the face of page table shadows being swapped out. > > However, even with those patches, I can still fairly reliably > reproduce crashes with a backtrace of the shape > > amdgpu_cs_ioctl > -> amdgpu_vm_update_page_directory > -> amdgpu_ttm_bind > -> amdgpu_gtt_mgr_alloc > > The plausible reason for these crashes is that nothing seems to > prevent the shadow BOs from being moved between the calls to > amdgpu_cs_validate in amdgpu_cs_parser_bos and the calls to > amdgpu_ttm_bind. The shadow BOs use the same reservation object than the real BOs. So as long as the real BOs can't be evicted the shadows can't be evicted either. > > The attached patch has fixed these crashes for me so far, but it's > very heavy-handed: it collects all page table shadows and the page > directory shadow and adds them all to the reservations for the callers > of amdgpu_vm_update_page_directory. That is most likely just a timing change, cause the shadows should end up in the duplicates list anyway. So the patch shouldn't have any effect. > > I feel like there should be a better way. In part, I wonder why the > shadows are needed in the first place. I vaguely recall the > discussions about GPU reset and such, but I don't remember why the > page tables can't just be rebuilt in some other way. It's just the simplest and fastest way to keep a copy of the page tables around. The problem with rebuilding the page tables from the mappings is that the housekeeping structures already have the future state when a reset happens, not the state we need to rebuild the tables. We could obviously change the housekeeping a bit to keep both states, but that would complicate mapping and unmapping of BOs significantly. Regards, Christian. > > Cheers, > Nicolai > > > _______________________________________________ > amd-gfx mailing list > amd-gfx at lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/amd-gfx -------------- next part -------------- An HTML attachment was scrubbed... URL: <https://lists.freedesktop.org/archives/amd-gfx/attachments/20161212/3ad9d3b1/attachment-0001.html>