On 12.12.2016 19:22, Christian König wrote: > Am 12.12.2016 um 16:20 schrieb Nicolai Hähnle: >> Hi all, >> >> I just sent out two patches that hopefully make the kernel module more >> robust in the face of page table shadows being swapped out. >> >> However, even with those patches, I can still fairly reliably >> reproduce crashes with a backtrace of the shape >> >> amdgpu_cs_ioctl >> -> amdgpu_vm_update_page_directory >> -> amdgpu_ttm_bind >> -> amdgpu_gtt_mgr_alloc >> >> The plausible reason for these crashes is that nothing seems to >> prevent the shadow BOs from being moved between the calls to >> amdgpu_cs_validate in amdgpu_cs_parser_bos and the calls to >> amdgpu_ttm_bind. > > The shadow BOs use the same reservation object than the real BOs. So as > long as the real BOs can't be evicted the shadows can't be evicted either. Ah okay. I don't see where the page table BOs are added to the buffer list. Do they also share the reservation object with the page directory? >> The attached patch has fixed these crashes for me so far, but it's >> very heavy-handed: it collects all page table shadows and the page >> directory shadow and adds them all to the reservations for the callers >> of amdgpu_vm_update_page_directory. > > That is most likely just a timing change, cause the shadows should end > up in the duplicates list anyway. So the patch shouldn't have any effect. Okay, so the reason for the remaining crash is still unclear at least for me. Cheers, Nicolai > >> >> I feel like there should be a better way. In part, I wonder why the >> shadows are needed in the first place. I vaguely recall the >> discussions about GPU reset and such, but I don't remember why the >> page tables can't just be rebuilt in some other way. > > It's just the simplest and fastest way to keep a copy of the page tables > around. > > The problem with rebuilding the page tables from the mappings is that > the housekeeping structures already have the future state when a reset > happens, not the state we need to rebuild the tables. > > We could obviously change the housekeeping a bit to keep both states, > but that would complicate mapping and unmapping of BOs significantly. > > Regards, > Christian. > >> >> Cheers, >> Nicolai >> >> >> _______________________________________________ >> amd-gfx mailing list >> amd-gfx at lists.freedesktop.org >> https://lists.freedesktop.org/mailman/listinfo/amd-gfx > >