Hi Nicolai,

yeah, that is a known issue.

You don't necessarily need to add all fences from the PD to the released BO, but immediately starting to clear the PTEs would be a good idea.

amdgpu_gem_object_close() should call amdgpu_vm_clear_freed() if the PD/PT are swapped in at that moment.

This leaves only a very small window where the application could access freed-up memory while the PTEs are being cleared.

If we even want to close that one, we could let amdgpu_vm_clear_freed() return the fence of the clear operation and add that to the BO in question.

Regards,
Christian.

On 22.03.2017 at 16:06, Nicolai Hähnle wrote:
> Hi all,
>
> there's a bit of a puzzle where I'm wondering whether there's a subtle
> bug in the amdgpu kernel module.
>
> Basically, the concern is that a buggy user space driver might trigger
> a sequence like this:
>
> 1. Submit a CS that accesses some BO _without_ adding that BO to the
> buffer list.
> 2. Free that BO.
> 3. Some other task re-uses the memory underlying the BO.
> 4. The CS is submitted to the hardware and accesses memory that is now
> already in use by somebody else, since there has been no update to the
> page tables to reflect the freed BO.
>
> Obviously there's a user space bug in step 1, but the kernel must
> still prevent the conflicting memory accesses, and I don't see where
> it does.
>
> amdgpu_gem_object_close takes a reservation of the BO and the page
> directory, but then simply backs off that reservation rather than
> adding a fence, which I suspect is necessary.
>
> I believe that whenever we remove a BO from a VM, we must
> unconditionally add the most recent page directory fence(?) to the BO.
> Does that sound right?
>
> Cheers,
> Nicolai
>
> _______________________________________________
> amd-gfx mailing list
> amd-gfx at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
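
To make the proposal concrete, here is a minimal user-space sketch of the idea: closing a BO kicks off the PTE clear right away (if the PD/PT are resident), and the fence of that clear is attached to the BO so its memory cannot be reused before the clear completes. This is only a toy model under stated assumptions. Every toy_* type and function is hypothetical and is not the real amdgpu code or its signatures; in particular, the real amdgpu_vm_clear_freed() does not necessarily return a fence today.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names exist in the amdgpu driver. */

#define TOY_MAX_FENCES 4

struct toy_fence {
    bool signaled;                 /* true once the work behind it is done */
};

struct toy_bo {
    bool mapped_in_vm;             /* PTEs still point at this BO's memory */
    struct toy_fence *fences[TOY_MAX_FENCES];
    size_t num_fences;
};

struct toy_vm {
    bool pd_resident;              /* PD/PT are swapped in right now */
    struct toy_fence clear_fence;  /* fence of the most recent PTE clear */
    struct toy_bo *pending_clear;  /* BO whose PTEs are being cleared */
};

/* Attach a fence that must signal before the BO's memory may be reused. */
static void toy_bo_add_fence(struct toy_bo *bo, struct toy_fence *f)
{
    assert(bo->num_fences < TOY_MAX_FENCES);
    bo->fences[bo->num_fences++] = f;
}

/* The memory may be handed to someone else only once all fences signaled. */
static bool toy_bo_memory_reusable(const struct toy_bo *bo)
{
    for (size_t i = 0; i < bo->num_fences; i++)
        if (!bo->fences[i]->signaled)
            return false;
    return true;
}

/*
 * Stand-in for the "clear freed mappings" step: queue the PTE clear and
 * return its fence, or NULL if the PD/PT are not resident.
 */
static struct toy_fence *toy_vm_clear_freed(struct toy_vm *vm,
                                            struct toy_bo *bo)
{
    if (!vm->pd_resident)
        return NULL;
    vm->clear_fence.signaled = false;
    vm->pending_clear = bo;
    return &vm->clear_fence;
}

/* Simulates the hardware finishing the queued PTE clear. */
static void toy_vm_finish_clear(struct toy_vm *vm)
{
    if (vm->pending_clear) {
        vm->pending_clear->mapped_in_vm = false;
        vm->pending_clear = NULL;
    }
    vm->clear_fence.signaled = true;
}

/*
 * Stand-in for the close path: start the clear immediately and, to close
 * the remaining window, add the clear fence to the BO being released.
 */
static void toy_gem_object_close(struct toy_vm *vm, struct toy_bo *bo)
{
    struct toy_fence *f = toy_vm_clear_freed(vm, bo);

    if (f)
        toy_bo_add_fence(bo, f);
}
```

In this model, after toy_gem_object_close() the BO is not reusable until toy_vm_finish_clear() has run, which is the window the fence is meant to close. If the PD/PT are not resident, no fence is added and the sketch does nothing extra, mirroring the "if the PD/PT are swapped in" condition above.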