Hi Jerry,

On 23.03.2017 03:26, Zhang, Jerry (Junwei) wrote:
> On 03/22/2017 11:06 PM, Nicolai Hähnle wrote:
>> Hi all,
>>
>> there's a bit of a puzzle where I'm wondering whether there's a subtle
>> bug in the amdgpu kernel module.
>>
>> Basically, the concern is that a buggy user space driver might trigger a
>> sequence like this:
>>
>> 1. Submit a CS that accesses some BO _without_ adding that BO to the
>> buffer list.
>> 2. Free that BO.
>
> The user space should call unmap when free a BO, as my understanding.
> In this case, it will call amdgpu_gem_va_update_vm() to clear the PTE
> related to the BO.
> Right?
>
> Or you just imagine this scenery that there is no unmap?

I'm thinking of the scenario without an unmap, i.e. broken or malicious
user space. I haven't looked into the unmap case yet, but I will.

I have a WIP patch for this and will give it a proper test drive later.

Cheers,
Nicolai

>
> Jerry
>
>> 3. Some other task re-uses the memory underlying the BO.
>> 4. The CS is submitted to the hardware and accesses memory that is now
>> already in use by somebody else, since there has been no update to the
>> page tables to reflect the freed BO.
>>
>> Obviously there's a user space bug in step 1, but the kernel must still
>> prevent the conflicting memory accesses, and I don't see where it does.
>>
>> amdgpu_gem_object_close takes a reservation of the BO and the page
>> directory, but then simply backs off that reservation rather than adding
>> a fence, which I suspect is necessary.
>>
>> I believe that whenever we remove a BO from a VM, we must unconditionally
>> add the most recent page directory fence(?) to the BO. Does that sound
>> right?
>>
>> Cheers,
>> Nicolai
>>
>> _______________________________________________
>> amd-gfx mailing list
>> amd-gfx at lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx

--
Learn how the world really is,
but never forget how it ought to be.