On Sat, 2011-11-19 at 11:07 +0100, Thomas Hellstrom wrote: > On 11/19/2011 01:26 AM, Ben Skeggs wrote: > > On Fri, 2011-11-18 at 23:48 +0100, Thomas Hellstrom wrote: > > > >> On 11/18/2011 06:26 PM, Ben Skeggs wrote: > >> > >>> On Fri, 2011-11-18 at 15:30 +0100, Thomas Hellstrom wrote: > >>> > >>> > >>>> On 11/18/2011 02:15 PM, Ben Skeggs wrote: > >>>> > >>>> > >>>>> On Fri, 2011-11-18 at 08:57 +0100, Thomas Hellstrom wrote: > >>>>> > >>>>> > >>>>> > >>>>>> Jerome, > >>>>>> > >>>>>> I don't like this change for the following reasons > >>>>>> > >>>>>> > >>>>>> > >>>>> -snip- > >>>>> > >>>>> > >>>>> > >>>>> > >>>>>>> One can take advantage of move notify callback but, there are > >>>>>>> corner case where bind/unbind might be call without move notify > >>>>>>> being call (in error path mostly). So to make sure that each > >>>>>>> virtual address space have a consistent view of wether a buffer > >>>>>>> object is backed or not by system page it's better to pass the > >>>>>>> buffer object to bind/unbind. > >>>>>>> > >>>>>>> > >>>>>>> > >>>>> As I discussed previously with Jerome on IRC, I believe the move_notify > >>>>> hook is sufficient. I fixed a couple of issues back with it back when I > >>>>> implemented support for this in nouveau. > >>>>> > >>>>> Jerome did point out two issues however, which I believe can be solved > >>>>> easily enough. > >>>>> > >>>>> The first was that the error path after move_notify() has been called > >>>>> doesn't reverse the move_notify() too, leaving incorrect page table > >>>>> entries. This, I think, could easily be solved by swapping bo->mem and > >>>>> mem, and re-calling move_notify() to have the driver reverse whatever it > >>>>> did. > >>>>> > >>>>> The second issue is that apparently move_notify() doesn't get called in > >>>>> the destroy path. Nouveau's GEM layer takes care of this for our > >>>>> userspace bos, and we don't use any kernel bos that are mapped into a > >>>>> channel's address space so we don't hit it. This can probably be solved > >>>>> easily enough too I expect. > >>>>> > >>>>> Ben. > >>>>> > >>>>> > >>>>> > >>>>> > >>>> OK. I understand. Surely if a move_notify is missing somewhere, that can > >>>> easily be fixed. > >>>> > >>>> It might be good if we can outline how the virtual tables are set up. In > >>>> my world, the following would happen: > >>>> > >>>> 1) Pre command submission: > >>>> > >>>> a) reserve bo > >>>> b) call ttm_bo_validate to set placement. move_notify will tear down any > >>>> existing GPU page tables for the bo. > >>>> c) Set up GPU page tables. > >>>> > >>>> > >>> Nouveau doesn't do this exactly. I wanted to avoid having to check if a > >>> bo was bound to a particular address space on every command submission. > >>> > >>> > >> It perhaps might be a good idea to revisit this? > >> I think using move_notify to set up a new placement before the data has > >> actually been moved is very fragile and not the intended use. That would > >> also save you from having to handle error paths. Also how do you handle > >> swapouts? > >> > > I don't see how it's fragile, there's only the one error path after that > > point that needs to undo it. And, what *is* the expected use then? I > > see it as "tell the driver the buffer is moving so it can do whatever is > > necessary as a result". > > > > Swapouts are a missed case though, indeed. > > > > > >> A quick check in c) that the virtual map hasn't been torn down by a > >> move_notify and, in that case, rebuild it would probably be next to > >> free. The virtual GPU mapping would then be valid only after validation > >> and while the bo is fenced or pinned. > >> > > This alone doesn't solve the swapouts issue either unless you're also > > wanting us to unmap once a buffer becomes unfenced, which would lead to > > loads of completely unnecessary map/unmaps. > > > > Ben. > > > > I think you misunderstand me a little. > The swapout issue should be handled by calling a move_notify() to kill > the virtual GPU mappings. > > However, when moving data that has page tables pointing to it, one > should (IMHO) do: > > 1) Kill the old page tables > 2) Move the object > 3) Set up new page tables on demand. > > This is done by TTM for CPU page tables and I think that should be done > also for GPU page tables. move_notify() should be responsible for 1), > The driver execbuf for 3), and 3) needs only to be done for the > particular ring / fifo which is executing commands that touch the buffer. > > This leaves a clear and well-defined requirement on TTM: > Notify the driver when it needs to kill its GPU page tables. Ok. This I don't really object to really. I read the previous mail as that GPU mappings should only exist as long as the buffer is fenced, which would be a ridiculous amount of overhead. > > With the latest patch from jerome: > Notify the driver when it needs to rebuild it page tables. Also on error > paths, but not for swapins because no driver will probably set up GPU > page tables to SYSTEM_MEMORY. > > This is what I mean with fragile, and I much rather see the other approach. > > Ben, I didn't fully understand why you didn't want to use that approach. > Did you see a significant overhead with it. Now I'm more clear on what you meant, no, it wouldn't be a lot of overhead. However, move_notify() was never intended for the use you're proposing now, or the new ttm_mem_reg would never have been being passed in as a parameter... Really though, I also don't see why move_notify() can't just be called in all the situations a placement change happens (moves/deletion/swapout/swapin, whatever) and let the driver do whatever it needs to as a result. That seems just as well specified as changing move_notify() to mean "remove all gpu mappings". I totally agree that there's currently a couple of cases we miss here, which should indeed be fixed. However, I'm fine if there's a strong preference for your proposal. Thanks, Ben. > > Thanks, > /Thomas > > > > > > > > > > _______________________________________________ dri-devel mailing list dri-devel@xxxxxxxxxxxxxxxxxxxxx http://lists.freedesktop.org/mailman/listinfo/dri-devel