On 21-11-12 14:27, Thomas Hellstrom wrote:
> On 11/21/2012 02:12 PM, Maarten Lankhorst wrote:
>> On 21-11-12 13:42, Thomas Hellstrom wrote:
>>> On 11/21/2012 12:38 PM, Maarten Lankhorst wrote:
>>>> Hey,
>>>>
>>>> On 20-11-12 16:08, Thomas Hellstrom wrote:
>>>>> On 11/20/2012 02:13 PM, Maarten Lankhorst wrote:
>>>>>> On 20-11-12 13:03, Thomas Hellstrom wrote:
>>>>>>> On 11/20/2012 12:33 PM, Maarten Lankhorst wrote:
>>>>>>>> On 20-11-12 08:48, Thomas Hellstrom wrote:
>>>>>>>>> On 11/19/2012 04:33 PM, Maarten Lankhorst wrote:
>>>>>>>>>> On 19-11-12 16:04, Thomas Hellstrom wrote:
>>>>>>>>>>> On 11/19/2012 03:17 PM, Thomas Hellstrom wrote:
>>>>>>>>>>>> Hi,
>>>>>>>>>>>>
>>>>>>>>>>>> This patch looks mostly good, although I think ttm_bo_cleanup_refs becomes overly complicated:
>>>>>>>>>>>> Could this do, or am I missing something?
>>>>>>>>>>>>
>>>>>>>>>>> Actually, my version is bad, because ttm_bo_wait() is called with the lru lock held.
>>>>>>>>>>>
>>>>>>>>>>> /Thomas
>>>>>>>>>> Oh, digging through it made me remember why I had to release the reservation early and
>>>>>>>>>> had to allow move_notify to be called without reservation.
>>>>>>>>>>
>>>>>>>>>> Fortunately move_notify gets a NULL parameter there, which is the only time that happens,
>>>>>>>>>> so you can still do BUG_ON(mem != NULL && !ttm_bo_reserved(bo)); in your
>>>>>>>>>> move_notify handler.
>>>>>>>>>>
>>>>>>>>>> 05/10 removed the loop and assumed no new fence could be attached after the driver has
>>>>>>>>>> declared the bo dead.
>>>>>>>>>>
>>>>>>>>>> However, at that point it may no longer hold a reservation to confirm this, which is why
>>>>>>>>>> I moved the cleanup to be done in the release_list handler. It could still be done in
>>>>>>>>>> ttm_bo_release, but we no longer have a reservation after we waited. Getting
>>>>>>>>>> a reservation can fail if the bo is imported, for example.
>>>>>>>>>>
>>>>>>>>>> While it is true that in that case a new fence may be attached as well, that
>>>>>>>>>> would be less harmful, since that operation wouldn't involve this device, so the
>>>>>>>>>> ttm bo can still be removed in that case. When that time comes I should probably
>>>>>>>>>> fix up that WARN_ON(ret) in ttm_bo_cleanup_refs. :-)
>>>>>>>>>>
>>>>>>>>>> I did add a WARN_ON(!atomic_read(&bo->kref.refcount)); to
>>>>>>>>>> ttm_bo_reserve and ttm_eu_reserve_buffers to be sure nothing is done on the device
>>>>>>>>>> itself. If that is too paranoid, those WARN_ONs could be dropped. I prefer to leave them
>>>>>>>>>> in for a kernel release or two. But according to the rules, that would be the only way you
>>>>>>>>>> could attach a new fence and trigger the WARN_ON for now.
>>>>>>>>> Hmm, I'd appreciate it if you could group patches with functional changes that depend on each other together,
>>>>>>>>> with a "this is done because ...", which makes it much easier to review (and to follow the commit history in case
>>>>>>>>> something goes terribly wrong and we need to revert).
>>>>>>>>>
>>>>>>>>> Meanwhile I'll take a look at the final ttm_bo.c and see if I can spot any culprits.
>>>>>>>>>
>>>>>>>>> In general, as long as a bo is on an LRU list, we must be able to attach fences because of accelerated eviction.
>>>>>>>> I thought it was deliberately designed in such a way that it was kept on the lru list,
>>>>>>>> but since it's also on the ddestroy list it won't start accelerated eviction:
>>>>>>>> it branches into cleanup_refs early, and lru_lock still protects all the list entries.
>>>>>>> I used bad wording. I meant that unbinding might be accelerated, but we currently (quite inefficiently)
>>>>>>> do synchronized unbinding, assuming that only the CPU can do that. When we start to support
>>>>>>> unsynchronized moves, we need to be able to attach fences at least at the last move_notify(bo, NULL).
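For reference, the check I described above would look roughly like this in a
driver's move_notify handler. This is only a hedged sketch: the "example_"
names are made up, and ttm_bo_reserved() is the helper as named earlier in
this thread, not necessarily the final name.

#include <drm/ttm/ttm_bo_api.h>
#include <drm/ttm/ttm_bo_driver.h>

/* Sketch of a driver move_notify callback with the reservation check. */
static void example_bo_move_notify(struct ttm_buffer_object *bo,
				   struct ttm_mem_reg *mem)
{
	/*
	 * mem == NULL means final destruction; that is the only case in
	 * which move_notify may be called without the bo reserved.
	 */
	BUG_ON(mem != NULL && !ttm_bo_reserved(bo));

	if (mem == NULL) {
		/*
		 * Tear down GPU mappings here. A driver doing
		 * unsynchronized moves could attach a fence at this
		 * point and let the normal delayed destroy wait on it.
		 */
	}
}
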
>>>>>> Would you need to wait in that case on fence_wait being completed before calling move_notify?
>>>>>>
>>>>>> If not, you would still only need to perform one wait, but you'd have to make sure move_notify only gets
>>>>>> called by one thread before checking the fence pointer and performing a wait. At that point you still hold the
>>>>>> lru_lock though, so it shouldn't be too hard to make something safe.
>>>>> I think typically a driver that wants to implement asynchronous moves doesn't want to wait before calling
>>>>> move_notify, but may wait in move_notify or move. Typically (upcoming vmwgfx) it would invalidate the buffer in move_notify(bo, NULL), attach a fence and then use the normal delayed destroy to wait on that fence before destroying the buffer.
>>>>>
>>>>> Otherwise, since binds / unbinds are handled in the GPU command stream, there's never any need to wait for moves except when there's CPU
>>>>> access.
>>>> Well, nouveau actually needs fence_wait to finish first, since vm changes are out of band.
>>>> But I guess it should be possible to attach it as work to the fence when it's signaled, and I
>>>> may want to do something like that already for performance reasons in a different place,
>>>> so I guess it doesn't matter.
>>> Actions to be performed on fence signaling tend to be very CPU-consuming, I think due to the context switches involved.
>>> We had to replace that in the old psb driver and batch things like TTM does instead.
>>>
>>> Also remember that TTM fences are not required to signal in finite time unless fence_flush is called.
>>>
>>> I think nouveau doesn't use fence irqs to signal its fences.
>>>
>>>> Is calling move_notify(bo, NULL) legal and a no-op the second time?
>>> I see no fundamental reason why it shouldn't be OK, although we might need to patch drivers to cope with it.
>>>
>>>> That would save a flag in the bo to check whether it's been called already,
>>>> although I suppose we could always define a TTM_BO_PRIV_FLAG_* for it otherwise.
>>>>
>>>> move_notify might end up being called with the lru_lock held, but that shouldn't be a problem.
>>> I don't think that's a good idea. Drivers sleeping in move_notify will need to release the spinlock, and that means it's
>>> better to release it before move_notify is called.
>> Is the only sleeping being done on fences? In that case we might wish to split it up into two pieces for destruction:
>> the piece that runs immediately, and a piece to run after the new fence has signaled (current behavior).
>>
>> Nouveau needs the final move_notify unmap to be called after the object is idle, like it is now. It doesn't need
>> to attach a new fence.
>
> In that case it might be best to worry about asynchronous stuff later?
> We will eventually implement it on the new vmwgfx hardware revision, but it's not ready yet.
>
> /Thomas

Ok, sounds good. In that case, what do you want me to change in the first 4 patches apart from more verbose commit messages?

- 03/10: I got that I need to re-add the list_empty check after -EBUSY was returned in evict_mem_first (rough sketch below).

Also, PATCH 05/10 cleans up the spinning in ttm_bo_cleanup_refs, so I hope it's OK that it's a bit ugly in 04/10, as long as it doesn't result in any new bugs being introduced.
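Roughly what I have in mind for the 03/10 fix, as a hedged sketch only; the
"example_" helpers are invented for illustration and the real function takes
different arguments:

#include <linux/list.h>
#include <linux/spinlock.h>
#include <drm/ttm/ttm_bo_driver.h>

/*
 * Hypothetical helper: reserves and evicts the first bo on the list.
 * It may drop and re-take glob->lru_lock, returning -EBUSY on contention.
 */
static int example_evict_first(struct ttm_bo_global *glob,
			       struct list_head *lru);

static int example_evict_mem_first(struct ttm_bo_global *glob,
				   struct list_head *lru)
{
	int ret;

	spin_lock(&glob->lru_lock);
	while (!list_empty(lru)) {
		ret = example_evict_first(glob, lru);
		if (ret == -EBUSY) {
			/*
			 * The lock was dropped, so another thread may have
			 * emptied the list; the loop re-checks list_empty.
			 */
			continue;
		}
		if (ret) {
			spin_unlock(&glob->lru_lock);
			return ret;
		}
	}
	spin_unlock(&glob->lru_lock);
	return 0;
}
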
~Maarten

PS: I did a plain rebase of my git tree to deal with the conflicts in drm-next.