Re: [PATCH 2/5] drm/nouveau: add reservation to nouveau_gem_ioctl_cpu_prep

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



op 21-01-14 18:44, Thomas Hellstrom schreef:
On 01/21/2014 04:29 PM, Maarten Lankhorst wrote:
Hey,

op 21-01-14 16:17, Thomas Hellstrom schreef:
Maarten, for this and the other patches in this series,

I seem to recall we have this discussion before?
IIRC I stated that reservation was a too heavy-weight lock to hold to
determine whether a buffer was idle? It's a pretty nasty thing to
build in.

I've sent this patch after determining that this already didn't end up
being heavyweight.
Most places were already using the fence_lock and reservation, I just
fixed up the few
places that didn't hold a reservation while waiting. Converting the
few places that didn't
ended up being trivial, so I thought I'd submit it.
Actually the only *valid* reason for holding a reservation when waiting
for idle is
1) You want to block further command submission on the buffer.
2) You want to switch GPU engine and don't have access to gpu semaphores
/ barriers.

Reservation has the nasty side effect that it blocks command submission
and pins the buffer (in addition now makes the evict list traversals
skip the buffer) which in general is *not* necessary for most wait
cases, so we should instead actually convert the wait cases that don't
fulfill 1) and 2) above in the other direction if we have performance
and latency-reduction in mind. I can't see how a spinlock protecting a
fence pointer or fence list is stopping you from using RW fences as long
as the spinlock is held while manipulating the fence list?

You wish. Fine I'll enumerate all cases of ttm_bo_wait (with the patchset, though) and enumerate if they can be changed to work without reservation or not.

ttm/ttm_bo.c
ttm_bo_cleanup_refs_or_queue: needs reservation and ttm_bo_wait to finish for the direct destroy fastpath, if either fails it needs to be queued. Cannot work without reservation.
ttm_bo_cleanup_refs_and_unlock: already drops reservation to wait, doesn't need to re-acquire. Simply reordering ttm_bo_wait until after re-reserve is enough.
ttm_bo_evict: already has the reservation, cannot be dropped since only trylock is allowed. Dropping reservation would cause badness, cannot be converted.
ttm_bo_move_buffer: called from ttm_bo_validate, cannot drop reservation for same reason as ttm_bo_evict. It might be part of a ticketed reservation so really don't drop lock here.
ttm_bo_synccpu_write_grab: the wait could be converted to be done afterwards, without  fence_lock. But in this case reservation could take the role of fence_lock too,
so no separate fence_lock would be needed.
ttm_bo_swapout: see ttm_bo_evict.

ttm/ttm_bo_util.c:
ttm_bo_move_accel_cleanup: calls ttm_bo_wait, cannot drop lock, see ttm_bo_move_buffer, can be called from that function.

ttm/ttm_bo_vm.c
ttm_bo_vm_fault_idle: I guess you COULD drop the reservation here, but you already had the reservation, so a similar optimization to ttm_bo_synccpu_write_grab could be done without requiring fence_lock.
If you would write it like that, you would end up with a patch similar to drm/nouveau: add reservation to nouveau_gem_ioctl_cpu_prep. I think we should do this, an

Ok, so the core does NOT need fence_lock because we can never drop reservations except in synccpu_write_grab and maybe ttm_bo_vm_fault_idle, but even in those cases reservation is done. So that could be used instead of fence_lock.

nouveau_gem_ioctl_cpu_prep:
Either block on a global spinlock or a local reservation lock. Doesn't matter much which, I don't need the need to keep a global lock for this function...
2 cases can happen in the trylock reservation failure case: buffer is not reserved, so it's not in the process of being evicted. buffer is reserved, which means it's being used in command submission right now, or in one of the functions described above (eg not idle).

nouveau_gem_pushbuf_reloc_apply:
has to call ttm_bo_wait with reservation, cannot be dropped.

So for core ttm and nouveau the fence_lock is never needed, radeon has only 1 function that calls ttm_bo_wait which uses a reservation too. It doesn't need the fence_lock either.

~Maarten

_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/dri-devel




[Index of Archives]     [Linux DRI Users]     [Linux Intel Graphics]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [XFree86]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux