On Mon, Jan 03, 2022 at 11:48:25AM +0100, Christian König wrote:
> On 22.12.21 at 22:05, Daniel Vetter wrote:
> > On Tue, Dec 07, 2021 at 01:33:48PM +0100, Christian König wrote:
> > > This function allows to replace fences from the shared fence list when
> > > we can guarantee that the operation represented by the original fence has
> > > finished or no accesses to the resources protected by the dma_resv
> > > object any more when the new fence finishes.
> > >
> > > Then use this function in the amdkfd code when BOs are unmapped from the
> > > process.
> > >
> > > Signed-off-by: Christian König <christian.koenig@xxxxxxx>
> > > ---
> > >  drivers/dma-buf/dma-resv.c                    | 43 ++++++++++++++++
> > >  .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  | 49 +++----------------
> > >  include/linux/dma-resv.h                      |  2 +
> > >  3 files changed, 52 insertions(+), 42 deletions(-)
> > >
> > > diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
> > > index 4deea75c0b9c..a688dbded3d3 100644
> > > --- a/drivers/dma-buf/dma-resv.c
> > > +++ b/drivers/dma-buf/dma-resv.c
> > > @@ -284,6 +284,49 @@ void dma_resv_add_shared_fence(struct dma_resv *obj, struct dma_fence *fence)
> > >  }
> > >  EXPORT_SYMBOL(dma_resv_add_shared_fence);
> > > +/**
> > > + * dma_resv_replace_fences - replace fences in the dma_resv obj
> > > + * @obj: the reservation object
> > > + * @context: the context of the fences to replace
> > > + * @replacement: the new fence to use instead
> > > + *
> > > + * Replace fences with a specified context with a new fence. Only valid if the
> > > + * operation represented by the original fences is completed or has no longer
> > > + * access to the resources protected by the dma_resv object when the new fence
> > > + * completes.
> > > + */
> > > +void dma_resv_replace_fences(struct dma_resv *obj, uint64_t context,
> > > +			     struct dma_fence *replacement)
> > > +{
> > > +	struct dma_resv_list *list;
> > > +	struct dma_fence *old;
> > > +	unsigned int i;
> > > +
> > > +	dma_resv_assert_held(obj);
> > > +
> > > +	write_seqcount_begin(&obj->seq);
> > > +
> > > +	old = dma_resv_excl_fence(obj);
> > > +	if (old->context == context) {
> > > +		RCU_INIT_POINTER(obj->fence_excl, dma_fence_get(replacement));
> > > +		dma_fence_put(old);
> > > +	}
> > > +
> > > +	list = dma_resv_shared_list(obj);
> > > +	for (i = 0; list && i < list->shared_count; ++i) {
> > > +		old = rcu_dereference_protected(list->shared[i],
> > > +						dma_resv_held(obj));
> > > +		if (old->context != context)
> > > +			continue;
> > > +
> > > +		rcu_assign_pointer(list->shared[i], dma_fence_get(replacement));
> > > +		dma_fence_put(old);
> > Since the fences are all guaranteed to be from the same context, maybe we
> > should have a WARN_ON(__dma_fence_is_later()); here just to be safe?
>
> First of all happy new year!

Happy new year to you too! I'm also still catching up.

> Then to answer your question, no :)
>
> This here is the case where we replace a preemption fence with a VM page
> table update fence. So both fences are not from the same context.
>
> But since you ask that means that I somehow need to improve the
> documentation.

Hm yeah then I'm confused, since right above you have the context check.
And I thought if the contexts are equal, then the fences must be ordered,
and since you're adding a new one it must be a later fence.

But now you're saying this is to replace a fence with one from a totally
different context (which can totally make sense for the special fences
compute mode contexts all need), but then I honestly don't get why you
even check for the context.

Maybe more docs help explain what's going on, or maybe we should have the
is_later check only if the new fence is from the same context.
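A rough sketch of that conditional check, written as a small userspace
model rather than kernel code. The names model_fence,
model_fence_is_later() and model_replacement_ok() are made up for
illustration, and the plain seqno comparison is a simplifying assumption
that ignores wraparound handling:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical userspace stand-in for a fence: only the two fields the
 * ordering check needs. This is not the kernel's struct dma_fence.
 */
struct model_fence {
	uint64_t context;	/* fence timeline the fence belongs to */
	uint64_t seqno;		/* position on that timeline */
};

/*
 * True if @a comes strictly after @b. Only meaningful when both fences
 * are on the same timeline. Assumption: no seqno wraparound.
 */
static bool model_fence_is_later(const struct model_fence *a,
				 const struct model_fence *b)
{
	return a->seqno > b->seqno;
}

/*
 * Returns false only when the replacement is on the same timeline as
 * the fence it replaces but is not later than it. Fences on different
 * timelines cannot be compared, so that case is accepted here and the
 * ordering guarantee is left to the caller.
 */
static bool model_replacement_ok(const struct model_fence *old,
				 const struct model_fence *replacement)
{
	if (old->context != replacement->context)
		return true;	/* not checkable, driver's responsibility */

	return model_fence_is_later(replacement, old);
}
```

In the real function the failing case would presumably be wrapped in a
WARN_ON(). For a replacement on a different timeline nothing can be
checked, so the ordering guarantee has to come from the driver.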
amdkfd might not benefit, but this is a new generic interface and other
drivers might horrendously screw this up :-)

Plus then a big comment that if it's a different fence timeline context
the driver must guarantee that the new fence signals after anything we're
replacing here.

I think it might also be good to just include the specific amdkfd use case
with a short intro to what preempt-ctx and page table fences are, to
explain when this function is actually useful. It's definitely a very
special case function, and I'm worried driver authors might come up with
creative abuses for it that cause trouble.
-Daniel

>
> Regards,
> Christian.
>
> >
> > With that added:
> >
> > Reviewed-by: Daniel Vetter <daniel.vetter@xxxxxxxx>
> >
> > > +	}
> > > +
> > > +	write_seqcount_end(&obj->seq);
> > > +}
> > > +EXPORT_SYMBOL(dma_resv_replace_fences);
> > > +
> > >  /**
> > >   * dma_resv_add_excl_fence - Add an exclusive fence.
> > >   * @obj: the reservation object
> > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> > > index 71acd577803e..b558ef0f8c4a 100644
> > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> > > @@ -236,53 +236,18 @@ void amdgpu_amdkfd_release_notify(struct amdgpu_bo *bo)
> > >  static int amdgpu_amdkfd_remove_eviction_fence(struct amdgpu_bo *bo,
> > >  					struct amdgpu_amdkfd_fence *ef)
> > >  {
> > > -	struct dma_resv *resv = bo->tbo.base.resv;
> > > -	struct dma_resv_list *old, *new;
> > > -	unsigned int i, j, k;
> > > +	struct dma_fence *replacement;
> > >  	if (!ef)
> > >  		return -EINVAL;
> > > -	old = dma_resv_shared_list(resv);
> > > -	if (!old)
> > > -		return 0;
> > > -
> > > -	new = kmalloc(struct_size(new, shared, old->shared_max), GFP_KERNEL);
> > > -	if (!new)
> > > -		return -ENOMEM;
> > > -
> > > -	/* Go through all the shared fences in the resevation object and sort
> > > -	 * the interesting ones to the end of the list.
> > > +	/* TODO: Instead of block before we should use the fence of the page
> > > +	 * table update and TLB flush here directly.
> > >  	 */
> > > -	for (i = 0, j = old->shared_count, k = 0; i < old->shared_count; ++i) {
> > > -		struct dma_fence *f;
> > > -
> > > -		f = rcu_dereference_protected(old->shared[i],
> > > -					      dma_resv_held(resv));
> > > -
> > > -		if (f->context == ef->base.context)
> > > -			RCU_INIT_POINTER(new->shared[--j], f);
> > > -		else
> > > -			RCU_INIT_POINTER(new->shared[k++], f);
> > > -	}
> > > -	new->shared_max = old->shared_max;
> > > -	new->shared_count = k;
> > > -
> > > -	/* Install the new fence list, seqcount provides the barriers */
> > > -	write_seqcount_begin(&resv->seq);
> > > -	RCU_INIT_POINTER(resv->fence, new);
> > > -	write_seqcount_end(&resv->seq);
> > > -
> > > -	/* Drop the references to the removed fences or move them to ef_list */
> > > -	for (i = j; i < old->shared_count; ++i) {
> > > -		struct dma_fence *f;
> > > -
> > > -		f = rcu_dereference_protected(new->shared[i],
> > > -					      dma_resv_held(resv));
> > > -		dma_fence_put(f);
> > > -	}
> > > -	kfree_rcu(old, rcu);
> > > -
> > > +	replacement = dma_fence_get_stub();
> > > +	dma_resv_replace_fences(bo->tbo.base.resv, ef->base.context,
> > > +				replacement);
> > > +	dma_fence_put(replacement);
> > >  	return 0;
> > >  }
> > > diff --git a/include/linux/dma-resv.h b/include/linux/dma-resv.h
> > > index eebf04325b34..e0be34265eae 100644
> > > --- a/include/linux/dma-resv.h
> > > +++ b/include/linux/dma-resv.h
> > > @@ -457,6 +457,8 @@ void dma_resv_init(struct dma_resv *obj);
> > >  void dma_resv_fini(struct dma_resv *obj);
> > >  int dma_resv_reserve_shared(struct dma_resv *obj, unsigned int num_fences);
> > >  void dma_resv_add_shared_fence(struct dma_resv *obj, struct dma_fence *fence);
> > > +void dma_resv_replace_fences(struct dma_resv *obj, uint64_t context,
> > > +			     struct dma_fence *fence);
> > >  void dma_resv_add_excl_fence(struct dma_resv *obj, struct dma_fence *fence);
> > >  int dma_resv_get_fences(struct dma_resv *obj, struct dma_fence **pfence_excl,
> > >  			unsigned *pshared_count, struct dma_fence ***pshared);
> > > --
> > > 2.25.1
> > >
>

--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch