On Sun, Jun 21, 2020 at 02:03:07AM -0400, Andrey Grodzovsky wrote:
> entity->rq becomes null after device unplugged so just return early
> in that case.
>
> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@xxxxxxx>

That looks very deep in amdgpu internals ... how do you even get in here
after the device is fully unplugged on the sw side? Is this amdkfd doing
something stupid because it is entirely unaware of what amdgpu has done?
Something else?

Just feels like this is just duct-taping over a more fundamental problem,
after hotunplug no one should be able to even submit anything new, or do
bo moves, or well anything really.
-Daniel

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c | 21 ++++++++++++++++-----
>  1 file changed, 16 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
> index 8d9c6fe..d252427 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
> @@ -24,6 +24,7 @@
>  #include "amdgpu_job.h"
>  #include "amdgpu_object.h"
>  #include "amdgpu_trace.h"
> +#include <drm/drm_drv.h>
>
>  #define AMDGPU_VM_SDMA_MIN_NUM_DW	256u
>  #define AMDGPU_VM_SDMA_MAX_NUM_DW	(16u * 1024u)
> @@ -94,7 +95,12 @@ static int amdgpu_vm_sdma_commit(struct amdgpu_vm_update_params *p,
>  	struct drm_sched_entity *entity;
>  	struct amdgpu_ring *ring;
>  	struct dma_fence *f;
> -	int r;
> +	int r, idx;
> +
> +	if (!drm_dev_enter(p->adev->ddev, &idx)) {
> +		r = -ENODEV;
> +		goto nodev;
> +	}
>
>  	entity = p->immediate ? &p->vm->immediate : &p->vm->delayed;
>  	ring = container_of(entity->rq->sched, struct amdgpu_ring, sched);
> @@ -104,7 +110,7 @@ static int amdgpu_vm_sdma_commit(struct amdgpu_vm_update_params *p,
>  	WARN_ON(ib->length_dw > p->num_dw_left);
>  	r = amdgpu_job_submit(p->job, entity, AMDGPU_FENCE_OWNER_VM, &f);
>  	if (r)
> -		goto error;
> +		goto job_fail;
>
>  	if (p->unlocked) {
>  		struct dma_fence *tmp = dma_fence_get(f);
> @@ -118,10 +124,15 @@ static int amdgpu_vm_sdma_commit(struct amdgpu_vm_update_params *p,
>  	if (fence && !p->immediate)
>  		swap(*fence, f);
>  	dma_fence_put(f);
> -	return 0;
>
> -error:
> -	amdgpu_job_free(p->job);
> +	r = 0;
> +
> +job_fail:
> +	drm_dev_exit(idx);
> +nodev:
> +	if (r)
> +		amdgpu_job_free(p->job);
> +
>  	return r;
>  }
>
> --
> 2.7.4
>

--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/dri-devel