Am 11.09.2017 um 11:02 schrieb Monk Liu: > need to unreserve ttm bo before "cs_add_fence" and "entity_push_job" > otherwise there will be deadlock between "recover_vram_from_shadow" > and previous two routines on the ttm bo's resv lock. > > Change-Id: I9c18b677e474ce5b45a9cc24da3fa255d77b3d44 > Signed-off-by: Monk Liu <Monk.Liu at amd.com> Yeah, I was thinking about this as well. Problem is this won't work without further changes. > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 35 ++++++++++++++++++---------------- > 1 file changed, 19 insertions(+), 16 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c > index c1f1ae9..823ac12 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c > @@ -736,26 +736,12 @@ static int amdgpu_cs_sync_rings(struct amdgpu_cs_parser *p) > /** > * cs_parser_fini() - clean parser states > * @parser: parser structure holding parsing context. > - * @error: error number > - * > - * If error is set than unvalidate buffer, otherwise just free memory > - * used by parsing context. > **/ > -static void amdgpu_cs_parser_fini(struct amdgpu_cs_parser *parser, int error, bool backoff) > +static void amdgpu_cs_parser_fini(struct amdgpu_cs_parser *parser) > { > struct amdgpu_fpriv *fpriv = parser->filp->driver_priv; > unsigned i; > > - if (!error) { > - amdgpu_vm_move_pt_bos_in_lru(parser->adev, &fpriv->vm); > - > - ttm_eu_fence_buffer_objects(&parser->ticket, > - &parser->validated, > - parser->fence); > - } else if (backoff) { > - ttm_eu_backoff_reservation(&parser->ticket, > - &parser->validated); > - } > fence_put(parser->fence); > > if (parser->ctx) > @@ -1075,6 +1061,16 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p, > job->owner = p->filp; > job->fence_ctx = entity->fence_context; > p->fence = fence_get(&job->base.s_fence->finished); > + > + /* hook sched fence to all BOs' reservation in validated list > + * and unreserve them. > + * > + * we unreserve at here is because otherwise > + * there'll be deadlock between ctx_add_fence/sched_entity_push_job > + * and gpu_reset routine's recover_bo_from_shadow on PD/PTEs' ttm bo lock > + */ > + ttm_eu_fence_buffer_objects(&p->ticket, &p->validated, p->fence); The order of job->base.s_fence and preventing concurrent CS was guaranteed by the VM PD being reserved, that is now no longer the case and we need a new lock for protection. I suggest to add another lock to the context which we grab in amdgpu_ctx_get() and release in amdgpu_ctx_put(). Regards, Christian. > + > r = amdgpu_ctx_add_fence(p->ctx, ring, p->fence, &seq); > if (r) { > fence_put(p->fence); > @@ -1095,6 +1091,7 @@ int amdgpu_cs_ioctl(struct drm_device *dev, void *data, struct drm_file *filp) > { > struct amdgpu_device *adev = dev->dev_private; > union drm_amdgpu_cs *cs = data; > + struct amdgpu_fpriv *fpriv; > struct amdgpu_cs_parser parser = {}; > bool reserved_buffers = false; > int i, r; > @@ -1104,6 +1101,7 @@ int amdgpu_cs_ioctl(struct drm_device *dev, void *data, struct drm_file *filp) > > parser.adev = adev; > parser.filp = filp; > + fpriv = filp->driver_priv; > > r = amdgpu_cs_parser_init(&parser, data); > if (r) { > @@ -1141,7 +1139,12 @@ int amdgpu_cs_ioctl(struct drm_device *dev, void *data, struct drm_file *filp) > r = amdgpu_cs_submit(&parser, cs); > > out: > - amdgpu_cs_parser_fini(&parser, r, reserved_buffers); > + if (!r) > + amdgpu_vm_move_pt_bos_in_lru(parser.adev, &fpriv->vm); > + else if (reserved_buffers) > + ttm_eu_backoff_reservation(&parser.ticket, &parser.validated); > + > + amdgpu_cs_parser_fini(&parser); > return r; > } >