On Mon, Jul 28, 2014 at 01:44:12PM -0700, Jesse Barnes wrote:
> > +static void i915_request_retire(struct i915_gem_request *rq)
> >  {
> > -	list_del(&request->list);
> > -	i915_gem_request_remove_from_client(request);
> > +	rq->completed = true;
> > +
> > +	list_del(&rq->list);
> > +	i915_gem_request_remove_from_client(rq);
> >
> > -	if (request->ctx)
> > -		i915_gem_context_unreference(request->ctx);
> > +	if (rq->ctx) {
> > +		i915_gem_context_unreference(rq->ctx);
> > +		rq->ctx = NULL;
> > +	}
> >
> > -	kfree(request);
> > +	i915_request_put(rq);
> >  }
> 
> I wonder if we can somehow always use this function instead of having
> both obj and request retire functions. Maybe if we moved obj retire
> into the rendering sync functions we'd have a little less pseudo
> duplication.

I kept it as it was because I was trying to avoid increasing the number
of calls to ring->get_seqno. Elsewhere converting to a fence is good at
reducing the number of calls (as that completed:1 is likely to catch
most cases). But here I think we really do want to treat this en masse.

> request_retire does a put, where is the corresponding get so that we
> don't lose the object when the lock is dropped here or in the
> nonblocking wait? Or should request_retire not do a put?

Ownership flows from intel_ring_begin -> i915_add_request ->
retire_request. Everyone else has to grab their own references.

> > +static int
> > +i915_request_sync(struct i915_gem_request *rq,
> > +		  struct intel_engine_cs *to,
> > +		  struct drm_i915_gem_object *obj)
> > +{
> > +	int ret, idx;
> > +
> > +	if (to == NULL)
> > +		return i915_wait_request(rq);
> > +
> > +	/* XXX this is broken by VEBOX+ */
> 
> What does this comment mean? How does VEBOX break this? Does it not
> have semaphore support or something?

It's a very old comment. In theory it should have been fixed by the
recent gen8 semaphore work.

> > @@ -3038,44 +3203,35 @@ out:
> >   */
> >  int
> >  i915_gem_object_sync(struct drm_i915_gem_object *obj,
> > -		     struct intel_engine_cs *to)
> > +		     struct intel_engine_cs *to,
> > +		     bool readonly)
> >  {
> 
> It might be nice to have sync_read/sync_write functions instead, since
> looking up bool args or unnamed enums is a pain.

Not convinced, since it is used in a single location and two function
calls would look inelegant - but we could switch to
PROT_READ | PROT_WRITE throughout.

> >  static bool
> >  ring_idle(struct intel_engine_cs *ring, u32 seqno)
> >  {
> >  	return (list_empty(&ring->request_list) ||
> > -		i915_seqno_passed(seqno, ring_last_seqno(ring)));
> > +		__i915_seqno_passed(seqno, ring_last_seqno(ring)));
> >  }
> 
> Unrelated question: why do we have two checks here? Shouldn't the
> seqno check always be sufficient? Or is the list_empty check for the
> uninitialized case?

We can't test the seqno without testing that the pointer is valid -
that's the first check. And since we retire lazily, we cannot rely on
the request_list being empty by itself.
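For reference, a minimal sketch of the helper behind that first check -
ring_last_seqno() roughly as the i915_irq.c helper of this era looks,
written here against the patch's struct i915_gem_request - it
dereferences the last entry on request_list, which is why the
list_empty() guard has to come first:

static u32 ring_last_seqno(struct intel_engine_cs *ring)
{
	/* Peeks at the most recently emitted request.  Only valid when
	 * the caller has already checked that request_list is non-empty,
	 * hence the list_empty() test in ring_idle(). */
	return list_entry(ring->request_list.prev,
			  struct i915_gem_request, list)->seqno;
}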
> >  static int
> > -intel_ring_alloc_seqno(struct intel_engine_cs *ring)
> > +intel_ring_alloc_request(struct intel_engine_cs *ring)
> >  {
> > -	if (ring->outstanding_lazy_seqno)
> > -		return 0;
> > +	struct i915_gem_request *rq;
> > +	int ret;
> >
> > -	if (ring->preallocated_lazy_request == NULL) {
> > -		struct drm_i915_gem_request *request;
> > +	if (ring->preallocated_request)
> > +		return 0;
> >
> > -		request = kmalloc(sizeof(*request), GFP_KERNEL);
> > -		if (request == NULL)
> > -			return -ENOMEM;
> > +	rq = kmalloc(sizeof(*rq), GFP_KERNEL);
> > +	if (rq == NULL)
> > +		return -ENOMEM;
> >
> > -		ring->preallocated_lazy_request = request;
> > +	ret = i915_gem_get_seqno(ring->dev, &rq->seqno);
> > +	if (ret) {
> > +		kfree(rq);
> > +		return ret;
> >  	}
> >
> > -	return i915_gem_get_seqno(ring->dev, &ring->outstanding_lazy_seqno);
> > +	kref_init(&rq->kref);
> > +	rq->ring = ring;
> > +	rq->completed = false;
> > +
> > +	ring->preallocated_request = rq;
> > +	return 0;
> >  }
> 
> Makes me wonder if we should have our own slab for the objs. Might
> save a bit of mem and/or perf? But then could reduce our cache hit
> rate, dunno.

It's just a case of whether we are in a general slab or a named slab.
Having a named slab is nice for debugging, and we definitely have a
high enough reuse rate to justify our own slab. (A rough sketch is
appended at the end of this mail.)

Elsewhere some useful comments deleted to save electrons :-)
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
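A rough sketch of the named-slab variant discussed above (sketch only:
the request_cache name and the init/fini hooks are illustrative
assumptions, not part of the patch; relies on <linux/slab.h>):

static struct kmem_cache *request_cache;

static int i915_request_cache_init(void)
{
	/* A named cache shows up in /proc/slabinfo and under the slab
	 * debug options, which is the debugging benefit mentioned
	 * above.  KMEM_CACHE() sizes and names it after the struct. */
	request_cache = KMEM_CACHE(i915_gem_request, SLAB_HWCACHE_ALIGN);
	return request_cache ? 0 : -ENOMEM;
}

static void i915_request_cache_fini(void)
{
	kmem_cache_destroy(request_cache);
}

intel_ring_alloc_request() would then use

	rq = kmem_cache_alloc(request_cache, GFP_KERNEL);

in place of kmalloc(), with the matching kfree(rq) calls becoming
kmem_cache_free(request_cache, rq).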