On Fri, Jul 01, 2016 at 05:58:18PM +0100, Dave Gordon wrote: > On 30/06/16 13:49, Tvrtko Ursulin wrote: > > > >On 30/06/16 11:22, Chris Wilson wrote: > >>On Thu, Jun 30, 2016 at 09:50:20AM +0100, Tvrtko Ursulin wrote: > >>> > >>>On 30/06/16 02:35, Hong Liu wrote: > >>>>Return the allocated request pointer directly to remove > >>>>the double pointer parameter. > >>>> > >>>>Signed-off-by: Hong Liu <hong.liu@xxxxxxxxx> > >>>>--- > >>>> drivers/gpu/drm/i915/i915_gem.c | 25 +++++++------------------ > >>>> 1 file changed, 7 insertions(+), 18 deletions(-) > >>>> > >>>>diff --git a/drivers/gpu/drm/i915/i915_gem.c > >>>>b/drivers/gpu/drm/i915/i915_gem.c > >>>>index 1d98782..9881455 100644 > >>>>--- a/drivers/gpu/drm/i915/i915_gem.c > >>>>+++ b/drivers/gpu/drm/i915/i915_gem.c > >>>>@@ -2988,32 +2988,26 @@ void i915_gem_request_free(struct kref > >>>>*req_ref) > >>>> kmem_cache_free(req->i915->requests, req); > >>>> } > >>>> > >>>>-static inline int > >>>>+static inline struct drm_i915_gem_request * > >>>> __i915_gem_request_alloc(struct intel_engine_cs *engine, > >>>>- struct i915_gem_context *ctx, > >>>>- struct drm_i915_gem_request **req_out) > >>>>+ struct i915_gem_context *ctx) > >>>> { > >>>> struct drm_i915_private *dev_priv = engine->i915; > >>>> unsigned reset_counter = > >>>>i915_reset_counter(&dev_priv->gpu_error); > >>>> struct drm_i915_gem_request *req; > >>>> int ret; > >>>> > >>>>- if (!req_out) > >>>>- return -EINVAL; > >>>>- > >>>>- *req_out = NULL; > >>>>- > >>>> /* ABI: Before userspace accesses the GPU (e.g. execbuffer), > >>>>report > >>>> * EIO if the GPU is already wedged, or EAGAIN to drop the > >>>>struct_mutex > >>>> * and restart. > >>>> */ > >>>> ret = i915_gem_check_wedge(reset_counter, > >>>>dev_priv->mm.interruptible); > >>>> if (ret) > >>>>- return ret; > >>>>+ return ERR_PTR(ret); > >>>> > >>>> req = kmem_cache_zalloc(dev_priv->requests, GFP_KERNEL); > >>>> if (req == NULL) > >>>>- return -ENOMEM; > >>>>+ return ERR_PTR(-ENOMEM); > >>>> > >>>> ret = i915_gem_get_seqno(engine->i915, &req->seqno); > >>>> if (ret) > >>>>@@ -3041,14 +3035,13 @@ __i915_gem_request_alloc(struct > >>>>intel_engine_cs *engine, > >>>> if (ret) > >>>> goto err_ctx; > >>>> > >>>>- *req_out = req; > >>>>- return 0; > >>>>+ return req; > >>>> > >>>> err_ctx: > >>>> i915_gem_context_unreference(ctx); > >>>> err: > >>>> kmem_cache_free(dev_priv->requests, req); > >>>>- return ret; > >>>>+ return ERR_PTR(ret); > >>>> } > >>>> > >>>> /** > >>>>@@ -3067,13 +3060,9 @@ struct drm_i915_gem_request * > >>>> i915_gem_request_alloc(struct intel_engine_cs *engine, > >>>> struct i915_gem_context *ctx) > >>>> { > >>>>- struct drm_i915_gem_request *req; > >>>>- int err; > >>>>- > >>>> if (ctx == NULL) > >>>> ctx = engine->i915->kernel_context; > >>>>- err = __i915_gem_request_alloc(engine, ctx, &req); > >>>>- return err ? ERR_PTR(err) : req; > >>>>+ return __i915_gem_request_alloc(engine, ctx); > >>>> } > >>>> > >>>> struct drm_i915_gem_request * > >>>> > >>> > >>>Looks good to me. And have this feeling I've seen this somewhere before. > >> > >>Several times. This is not the full tidy, nor does it realise the > >>ramifactions of request alloc through the stack. > > > >Hm I can't spot that it is doing anything wrong or making anything > >worse. You don't want to let the small cleanup in? > > > >Regards, > >Tvrtko > > It ought to make almost no difference, because the *only* place the > inner function is called is from the outer one, which passes a > pointer to a local for the returned object; and the inner one is > then inlined, so the compiler doesn't actually put it on the stack > and call to the inner allocator anyway. > > Strangely, however, with this change the code becomes ~400 bytes bigger! > > Disassembly reveals that while the code for the externally-callable > outer function is indeed almost identical, a second copy of it has > also been inlined at the one callsite in this file: > > __i915_gem_object_sync() ... > req = i915_gem_request_alloc(to, NULL); > > I don't think that's a critical path and would rather have 400 bytes > smaller codespace. We can get that back by adding /noinline/ to the > outer function i915_gem_request_alloc() (not, of course, to the > inner one, that definitely *should* be inline). __i915_gem_object_sync() should not be calling i915_gem_request_alloc(). That's the issue with this patch, your patch and John's patch. -Chris -- Chris Wilson, Intel Open Source Technology Centre _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/intel-gfx