Re: [PATCH 1/2] drm/i915: introduce for_each_engine(), rework for_each_ring()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 29/12/15 19:18, Chris Wilson wrote:
On Tue, Dec 29, 2015 at 06:59:57PM +0000, Dave Gordon wrote:
In the discussions around commit 373701b1 (Jani Nikula)
     drm: fix potential dangling else problems in for_each_ macros
Daniel Vetter mooted the idea of a for_each_engine(ring, dev_priv)
macro that didn't use or rely on having an integer index variable.

So here it is; and, for backwards compatibility, an updated version
of for_each_ring() implemented using the same internals. As an example
of the use of this new macro, the functions in intel_uncore.c have
been converted to use it in preference to for_each_ring() wherever
possible. A subsequent patch will convert many of the uses in other
source files.

Cc: Daniel Vetter <daniel@xxxxxxxx>
Cc: Jani Nikula <jani.nikula@xxxxxxxxx>
Cc: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
Signed-off-by: Dave Gordon <david.s.gordon@xxxxxxxxx>
---
  drivers/gpu/drm/i915/i915_drv.h     | 32 ++++++++++++++++++++++++++++----
  drivers/gpu/drm/i915/intel_uncore.c |  5 ++---
  2 files changed, 30 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index cf7e0fc..a6457ee 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1968,10 +1968,34 @@ static inline struct drm_i915_private *guc_to_i915(struct intel_guc *guc)
  	return container_of(guc, struct drm_i915_private, guc);
  }

-/* Iterate over initialised rings */
-#define for_each_ring(ring__, dev_priv__, i__) \
-	for ((i__) = 0; (i__) < I915_NUM_RINGS; (i__)++) \
-		for_each_if ((((ring__) = &(dev_priv__)->ring[(i__)]), intel_ring_initialized((ring__))))
+/* Helpers for the for_each_* macros below */
+#define	__INIT_ENGINE_PTR(engine, dev_priv)				\
+	({								\
+	 	struct intel_engine_cs *ep = (dev_priv)->ring;		\
+		(engine) = --ep;					\
+	})
+#define	__NEXT_ACTIVE_ENGINE(engine, dev_priv, nr)			\
+	({								\
+		struct intel_engine_cs *end = &(dev_priv)->ring[nr];	\
+		struct intel_engine_cs *ep = (engine);			\
+		bool in_range;						\
+		do {							\
+			in_range = ++(ep) < end;			\
+		} while (in_range && !intel_ring_initialized(ep));	\
+	 	(engine) = ep;						\
+		in_range;						\
+	})
+
+/* Iterate over initialised engines */
+#define for_each_engine(engine, dev_priv)				\
+	for (__INIT_ENGINE_PTR(engine, dev_priv);			\
+	     __NEXT_ACTIVE_ENGINE(engine, dev_priv, I915_NUM_RINGS); )
+
+/* Backwards compatibility: iterate over initialised "rings" */
+#define for_each_ring(engine, dev_priv, ring_id)			\
+	for (__INIT_ENGINE_PTR(engine, dev_priv);			\
+	     __NEXT_ACTIVE_ENGINE(engine, dev_priv, I915_NUM_RINGS) &&	\
+	     ((ring_id) = (engine)->id, true); )

No, that is hard to read and bloats the kernel for no benefit. Even more
so compared against just

Well I agree that __NEXT_ACTIVE_ENGINE() takes a bit of thinking about, but at least it uses clearly named variables that tell you what they stand for, and can be formatted without any line breaks within statements, which IMHO makes it much EASIER to read than then mess of (parentheses), __underscores__, & random->punctuation that we had. And breaking the details out into two helpers makes the actual end-user macro so much clearer; after all, the complicated innards only have to be checked and verified once, here on the mailing list, and then everyone can see what the for_each_engine() macro is doing without having to worry about the GCC magic inside.

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 578f954dba1d..739569458bb7 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1967,8 +1967,8 @@ static inline struct drm_i915_private *guc_to_i915(struct intel_guc *guc)

  /* Iterate over initialised rings */
  #define for_each_ring(ring__, dev_priv__, i__) \
-       for ((i__) = 0; (i__) < I915_NUM_RINGS; (i__)++) \
-               for_each_if ((((ring__) = &(dev_priv__)->ring[(i__)]), intel_engine_initialized((ring__))))
+       for ((ring__) = &(dev_priv__)->ring[0]; (ring__) < &(dev_priv__)->ring[I915_NUM_RINGS]; (ring__)++) \
+               for_each_if (intel_engine_initialized((ring__)))

  enum hdmi_force_audio {
         HDMI_AUDIO_OFF_DVI = -2,        /* no aux data for HDMI-DVI converter */

(which is a net win over the current code, presumably solely due to
eliminating dead locals)
-Chris

Well actually it isn't.

First off, that doesn't even compile, it leaves lots of "variable used uninitialised" type errors in users of the macro, because those users actually require the index, which is why my patches converted only those that didn't.

With the above (with the addition of "i__ = 0" ... "(i__)++") used as the definition of for_each_ring(), and also (but without the parameter "i__") as the definition of for_each_engine(), plus my second patch that converts as many as possible, we find the following:

-rw-r--r-- 30866475 Jan 4 14:55 i915-drm.ko	# baseline
-rw-r--r-- 30871447 Jan 4 14:56 i915-cw.ko	# this version
-rw-r--r-- 30877524 Jan 4 14:56 i915-dsg.ko	# my version

so it looks like this increases the size, and mine increases it even more. But actually, when we look at CODE size rather than file size:

   text	   data	    bss	    dec	    hex	filename
1092416   24233     512 1117161  110be9 i915-drm.ko	# baseline
1092820	  24233	    512	1117565	 110d7d	i915-cw.ko	# this version
1092244	  24233	    512	1116989	 110b3d	i915-dsg.ko	# my version

we find just the opposite. This makes the code ~400b larger, whereas my patch made it ~200b smaller. I think it's because GCC knows how to alias variables with different scopes to the same registers, so caching intermediate results in very-local-variables is generally a win. It's definitely not "kernel bloat".

.Dave.
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/intel-gfx




[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]
  Powered by Linux