On 15/03/2019 18:40, Lucas De Marchi wrote:
On Fri, Mar 15, 2019 at 05:31:05PM +0000, Tvrtko Ursulin wrote:
On 15/03/2019 17:12, Lucas De Marchi wrote:
On Fri, Mar 15, 2019 at 12:26:33PM +0000, Tvrtko Ursulin wrote:
From: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
Concept of a sub-platform already exist in our code (like ULX and ULT
platform variants and similar),implemented via the macros which check a
list of device ids to determine a match.
With this patch we consolidate device ids checking into a single
function
called during early driver load.
A few low bits in the platform mask are reserved for sub-platform
identification and defined as a per-platform namespace.
At the same time it future proofs the platform_mask handling by
preparing
the code for easy extending, and tidies the very verbose WARN strings
generated when IS_PLATFORM macros are embedded into a WARN type
statements.
The approach is also beneficial to driver size, with an combined
shrink of
code and strings of around 1.7 kiB.
v2: Fixed IS_SUBPLATFORM. Updated commit msg.
v3: Chris was right, there is an ordering problem.
v4:
* Catch-up with new sub-platforms.
* Rebase for RUNTIME_INFO.
* Drop subplatform mask union tricks and convert platform_mask to an
array for extensibility.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
Suggested-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
Cc: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
Cc: Jani Nikula <jani.nikula@xxxxxxxxx>
Cc: Lucas De Marchi <lucas.demarchi@xxxxxxxxx>
Cc: Jose Souza <jose.souza@xxxxxxxxx>
---
drivers/gpu/drm/i915/i915_drv.c | 7 +-
drivers/gpu/drm/i915/i915_drv.h | 110 +++++++++++++++--------
drivers/gpu/drm/i915/i915_pci.c | 2 +-
drivers/gpu/drm/i915/intel_device_info.c | 79 ++++++++++++++++
drivers/gpu/drm/i915/intel_device_info.h | 28 +++++-
5 files changed, 179 insertions(+), 47 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_drv.c
b/drivers/gpu/drm/i915/i915_drv.c
index 0d743907e7bc..3218350cd225 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -863,6 +863,8 @@ static int i915_driver_init_early(struct
drm_i915_private *dev_priv)
if (i915_inject_load_failure())
return -ENODEV;
+ intel_device_info_subplatform_init(dev_priv);
+
spin_lock_init(&dev_priv->irq_lock);
spin_lock_init(&dev_priv->gpu_error.lock);
mutex_init(&dev_priv->backlight_lock);
@@ -1752,10 +1754,11 @@ static void i915_welcome_messages(struct
drm_i915_private *dev_priv)
if (drm_debug & DRM_UT_DRIVER) {
struct drm_printer p = drm_debug_printer("i915 device info:");
- drm_printf(&p, "pciid=0x%04x rev=0x%02x platform=%s gen=%i\n",
+ drm_printf(&p, "pciid=0x%04x rev=0x%02x platform=%s (%x)
gen=%i\n",
INTEL_DEVID(dev_priv),
INTEL_REVID(dev_priv),
intel_platform_name(INTEL_INFO(dev_priv)->platform),
+
RUNTIME_INFO(dev_priv)->platform_mask[INTEL_INFO(dev_priv)->platform
/ (BITS_PER_TYPE(RUNTIME_INFO(dev_priv)->platform_mask[0]) -
INTEL_SUBPLATFORM_BITS)],
bug here, INTEL_SUBPLATFORM_BITS should be outside of []. Bad things
will happen for platform=32 /o\
? [32 / (32 - 3)] = [1], for which there is a BUILD_BUG_ON with a
comment saying to increase size of array.
I think I missed a "(" and thought that would be (32 / 32) - 3
anyway, you are printing confusing information here since you only print
one u32.
Confusing how and what is the second u32? You mean when array gets
expanded in the future? Here the actual idea (see next version) is to
only print the subplatform bits.
INTEL_GEN(dev_priv));
intel_device_info_dump_flags(INTEL_INFO(dev_priv), &p);
@@ -1798,8 +1801,6 @@ i915_driver_create(struct pci_dev *pdev, const
struct pci_device_id *ent)
memcpy(device_info, match_info, sizeof(*device_info));
RUNTIME_INFO(i915)->device_id = pdev->device;
- BUILD_BUG_ON(INTEL_MAX_PLATFORMS >
- BITS_PER_TYPE(device_info->platform_mask));
BUG_ON(device_info->gen > BITS_PER_TYPE(device_info->gen_mask));
return i915;
diff --git a/drivers/gpu/drm/i915/i915_drv.h
b/drivers/gpu/drm/i915/i915_drv.h
index dccb6006aabf..34282cf66cb0 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2281,7 +2281,46 @@ static inline unsigned int
i915_sg_segment_size(void)
#define IS_REVID(p, since, until) \
(INTEL_REVID(p) >= (since) && INTEL_REVID(p) <= (until))
-#define IS_PLATFORM(dev_priv, p)
(INTEL_INFO(dev_priv)->platform_mask & BIT(p))
+#define __IS_PLATFORM(dev_priv, p) \
+({ \
+ const unsigned int pbits__ = \
+ BITS_PER_TYPE(RUNTIME_INFO(dev_priv)->platform_mask[0]) - \
+ INTEL_SUBPLATFORM_BITS; \
+ const unsigned int pi__ = (p) / pbits__; \
+ const unsigned int pb__ = (p) % pbits__ +
INTEL_SUBPLATFORM_BITS; \
+\
+ BUILD_BUG_ON(!__builtin_constant_p(p)); \
+\
+ (RUNTIME_INFO(dev_priv)->platform_mask[pi__] & BIT(pb__)); \
Ugh. That double dword fiddling is way too ugly. IMO it is not buying us
anything. Just use a u64 rather than the double dword. Your approach may
have a small benefit on ARCH=i386, but has the burden of carrying all
this forward. The diff below (only build-tested) is on top of yours,
which is basically equivalent to "move to u64 and then add the
subplatform part".
text data bss dec hex filename
1834620 40454 4176 1879250 1cacd2
drivers/gpu/drm/i915/i915.o.yours
1834710 40454 4176 1879340 1cad2c drivers/gpu/drm/i915/i915.o
The cost of going u64 would be higher than what you saw if bits above
were actually used I think. But would have to check the output to be
sure. It was at least a year ago I think I last played with this.
I challenge that. My butt feeling here is pretty strong that's not
true.
I hope you meant gut feeling. Anyways, I said I wasn't 100% sure of the
details, right? Having exercised my memory lanes I think the biggest
negative effect of the move to u64 was in conjunction with this
subplatform idea. This is due the "if (mask & platform) && (mask &
subplatform)" expanding to 2 u64 constants per check. I have a remedy
for that, but lets not get bogged down and code size which is not the
main consideration here. Idea was to consolidate and simplify whilst
preserving the two beneficial effects which I already explain.
Benefit of the u32 array approach is that it avoids that even on
64-bit builds.
my point is on 64-bit builds... It may suffer a little on m32.
Which is why I hinted the u32 array approach does not suffer on either.
No increase on 64-bit, no real increase on 32-bit.
Regards,
Tvrtko
Lucas De Marchi
As it stands v5 of my patch has minimal positive effect on code size
(sub 1k). Maybe a bit better in non-debug builds. But the main point
is about the devid checking consolidation.
It is of course open to discussion.
Regards,
Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx