Re: [PATCH 1/2] drm/i915/guc: Add error-capture init warnings when needed

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On 17/10/2022 18:46, Teres Alexis, Alan Previn wrote:
On Mon, 2022-10-17 at 09:42 +0100, Tvrtko Ursulin wrote:
On 15/10/2022 04:59, Alan Previn wrote:
If GuC is being used and we initialized GuC-error-capture,
we need to be warning if we don't provide an error-capture
register list in the firmware ADS, for valid GT engines.
A warning makes sense as this would impact debugability
without realizing why a reglist wasn't retrieved and reported
by GuC.>
However, depending on the platform, we might have certain
engines that have a register list for engine instance error state
but not for engine class. Thus, add a check only to warn if the
register list was non existent vs an empty list (use the
empty lists to skip the warning).

Signed-off-by: Alan Previn <alan.previn.teres.alexis@xxxxxxxxx>
---
   .../gpu/drm/i915/gt/uc/intel_guc_capture.c    | 55 ++++++++++++++++++-
   1 file changed, 53 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c
index 8f1165146013..290c1e1343dd 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c
@@ -419,6 +419,44 @@ guc_capture_get_device_reglist(struct intel_guc *guc)
   	return default_lists;
   }
+static const char *
+__stringify_type(u32 type)
+{
+	switch (type) {
+	case GUC_CAPTURE_LIST_TYPE_GLOBAL:
+		return "Global";
+	case GUC_CAPTURE_LIST_TYPE_ENGINE_CLASS:
+		return "Class";
+	case GUC_CAPTURE_LIST_TYPE_ENGINE_INSTANCE:
+		return "Instance";
+	default:
+		return "unknown";
+	}
+
+	return "";

What's the point of these returns? Gcc complains about not returning a type from const char * return function?

Sorry i am not sure I saw any complain for Gcc. If you are referring to "" then i can re-rev without the default and
just let the path outside return the unknown. Is that what you are referring to?

Yes, it is an unreachable path, handled by default switch branch already.

+}
+
+static const char *
+__stringify_engclass(u32 class)
+{
+	switch (class) {
+	case GUC_RENDER_CLASS:
+		return "Render";
+	case GUC_VIDEO_CLASS:
+		return "Video";
+	case GUC_VIDEOENHANCE_CLASS:
+		return "VideoEnhance";
+	case GUC_BLITTER_CLASS:
+		return "Blitter";
+	case GUC_COMPUTE_CLASS:
+		return "Compute";
+	default:
+		return "unknown";
+	}
+
+	return "";
+}
+
   static int
   guc_capture_list_init(struct intel_guc *guc, u32 owner, u32 type, u32 classid,
   		      struct guc_mmio_reg *ptr, u16 num_entries)
@@ -487,19 +525,32 @@ intel_guc_capture_getlistsize(struct intel_guc *guc, u32 owner, u32 type, u32 cl
   			      size_t *size)
   {
   	struct intel_guc_state_capture *gc = guc->capture;
+	struct drm_i915_private *i915 = guc_to_gt(guc)->i915;
   	struct __guc_capture_ads_cache *cache = &gc->ads_cache[owner][type][classid];
   	int num_regs;
- if (!gc->reglists)
+	if (!gc->reglists) {
+		drm_warn(&i915->drm, "GuC-capture: No reglist on this device\n");
   		return -ENODEV;
+	}
if (cache->is_valid) {
   		*size = cache->size;
   		return cache->status;
   	}
+ if (!guc_capture_get_one_list(gc->reglists, owner, type, classid)) {
+		if (owner == GUC_CAPTURE_LIST_INDEX_PF && type == GUC_CAPTURE_LIST_TYPE_GLOBAL)
+			drm_warn(&i915->drm, "GuC-capture: missing reglist type-Global\n");
+		if (owner == GUC_CAPTURE_LIST_INDEX_PF)

GUC_CAPTURE_LIST_INDEX_PF could be made once on the enclosing if statement?
Sure - will do.

Btw what's with the PF and VF (cover letter) references while SRIOV does not exists upstream?
To maintain a scalable code flow across both the ADS code and guc-error-capture code, we do have to skip over this enum
else we'll encounter lots of warnings about missing VF-reglist support (which we cant check for since we dont even have
support - i.e we dont even have a "is not supported" check) but the GuC firmware ADS buffer allocation includes an entry
for VFs so we have to skip over it. This is the cleanest way i can think of without impacting other code areas and also
while adding the ability to warn when its important.
+			drm_warn(&i915->drm, "GuC-capture: missing regiist type(%d)-%s : "

reglist
thanks - will fix

+				 "%s(%d)-Engine\n", type, __stringify_type(type),
+				 __stringify_engclass(classid), classid);

One details to consider from Documentation/process/coding-style.rst
"""
However, never break user-visible strings such as printk messages because that breaks the ability to grep for them.
"""

I totally agree with you but i cant find a way to keep totally grep-able way without creating a whole set of error
strings for the various list-types, list-owners and class-types. However i did ensure the first part of the message
is grep-able : "GuC-capture: missing reglist type". Do you an alternative proposal?

Yeah it is not very greppable being largely constructed at runtime, but just don't break the string. IMO no gain to diverge from coding style here.

Or maybe with one level of indentation less as discussed, and maybe remove "GuC-capture" if it can be implied (are there other reglists not relating to error capture?), maybe it becomes short enough?

"Missing GuC reglist %s(%u):%s(%u)!", ...

?

Type will never be unknown I suspect since it should always be added very early during development. So type and engine suffixes may be redundant? Or keep it verbose if that fits better with existing GuC error capture logging, I don't know.


Also commit message you can aim to wrap at 75 chars as per submitting-patches.rst.

+		return -ENODATA;

Is this a new exit condition or the thing would exit on the !num_regs check below anyway? Just wondering if the series is only about logging changes or there is more to it.
Its no different from previous behavior - and yes its about logging the missing reg lists for hw that needs it as we are
missing this for DG2 we we didn't notice (we did a previous revert to remove these warnings but that time the warnings
was too verbose - even complaining for the intentional empty lists and for VF cases that isnt supported).

Okay think I get it, thanks. If the "positive match" logging of empty list is more future proof than "negative - don't log these" you will know better.

Regards,

Tvrtko


+	}
+
   	num_regs = guc_cap_list_num_regs(gc, owner, type, classid);
-	if (!num_regs)
+	if (!num_regs) /* intentional empty lists can exist depending on hw config */
   		return -ENODATA;
*size = PAGE_ALIGN((sizeof(struct guc_debug_capture_list)) +

Regards,

Tvrtko




[Index of Archives]     [AMD Graphics]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux