Re: [Intel-gfx] [PATCH v2 4/5] drm/i915/guc: Add GuC CT specific debug print wrappers

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 12/1/2022 04:01, Tvrtko Ursulin wrote:
On 01/12/2022 11:56, Michal Wajdeczko wrote:
On 01.12.2022 01:41, John Harrison wrote:
On 11/23/2022 12:45, Michal Wajdeczko wrote:
On 23.11.2022 02:25, John Harrison wrote:
On 11/22/2022 09:54, Michal Wajdeczko wrote:
On 18.11.2022 02:58, John.C.Harrison@xxxxxxxxx wrote:
From: John Harrison <John.C.Harrison@xxxxxxxxx>

Re-work the existing GuC CT printers and extend as required to match
the new wrapping scheme.


[snip]...

      /**
     * DOC: CTB Blob
@@ -170,7 +188,7 @@ static int ct_control_enable(struct intel_guc_ct
*ct, bool enable)
        err = guc_action_control_ctb(ct_to_guc(ct), enable ?
                         GUC_CTB_CONTROL_ENABLE :
GUC_CTB_CONTROL_DISABLE);
        if (unlikely(err))
-        CT_PROBE_ERROR(ct, "Failed to control/%s CTB (%pe)\n",
+        ct_probe_error(ct, "Failed to control/%s CTB (%pe)\n",
                       str_enable_disable(enable), ERR_PTR(err));
btw, shouldn't we change all messages to start with lowercase ?

was:
      "CT0: Failed to control/%s CTB (%pe)"
is:
      "GT0: GuC CT Failed to control/%s CTB (%pe)"

unless we keep colon (as suggested by Tvrtko) as then:

      "GT0: GuC CT: Failed to control/%s CTB (%pe)"
Blanket added the colon makes it messy when a string actually wants to start with the prefix. The rule I've been using is lower case word when
the prefix was part of the string, upper case word when the prefix is
Hmm, I'm not sure that we should attempt to have such a flexible rule as we shouldn't rely too much on actual format of the prefix as it could be changed any time.  All we should know about final log message is that it _will_ properly identify the "GT" or "GuC" that this log is related to.

So I would suggest to be just consistent and probably always start with
upper case, as that seems to be mostly used in kernel error logs, and
just make sure that any prefix will honor that (by including colon, or
braces), so this will always work like:

"[drm] *ERROR* GT0: Failed to foo (-EIO)"
"[drm] *ERROR* GT0: GUC: Failed to foo (-EIO)"
"[drm] *ERROR* GT0: GUC: CT: Failed to foo (-EIO)"

or

"[drm] *ERROR* GT0: Failed to foo (-EIO)"
"[drm] *ERROR* GT0: [GUC] Failed to foo (-EIO)"
"[drm] *ERROR* GT0: [GUC] CT: Failed to foo (-EIO)"

and even for:

"[drm] *ERROR* GT(root) Failed to foo (-EIO)"
"[drm] *ERROR* GuC(media) Failed to foo (-EIO)"
"[drm] *ERROR* GT0 [GuC:CT] Failed to foo (-EIO)"
All of which are hideous/complex/verbose/inconsistent. 'GT0: GUC: CT:'?
Really? Or 'GT0: [GUC] CT:'? Why the random mix of separators? And how
would you implement '[GUC:CT]' without having a CT definition that is
entirely self contained and does chain on to the GuC level version?

you missed the point, as those were just examples of different possible
prefixes that one could define, to show that actual message shall not
make any assumption how such prefix will look like or how it will end
(like with or w/o colon, with "GuC" or "GT" tag or whatever)
The point is that none of those are ever likely to happen so are meaningless to prepare for.



This is pointless bikeshedding. If you want to re-write every single
debug print (yet again) and invent much more complicated macro

the opposite, I want clear understanding how messages should be written
to *avoid* rewriting them if (for some reason) we decide to change or
update the prefix in the future
You say that like there is any consistency or regulation at all on how messages are currently written.


definitions then feel free to take over the patch set. If not can we
just approve the v3 version and move on to doing some actual work?

if everyone is happy that there is inconsistency in use between gt_xxx
messages where we shall be using messages starting with upper case
(since prefix ends with colon) and guc/ct_xxx messages where we shall be
using lower case message (since there is a known prefix without colon,
either "GuC" or "CT") then I'll be also fine, but for now that bothers
me a little, hence asking for clarifications/agreement

I don't think anyone is happy with anything.

Personally, I don't like the idea of adding fixed prefixes for every subsystem just for the sake of doing so. Having a wrapper that abstracts out the messy structure mangling of getting from a 'ct' object back to a 'gt' object maybe has use. But with regards to fixed prefixes, all I ever wanted was to add GT# (because that adds useful information that is other completely lacking) and, ideally, a display prefix for all of the display prints (because they completely swamp all other output in CI dmesg logs, so being able to trivially filter them out would be incredibly useful). Beyond that, it seems like adding work and forced formatting just for the sake of it. If a print wants to say 'Error received on CT read channel' then why should it be forced to be 'GuC CT Error received on read channel' which is coded as 'Error received on read channel' ? To me, that seems less clear both in dmesg and in code.

And for the record, I didn't want to add the GT thing as message 'prefix' either. I would rather have just extended the PCI address that is printed by the dev level printer. But that would have been a much more invasive change at many levels outside of i915. As in, "i915 0000:00:02.0: [drm]" -> "i915 0000:00:02.0:GT0 [drm]".

And here we are a month down the road and arguing over whether a three level deep prefix should have a colon or braces around each level's forced message prefix!? Meanwhile, it is still impossible to know which GT produced a given error in a multi-GT system and the ability to filter out display messages has been completely nacked for reasons I don't understand at all before I even got to suggest it.

and while for dbg level messages it doesn't matter, I assume we should
be consistent for err/warn/info messages (as those will eventually show
up to the end user) so let maintainers decide here what is expectation here

Could we have some examples pasted here, of the end result of this series, for all message "categories" (origins, macros, whatever)?

GT initialisation:
gt_err(gt, "Failed to allocate scratch page\n");
i915 0000:00:02.0: [drm] GT0: Failed to allocate scratch page

G2H notification handler:
guc_err(guc, "notification: Invalid length %u for deregister done\n", len);
i915 0000:00:02.0: [drm] GT0: GuC notification: Invalid length 0 for deregister done

CTB initialisation:
ct_probe_error(ct, "Failed to control/%s CTB (%pe)\n", str_enable_disable(enable), ERR_PTR(err));
i915 0000:00:02.0: [drm] GT0: GuC CT Failed to control/enable CTB (EINVAL)

Random meaningless (to me) message that is apparently a display thing:
drm_dbg_kms(&dev_priv->drm, "disabling %s\n", pll->info->name);
i915 0000:00:02.0: [drm:intel_disable_shared_dpll [i915]] disabling PORT PLL B

I'm sure you can extrapolate to all other forms of dbg, notice, info, etc. without me having to manually type each one out, given that they are all identical.

Personally, I think the above should be just:
gt_err(gt, "Failed to allocate scratch page\n");
i915 0000:00:02.0: [drm] GT0: Failed to allocate scratch page

gt_err(guc_to_gt(guc), "G2H: Invalid length for deregister done: %u\n", len);
i915 0000:00:02.0: [drm] GT0: G2H: Invalid length for deregister done: 0

gt_probe_error(ct_to_gt(ct), "Failed to %s CT %d buffer (%pe)\n", str_enable_disable(enable), send ? "SEND" : "RECV", ERR_PTR(err));
i915 0000:00:02.0: [drm] GT0: Failed to enable CT SEND buffer (EINVAL)

drm_dbg_kms(&dev_priv->drm, "disabling %s\n", pll->info->name);
i915 0000:00:02.0: [drm:intel_disable_shared_dpll [i915-KMS]] disabling PORT PLL B But presumably that requires finishing the plan of splitting out the display code into a separate driver. So for now, something like this would still be a massive improvement:
kms_dbg(dev_priv, "disabling %s\n", pll->info->name);
i915 0000:00:02.0: [drm:intel_disable_shared_dpll [i915]] KMS: disabling PORT PLL B

John.


Regards,

Tvrtko




[Index of Archives]     [Linux DRI Users]     [Linux Intel Graphics]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [XFree86]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux