Re: [PATCH] drm/i915: Include GuC fw version in error state

On 2/24/2017 4:19 PM, Chris Wilson wrote:
On Fri, Feb 24, 2017 at 11:43:32AM +0100, Michal Wajdeczko wrote:
On Fri, Feb 24, 2017 at 09:13:29AM +0000, Chris Wilson wrote:
On Fri, Feb 24, 2017 at 09:13:05AM +0530, Kamble, Sagar A wrote:
    Reviewed-by: Sagar Arun Kamble <sagar.a.kamble@xxxxxxxxx>

    On 2/24/2017 4:41 AM, Michel Thierry wrote:

  There was no way to check if the platform is running the latest firmware.

  Cc: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
  Cc: Arkadiusz Hiler <arkadiusz.hiler@xxxxxxxxx>
  Signed-off-by: Michel Thierry <michel.thierry@xxxxxxxxx>
  ---
   drivers/gpu/drm/i915/i915_gpu_error.c | 10 ++++++++++
   1 file changed, 10 insertions(+)

  diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
  index 2b1d15668192..e022187916ee 100644
  --- a/drivers/gpu/drm/i915/i915_gpu_error.c
  +++ b/drivers/gpu/drm/i915/i915_gpu_error.c
  @@ -632,6 +632,16 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m,
                             CSR_VERSION_MINOR(csr->version));
          }

  +       if (HAS_GUC_UCODE(dev_priv)) {
  +               struct intel_uc_fw *guc_fw = &dev_priv->guc.fw;
  +
  +               err_printf(m, "GuC loaded: %s\n",
  +                          yesno(guc_fw->load_status ==
  +                                INTEL_UC_FIRMWARE_SUCCESS));
  +               err_printf(m, "GuC fw version: %d.%d\n",
  +                          guc_fw->major_ver_found, guc_fw->minor_ver_found);
  +       }
  +
Hmm. The firmware may change between the hang and cat
/sys/class/drm/card0/error (as it will be reloaded after the reset).

Btw, maybe we should add a counter that is incremented on each fw reload
and reported here?

If it occurs to you that we need it for post-mortem debugging and having
it is worth more than any potential confusion....

I can see the need for knowing what guc/huc/dmc/etc was running at the
time of a hang - I just hope that what was previously running before an
earlier reset doesn't contribute. But that's why we focus on the first
error in a system...
-Chris

The GT reset count is already present in the error state. The GuC kernel
parameters are present as well, and this change will help us identify which
firmware issue was encountered.
So I feel printing ver_found should be enough.
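
To make the trade-off discussed above concrete, here is a minimal user-space
sketch. It is not i915 code: struct fw_state, struct fw_snapshot, fw_load(),
fw_capture() and the reload_count field are all hypothetical names made up
for illustration. It only shows the idea that copying the version (and a
reload counter) when the error is captured keeps the report stable even if
the firmware is reloaded before the error file is read.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the live firmware state kept by a driver. */
struct fw_state {
	int loaded;                /* nonzero once a load has succeeded */
	unsigned int major_ver;
	unsigned int minor_ver;
	unsigned int reload_count; /* hypothetical: bumped on every (re)load */
};

/* Hypothetical copy of that state, taken when the error is captured. */
struct fw_snapshot {
	int loaded;
	unsigned int major_ver;
	unsigned int minor_ver;
	unsigned int reload_count;
};

/* Model a successful firmware (re)load, e.g. after a reset. */
static void fw_load(struct fw_state *fw, unsigned int major, unsigned int minor)
{
	assert(fw != NULL);
	fw->major_ver = major;
	fw->minor_ver = minor;
	fw->loaded = 1;
	fw->reload_count++;
}

/* Copy the live state at capture time so later reloads cannot alter it. */
static void fw_capture(const struct fw_state *fw, struct fw_snapshot *snap)
{
	assert(fw != NULL && snap != NULL);
	snap->loaded = fw->loaded;
	snap->major_ver = fw->major_ver;
	snap->minor_ver = fw->minor_ver;
	snap->reload_count = fw->reload_count;
}

/* Format the snapshot (not the live state) into buf; returns snprintf's result. */
static int fw_snapshot_to_str(const struct fw_snapshot *snap, char *buf, size_t len)
{
	assert(snap != NULL && buf != NULL);
	return snprintf(buf, len,
			"GuC loaded: %s\nGuC fw version: %u.%u\nGuC fw reloads: %u\n",
			snap->loaded ? "yes" : "no",
			snap->major_ver, snap->minor_ver,
			snap->reload_count);
}
```

With this model, capturing after fw_load(&fw, 6, 1) and then calling
fw_load(&fw, 9, 14) leaves the snapshot reporting 6.1 with one reload, while
the live state has moved on to 9.14 with two. Whether the extra counter is
worth carrying in the real error state is exactly the question raised above.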
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx