Re: [PATCH] drm/i915: Add option to list load failure checkpoints

Quoting Michal Wajdeczko (2018-02-01 14:47:05)
> On Wed, 31 Jan 2018 21:48:42 +0100, Chris Wilson  
> <chris@xxxxxxxxxxxxxxxxxx> wrote:
> 
> > Quoting Michal Wajdeczko (2018-01-31 18:23:47)
> >> Our inject_load_failure functionality allows inserting one
> >> failure during driver load, but it is hard to guess which
> >> number should be passed as modparam to select a specific checkpoint.
> >>
> >> Use a negative number as an option to list all available failure
> >> checkpoints without triggering any failure.
> >
> > Hmm, it was only intended for use with the coupled igt test. Mind
> > expanding upon the use case you have? Could you not use that iterative
> > search for finding the injection value you want for repeated runs? For
> > the bisect case, do you not want to keep it iterating over all in case
> > the value changes? How stable do you want the modparam?
> 
> The iterative approach is good for the validation team to verify that
> all existing failure points are correctly handled (i.e. we don't cause
> a crash/panic), but it is less useful when you are interested in adding
> new checkpoints just for your code, as it requires both time and luck:
> you may be hit by an earlier checkpoint that no longer works ;)
> 
> Btw, IMHO this modparam should only be exposed in DEBUG config (as it
> introduces some code and text), and maybe we should also consider
> extending it to support more than one failure (as then it will be
> easy to check several code paths, like failure in fallback).

Sure, let's put it under IS_ENABLED(CONFIG_DRM_I915_DEBUG).
 
> This patch was trying to keep the definition of the modparam unchanged
> (the extra -1 value should not be noticed by any existing use case), but
> since it is purely a debug feature I'm not against making bigger changes
> here.

From my pov, it's very useful to define how you expect it to work, and
how to make it more useful than drv_module_reload and to keep it working
as you intend. A pure fault-counter doesn't seem very robust or
intuitive to me when you are trying to develop a new fault point. Otoh,
I don't mind if the developer has to fix all the earlier fails when
testing his (those fails are all our responsibility) -- or at least file
bugs so that awareness is raised.
-Chris
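
For readers following the thread, here is a minimal userspace sketch of the
fault-counter scheme under discussion: a positive value fails at the Nth
checkpoint, and a negative value only lists the checkpoints, as the patch
proposes. It is an illustration under stated assumptions, not the i915
code: every name in it (DEBUG_CHECKPOINTS, inject_failure_checkpoint,
INJECT_FAILURE, fake_driver_load) is hypothetical, printf stands in for
kernel logging, and the return codes are arbitrary.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical compile-time switch standing in for a debug-only config. */
#ifndef DEBUG_CHECKPOINTS
#define DEBUG_CHECKPOINTS 1
#endif

/*
 * Hypothetical stand-in for the module parameter:
 *   0  = injection disabled
 *   N  = fail at the Nth checkpoint reached during load (N > 0)
 *   <0 = list every checkpoint reached, never fail
 */
static int inject_load_failure;

/* Number of checkpoints reached so far during the current load. */
static int load_fail_count;

/* Returns true if the caller should simulate a failure here. */
static bool inject_failure_checkpoint(const char *func, int line)
{
	if (inject_load_failure == 0)
		return false;

	++load_fail_count;

	if (inject_load_failure < 0) {
		/* Listing mode: report the checkpoint number and location. */
		printf("checkpoint %d: %s:%d\n", load_fail_count, func, line);
		return false;
	}

	if (load_fail_count == inject_load_failure) {
		printf("injecting failure at checkpoint %d: %s:%d\n",
		       load_fail_count, func, line);
		return true;
	}

	return false;
}

#if DEBUG_CHECKPOINTS
#define INJECT_FAILURE() inject_failure_checkpoint(__func__, __LINE__)
#else
#define INJECT_FAILURE() false
#endif

/* Toy "driver load" with three checkpoints; error codes are arbitrary. */
static int fake_driver_load(void)
{
	load_fail_count = 0;

	if (INJECT_FAILURE())
		return -1;
	if (INJECT_FAILURE())
		return -2;
	if (INJECT_FAILURE())
		return -3;

	return 0;
}
```

Because the checkpoint number is just a running count, adding a checkpoint
earlier in the load path shifts the number of every later one, which is the
robustness concern raised above; listing mode makes the current numbering
visible without having to iterate.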
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx



