Re: [Intel-gfx] [PATCH] drm: Reduce EDID warnings from DRM_ERROR to DRM_NOTE

Daniel Vetter <daniel@xxxxxxxx> · Tue, 14 Feb 2017 23:23:11 +0100



On Tue, Feb 14, 2017 at 09:43:45PM +0000, Chris Wilson wrote:
> On Tue, Feb 14, 2017 at 10:36:09PM +0100, Daniel Vetter wrote:
> > On Mon, Feb 13, 2017 at 12:17:27PM -0500, Sean Paul wrote:
> > > On Mon, Feb 13, 2017 at 3:59 AM, Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> wrote:
> > > > On Mon, Feb 13, 2017 at 08:41:10AM +0100, Thierry Reding wrote:
> > > >> On Fri, Feb 10, 2017 at 07:59:13PM +0000, Chris Wilson wrote:
> > > >> > The warnings from parsing the EDID are not driver errors, but the
> > > >> > "normal but significant" conditions from the external device. As such,
> > > >> > they do not need the ferocity of an *ERROR*, but can use the less harsh
> > > >> > DRM_NOTE instead.
> > > >> >
> > > >> > Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> > > >> > ---
> > > >> >  drivers/gpu/drm/drm_edid.c | 15 ++++++++-------
> > > >> >  1 file changed, 8 insertions(+), 7 deletions(-)
> > > >>
> > > >> The below are all conditions that happen when the EDID is bad. I'm not
> > > >> sure that really qualifies as "normal".
> > > >
> > > > Often it is - a bad EDID on the monitor will always be bad. The
> > > > challenge is distinguishing that from silent data corruption during the
> > > > read - a reported read failure are trivial.
> > > >
> > > >> From a quick look through the code we don't always trigger an error from
> > > >> the below failure paths at higher levels, so decreasing the level here
> > > >> has the potential to let this kind of exceptional condition go
> > > >> unnoticed.
> > > >
> > > > The messages are not gone, they are higher than the default loglevel,
> > > > but now below the level at which they are printed to a terminal. The
> > > > bad EDID is either expected or recoverable, and definitely not fatal
> > > > so I don't think an *ERROR* is justified.
> > > 
> > > I tend to agree.
> > > 
> > > The description for the KERN_NOTICE level is "normal but significant
> > > condition". I might argue that the presence of these EDID messages
> > > represents a normal *or* significant condition (depending on why the
> > > EDID is bad), but I don't think it's unreasonable to expect people to
> > > check their logs if the display/mode is not working properly.
> > 
> > So for cases where we know that there is shit hw out there (specifically
> > kvm switches that mangle the cea block without adjusting the edid) we
> > already tune down the error to debug level. So in principle totally agree
> > with tuning down anything that happens because it's outside of our control
> > to info or debug, but do we still need this patch after the cea one has
> > landed? Our CI at least seems happy ...
> 
> Yes. The one machine with a dodgy EDID also happens to have a dodgy
> BIOS. This reduces the number of consistent errors to 1, but since an
> unrelated error still remains, CI doesn't detect the improvement.
> https://intel-gfx-ci.01.org/CI/CI_DRM_2198/fi-skl-6700k/igt@drv_module_reload@xxxxxxxxxxxxxxxxx

Ok, count my convinced, I pushed the patch to drm-misc-next.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/dri-devel