Quoting Lucas De Marchi (2020-05-21 01:37:52)
> From: Fernando Pacheco <fernando.pacheco@xxxxxxxxx>
>
> The error detection and correction capability
> for GRF and instruction cache (IC) will utilize
> the new interrupt and error handling infrastructure
> for dgfx products. The GFX device can generate
> a number of classes of error under the new
> infrastructure: correctable, non-fatal, and
> fatal errors.
>
> The non-fatal and fatal error classes distinguish
> between levels of severity for uncorrectable errors.
> All ECC uncorrectable errors will be reported as
> fatal to produce the desired system response. Fatal
> errors are expected to route as PCIe error messages
> which should result in OS issuing a GFX device FLR.
> But the option exists to route fatal errors as
> interrupts.
>
> Driver will only handle logging of errors. Anything
> more will be handled at system level.
>
> For errors that will route as interrupts, three
> bits in the Master Interrupt Register will be used
> to convey the class of error.
>
> For each class of error:
> 1. Determine source of error (IP block) by reading
>    the Device Error Source Register (RW1C) that
>    corresponds to the class of error being serviced.
> 2. If the generating IP block is GT, read and log the
>    GT Error Register (RW1C) that corresponds to the
>    class of error being serviced. Non-GT errors will
>    be logged in aggregate for now.
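
The two-step flow described above can be sketched in standalone C. This is
only an illustration under stated assumptions: the registers are simulated
with plain arrays rather than MMIO, and every name used here (dev_err_src,
gt_err_stat, SRC_GT, read_and_clear, handle_hw_error) is hypothetical and
not taken from the patch or from Bspec. Locking is deliberately left out.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical names throughout: this models the flow, not real i915 registers. */
enum err_class { ERR_CORRECTABLE, ERR_NONFATAL, ERR_FATAL, ERR_CLASS_COUNT };

#define SRC_GT (1u << 0)	/* assumed bit: the error came from the GT block */

/* Simulated hardware state: one source and one GT error register per class. */
static uint32_t dev_err_src[ERR_CLASS_COUNT];
static uint32_t gt_err_stat[ERR_CLASS_COUNT];

/* Read an RW1C register and clear exactly the bits that were observed. */
static uint32_t read_and_clear(uint32_t *reg)
{
	uint32_t val = *reg;

	*reg &= ~val;	/* models the effect of writing the value back */
	return val;
}

static const char *err_class_str(enum err_class cls)
{
	switch (cls) {
	case ERR_CORRECTABLE:
		return "CORRECTABLE";
	case ERR_NONFATAL:
		return "NONFATAL";
	case ERR_FATAL:
		return "FATAL";
	default:
		return "UNKNOWN";
	}
}

/* Returns the GT error bits that were logged, or 0 if GT was not a source. */
uint32_t handle_hw_error(enum err_class cls)
{
	uint32_t src, gt = 0;

	assert(cls < ERR_CLASS_COUNT);

	/* Step 1: find the originating IP block(s) for this error class. */
	src = read_and_clear(&dev_err_src[cls]);

	/* Step 2: only GT errors get a detailed log entry. */
	if (src & SRC_GT) {
		gt = read_and_clear(&gt_err_stat[cls]);
		printf("%s GT error: 0x%08x\n", err_class_str(cls), (unsigned int)gt);
	}

	/* Everything else is reported in aggregate. */
	if (src & ~SRC_GT)
		printf("%s non-GT error(s): 0x%08x\n", err_class_str(cls),
		       (unsigned int)(src & ~SRC_GT));

	return gt;
}
```

A caller would invoke handle_hw_error() once per error class flagged in the
master interrupt register. The GT error register is touched only when the
source register names GT, so a stale GT status bit is left alone otherwise.
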
>
> Bspec: 50875
>
> Cc: Paulo Zanoni <paulo.r.zanoni@xxxxxxxxx>
> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@xxxxxxxxx>
> Cc: Fernando Pacheco <fernando.pacheco@xxxxxxxxx>
> Cc: Radhakrishna Sripada <radhakrishna.sripada@xxxxxxxxx>
> Signed-off-by: Fernando Pacheco <fernando.pacheco@xxxxxxxxx>
> Signed-off-by: Lucas De Marchi <lucas.demarchi@xxxxxxxxx>
> ---
>  drivers/gpu/drm/i915/i915_irq.c | 121 ++++++++++++++++++++++++++++++++
>  drivers/gpu/drm/i915/i915_reg.h |  28 ++++++++
>  2 files changed, 149 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> index ebc80e8b1599..17e679b910da 100644
> --- a/drivers/gpu/drm/i915/i915_irq.c
> +++ b/drivers/gpu/drm/i915/i915_irq.c
> @@ -2515,6 +2515,124 @@ static irqreturn_t gen8_irq_handler(int irq, void *arg)
>  	return IRQ_HANDLED;
>  }
>
> +static const char *
> +hardware_error_type_to_str(const enum hardware_error hw_err)
> +{
> +	switch (hw_err) {
> +	case HARDWARE_ERROR_CORRECTABLE:
> +		return "CORRECTABLE";
> +	case HARDWARE_ERROR_NONFATAL:
> +		return "NONFATAL";
> +	case HARDWARE_ERROR_FATAL:
> +		return "FATAL";
> +	default:
> +		return "UNKNOWN";
> +	}
> +}
> +
> +static void
> +gen12_gt_hw_error_handler(struct drm_i915_private * const i915,
> +			  const enum hardware_error hw_err)
> +{
> +	void __iomem * const regs = i915->uncore.regs;
> +	const char *hw_err_str = hardware_error_type_to_str(hw_err);
> +	u32 other_errors = ~(EU_GRF_ERROR | EU_IC_ERROR);
> +	u32 errstat;
> +
> +	lockdep_assert_held(&i915->irq_lock);

Wrong place and wrong locks.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx