On 11/02/2022 20:27, alyssa.rosenzweig@xxxxxxxxxxxxx wrote: > From: Alyssa Rosenzweig <alyssa.rosenzweig@xxxxxxxxxxxxx> > > Some Valhall GPUs require resets when encountering bus faults due to > occlusion query writes. Add the issue bit for this and handle it. > > Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@xxxxxxxxxxxxx> Reviewed-by: Steven Price <steven.price@xxxxxxx> (although one nit below) Just in case any one is wondering - these bus faults occur when switching the GPU's MMU to unmapped - it's not a normal "bus fault" from the external bus. This is triggered by an attempt to read unmapped memory which is completed by the driver by switching the entire MMU to unmapped. > --- > drivers/gpu/drm/panfrost/panfrost_device.c | 9 +++++++-- > drivers/gpu/drm/panfrost/panfrost_issues.h | 4 ++++ > 2 files changed, 11 insertions(+), 2 deletions(-) > > diff --git a/drivers/gpu/drm/panfrost/panfrost_device.c b/drivers/gpu/drm/panfrost/panfrost_device.c > index 7f51a4682ccb..ee612303f076 100644 > --- a/drivers/gpu/drm/panfrost/panfrost_device.c > +++ b/drivers/gpu/drm/panfrost/panfrost_device.c > @@ -11,6 +11,7 @@ > #include "panfrost_device.h" > #include "panfrost_devfreq.h" > #include "panfrost_features.h" > +#include "panfrost_issues.h" > #include "panfrost_gpu.h" > #include "panfrost_job.h" > #include "panfrost_mmu.h" > @@ -380,9 +381,13 @@ const char *panfrost_exception_name(u32 exception_code) > bool panfrost_exception_needs_reset(const struct panfrost_device *pfdev, > u32 exception_code) > { > - /* Right now, none of the GPU we support need a reset, but this > - * might change. > + /* If an occlusion query write causes a bus fault on affected GPUs, > + * future fragment jobs may hang. Reset to workaround. > */ > + if (exception_code == DRM_PANFROST_EXCEPTION_JOB_BUS_FAULT) > + return panfrost_has_hw_issue(pfdev, HW_ISSUE_TTRX_3076); > + > + /* No other GPUs we support need a reset */ > return false; > } > > diff --git a/drivers/gpu/drm/panfrost/panfrost_issues.h b/drivers/gpu/drm/panfrost/panfrost_issues.h > index a66692663833..058f6a4c8435 100644 > --- a/drivers/gpu/drm/panfrost/panfrost_issues.h > +++ b/drivers/gpu/drm/panfrost/panfrost_issues.h > @@ -128,6 +128,10 @@ enum panfrost_hw_issue { > /* Must set SC_VAR_ALGORITHM */ > HW_ISSUE_TTRX_2968_TTRX_3162, > > + /* Bus fault from occlusion query write may cause future fragment jobs > + * to hang */ NIT: Kernel comment style has the "/*" and "*/" on lines by themselves for multi-line comments. checkpatch will complain! > + HW_ISSUE_TTRX_3076, > + > HW_ISSUE_END > }; >