On 1/3/2025 12:02 AM, Rob Clark wrote:
> From: Rob Clark <robdclark@xxxxxxxxxxxx>
>
> On mmu-500, stall-on-fault seems to stall all context banks, causing the
> GMU to misbehave. So limit this feature to smmu-v2 for now.
>
> This fixes an issue with an older mesa bug taking out the system
> because of GMU going off into the weeds.
>
> What we _think_ is happening is that, if the GPU generates 1000's of
> faults at ~once (which is something that GPUs can be good at), it can
> result in a sufficient number of stalled translations preventing other
> transactions from entering the same TBU.
>
> Signed-off-by: Rob Clark <robdclark@xxxxxxxxxxxx>

Reviewed-by: Akhil P Oommen <quic_akhilpo@xxxxxxxxxxx>

-Akhil

> ---
> v2: Adds a modparam to override the default behavior, for debugging
>     GPU faults in cases which do not (or might not) cause lockup.
>     Also, rebased to not depend on Bibek's PRR support.
>
>  drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c | 19 +++++++++++++++++--
>  1 file changed, 17 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
> index 6372f3e25c4b..3239bbf18514 100644
> --- a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
> @@ -16,6 +16,10 @@
>
>  #define QCOM_DUMMY_VAL -1
>
> +static int enable_stall = -1;
> +MODULE_PARM_DESC(enable_stall, "Enable stall on iova fault (1=on , 0=disable, -1=auto (default))");
> +module_param(enable_stall, int, 0600);
> +
>  static struct qcom_smmu *to_qcom_smmu(struct arm_smmu_device *smmu)
>  {
>  	return container_of(smmu, struct qcom_smmu, smmu);
> @@ -210,7 +214,9 @@ static bool qcom_adreno_can_do_ttbr1(struct arm_smmu_device *smmu)
>  static int qcom_adreno_smmu_init_context(struct arm_smmu_domain *smmu_domain,
>  		struct io_pgtable_cfg *pgtbl_cfg, struct device *dev)
>  {
> +	const struct device_node *np = smmu_domain->smmu->dev->of_node;
>  	struct adreno_smmu_priv *priv;
> +	bool stall_enabled;
>
>  	smmu_domain->cfg.flush_walk_prefer_tlbiasid = true;
>
> @@ -237,8 +243,17 @@ static int qcom_adreno_smmu_init_context(struct arm_smmu_domain *smmu_domain,
>  	priv->get_ttbr1_cfg = qcom_adreno_smmu_get_ttbr1_cfg;
>  	priv->set_ttbr0_cfg = qcom_adreno_smmu_set_ttbr0_cfg;
>  	priv->get_fault_info = qcom_adreno_smmu_get_fault_info;
> -	priv->set_stall = qcom_adreno_smmu_set_stall;
> -	priv->resume_translation = qcom_adreno_smmu_resume_translation;
> +
> +	if (enable_stall < 0) {
> +		stall_enabled = of_device_is_compatible(np, "qcom,smmu-v2");
> +	} else {
> +		stall_enabled = !!enable_stall;
> +	}
> +
> +	if (stall_enabled) {
> +		priv->set_stall = qcom_adreno_smmu_set_stall;
> +		priv->resume_translation = qcom_adreno_smmu_resume_translation;
> +	}
>
>  	return 0;
>  }
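
For anyone reading along, the tri-state handling of enable_stall in the last hunk can be sketched in plain userspace C. This is only an illustration under stated assumptions: resolve_stall() is a hypothetical helper that does not exist in the driver, and a single strcmp() stands in for of_device_is_compatible(), which in the kernel matches against the node's whole compatible list rather than one string.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Userspace sketch of the enable_stall resolution from the patch.
 * resolve_stall() is a hypothetical name used for illustration only.
 *
 * enable_stall < 0  : auto, stall only on "qcom,smmu-v2"
 * enable_stall == 0 : stall forced off
 * enable_stall > 0  : stall forced on, regardless of SMMU type
 */
static bool resolve_stall(int enable_stall, const char *compatible)
{
	if (enable_stall < 0) {
		/* Assumption: one compatible string instead of a DT node. */
		return strcmp(compatible, "qcom,smmu-v2") == 0;
	}

	/* Same effect as the !!enable_stall in the patch. */
	return enable_stall != 0;
}
```

With the default of -1, an mmu-500 based SMMU ends up with set_stall and resume_translation left unset, while an explicit 1 turns stalling back on for debugging, as the v2 note describes.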