On Fri, Aug 30, 2024 at 01:40:19PM -0300, Jason Gunthorpe wrote: > On Fri, Aug 30, 2024 at 03:12:54PM +0000, Mostafa Saleh wrote: > > > + /* > > > + * If for some reason the HW does not support DMA coherency then using > > > + * S2FWB won't work. This will also disable nesting support. > > > + */ > > > + if (FIELD_GET(IDR3_FWB, reg) && > > > + (smmu->features & ARM_SMMU_FEAT_COHERENCY)) > > > + smmu->features |= ARM_SMMU_FEAT_S2FWB; > > I think that’s for the SMMU coherency which in theory is not related to the > > master which FWB overrides, so this check is not correct. > > Yes, I agree, in theory. > > However the driver today already links them together: > > case IOMMU_CAP_CACHE_COHERENCY: > /* Assume that a coherent TCU implies coherent TBUs */ > return master->smmu->features & ARM_SMMU_FEAT_COHERENCY; > > So this hunk was a continuation of that design. > > > What I meant in the previous thread that we should set FWB only for coherent > > masters as (in attach s2): > > if (smmu->features & ARM_SMMU_FEAT_S2FWB && dev_is_dma_coherent(master->dev) > > // set S2FWB in STE > > I think as I explained in that thread, it is not really correct > either. There is no reason to block using S2FWB for non-coherent > masters that are not used with VFIO. The page table will still place > the correct memattr according to the IOMMU_CACHE flag, S2FWB just > slightly changes the encoding. It’s not just the encoding that changes, as - Without FWB, stage-2 combine attributes - While with FWB, it overrides them. So a cacheable mapping in stage-2 can lead to a non-cacheable (or with different cachableitiy attributes) transaction based on the input. I am not sure though if there is such case in the kernel. Also, that logic doesn't only apply to VFIO, but also for stage-2 only SMMUs that use stage-2 for kernel DMA. > > For VFIO, non-coherent masters need to be blocked from VFIO entirely > and should never get even be allowed to get here. > > If anything should be changed then it would be the above > IOMMU_CAP_CACHE_COHERENCY test, and I don't know if > dev_is_dma_coherent() would be correct there, or if it should do some > ACPI inspection or what. I agree, I believe that this assumption is not accurate, I am not sure what is the right approach here, but in concept I think we shouldn’t enable FWB for non-coherent devices (using dev_is_dma_coherent() or other check) Thanks, Mostafa > > So let's drop the above hunk, it already happens implicitly because > VFIO checks it via IOMMU_CAP_CACHE_COHERENCY and it makes more sense > to put the assumption in one place. > > Thanks, > Jason