On Wednesday, May 25, 2016 6:49:18 PM CEST Sricharan wrote: > Hi Arnd, > > >> Ok, so i was doing this from the idea that, other iommu drivers > >> where polling on a status bit in their sync call to ensure completion > >> of pending TLB invalidations. But in this case, there is no status bit. > >> So added a barrier to have no ordering issues before the client > >> triggers the dma operation. But as you say above that it is implicit that > >> the device would have a barrier before starting the trigger, then the > >> barrier here becomes redundant. > > > >Ok. There are two more things to note here: > > > >* On other platforms, polling the register is likely required because > > an MMIO write is "posted", meaning that a sync after writel() will > > only ensure that it has left the CPU write queue, but it may still be > > on the bus fabric and whatever side-effects are triggered by the > > write are normally not guaranteed to be completed even after the > > 'sync'. You need to check the datasheet for your IOMMU to find out > > whether the 'dsb' instruction actually has any effect on the IOMMU. > > If not, then neither the barrier that you add here nor the barrier > > in the following writel() is sufficient. > > > Thanks for the detailed explanation. > i will check this. So with this, i think that if the iommu does not > support polling for its status, then it should listen to 'dsb' otherwise > there will no be no way of make it coherent ? It's more likely that you have to do a readl() after the writel() to any address inside of the device in order to ensure the writel() has completed, but that is something that the hardware manual for the iommu should describe. > >* The one barrier that is normally required in an IOMMU is between > > updating the in-memory page tables and the following MMIO store > > that triggers the TLB flush for that entry. This barrier is > > implied by writel() but not writel_relaxed(). If you don't have > > a hardware page table walker in your IOMMU, you don't need to worry > > about this. > > > To get my understanding correct here, is the barrier required here because > of speculative fetch ? It doesn't have to be a speculative fetch, it could also be a normal fetch triggered by a DMA. The point is that the CPU is free to reorder the store to (io page table) memory relative to the store to the IOMMU flush register, and when you directly follow up with a store to the DMA master that triggers a transfer, this transfer can hit the IOMMU and cause the stale page table entry to be read. I guess in practice the barrier that comes before the MMIO write to the DMA master would ensure ordering of the DMA after both the IOPTE update and the MMIO flush (unless there was the speculative fetch you mentioned), but this is where I'm no longer confident enough in my pwn understanding of th ordering rules to rely on that. Having the barrier between the IOPTE update and the flush certainly leaves no room for ambiguity. Arnd -- To unsubscribe from this list: send the line "unsubscribe linux-arm-msm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html