On Mon, Nov 07, 2016 at 07:19:24AM -0500, Rob Clark wrote:
> On Mon, Nov 7, 2016 at 3:35 AM, Archit Taneja <architt@xxxxxxxxxxxxxx> wrote:
> >
> > On 11/06/2016 07:45 PM, Rob Clark wrote:
> >>
> >> On Fri, Nov 4, 2016 at 6:44 PM, Jordan Crouse <jcrouse@xxxxxxxxxxxxxx> wrote:
> >>>
> >>> For reasons that are not entirely understood, using dma_map_sg()
> >>> for nocache/write-combine buffers doesn't always successfully flush
> >>> the cache after the memory is zeroed somewhere deep in the bowels
> >>> of the shmem code. My working theory is that the cache flush on
> >>> the swiotlb bounce buffer address isn't always flushing what
> >>> we need.
> >>>
> >>> Instead of using dma_map_sg() directly, kmap and flush each page
> >>> at allocation time. We could use invalidate + clean, or just
> >>> invalidate if we wanted to, but on ARM64 using a flush is safer and
> >>> not much slower for what we are trying to do.
> >>>
> >>> Hopefully someday I'll understand more clearly the relationship
> >>> between shmem, kmap, vmap and the swiotlb bounce buffer, and we can
> >>> be smarter about when and how we invalidate the caches.
> >>
> >> Like I mentioned on irc, we definitely don't want to ever hit bounce
> >> buffers. I think the problem here is that dma-mapping assumes we only
> >> support 32-bit physical addresses, which is not true (at least as long
> >> as we have the iommu)..
> >>
> >> Archit hit a similar problem on the display side of things.
> >
> > Yeah, the shmem-allocated pages sometimes ended up at 33-bit addresses
> > on db820c. The msm driver sets the dma mask to a default of 32 bits.
> > The dma mapping api gets unhappy whenever we get sg chunks with 33-bit
> > addresses and tries to use swiotlb for them. We eventually end up
> > overflowing the swiotlb.
> >
> > Setting the mask to 33 bits worked as a temporary hack.
>
> Actually I think setting the mask is the correct thing to do, but we
> should probably use the dma_set_* functions.. and I suspect that the PA
> width for the iommu is larger than 33 bits. It is probably equal to the
> number of address lines that are wired up. Not quite sure what that is,
> but I don't think there are any devices where the iommu cannot map some
> physical pages.

The GPU/MMU combo has supported 48-bit PAs since at least the 4XX era. I'm
not sure whether a 48-bit mask would break the older targets or not - you
won't be getting any addressable memory > 32 bits on those devices anyway.
I'll try the dma_set_mask trick with 48 bits and see how it pans out.

Jordan

--
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project
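
A minimal sketch of the dma_set_mask approach discussed above, assuming the
GPU's struct device is at hand; the helper name msm_raise_dma_mask, the
call site, and the hard-coded 48-bit width are assumptions for illustration,
not the msm driver's actual code:

    #include <linux/dma-mapping.h>

    /*
     * Sketch: raise the streaming/coherent DMA masks so that pages above
     * 4G are treated as addressable by the device and are never bounced
     * through swiotlb.  48 bits matches the PA width assumed in this
     * thread; older targets may need a smaller mask if a wider one turns
     * out to break them.
     */
    static int msm_raise_dma_mask(struct device *dev)
    {
            int ret;

            ret = dma_set_mask_and_coherent(dev, DMA_BIT_MASK(48));
            if (ret)
                    /* Fall back to the existing 32-bit default. */
                    ret = dma_set_mask_and_coherent(dev, DMA_BIT_MASK(32));

            return ret;
    }

In practice the width would probably be better taken from the iommu's
reported output address size than hard-coded, per Rob's point that the real
limit is however many address lines are wired up.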