On Tue, May 20, 2014 at 12:08:58PM +0100, Arnd Bergmann wrote:
> On Tuesday 20 May 2014 12:02:46 Catalin Marinas wrote:
> > On Mon, May 19, 2014 at 05:55:56PM +0100, Arnd Bergmann wrote:
> > > On Monday 19 May 2014 16:56:08 Catalin Marinas wrote:
> > > > On Mon, May 19, 2014 at 10:44:51AM +0100, Arnd Bergmann wrote:
> > > > > On Monday 19 May 2014 10:03:40 Catalin Marinas wrote:
> > > We probably want to default to 32-bit for arm32 in the absence of
> > > dma-ranges. For arm64, I'd prefer if we could always mandate
> > > dma-ranges to be present for each bus, just like we mandate ranges
> > > to be present. I hope it's not too late for that.
> > >
> > > dma_set_mask should definitely look at the dma-ranges properties,
> > > and the helper that Santosh just introduced should give us all the
> > > information we need. We just need to decide on the correct
> > > behavior.
> >
> > Last time I looked at Santosh's patches I thought the dma-ranges
> > property is per device rather than per bus. We could make it per bus
> > only and let the device call dma_set_mask() explicitly if it wants
> > to restrict it further.
>
> Can you check again? I read the code again yesterday to check this,
> and I concluded it was correctly doing this per bus.

You are right, I missed the fact that of_dma_get_range() checks the
dma-ranges property in the node's parent. So what we need is to set the
default dma mask based on the size information in dma-ranges, something
like this:

	mask = rounddown_pow_of_two(size) - 1;
	dma_set_mask(dev, mask);	/* or dma_set_mask_and_coherent() */

> > > > > While we currently don't have a set of swiotlb DMA ops on
> > > > > ARM32, we do have it on ARM64, and I think we should be using
> > > > > them properly. It should really not be hard to implement a
> > > > > proper dma_set_mask() function for ARM64 that is able to set
> > > > > up the swiotlb based on the dma-ranges properties and always
> > > > > returns success but leaves the mask unchanged.
> > > > The swiotlb bounce buffer needs to be pre-allocated at boot,
> > > > otherwise we don't have any guarantees. Since we can't honour
> > > > random masks anyway, we stick to ZONE_DMA which is currently in
> > > > the 4G limit. But the driver calls dma_set_mask() too late for
> > > > any further swiotlb setup.
> > > >
> > > > With IOMMU we can be more flexible around dma_set_mask(); it
> > > > can be done at run-time.
> > >
> > > What we can do with swiotlb is to check if the mask is smaller
> > > than ZONE_DMA. If it ever is, we have to fail dma_set_mask and
> > > hope the driver can fall back to PIO mode or it will have to fail
> > > its probe() function.
> >
> > dma_set_(coherent_)mask checks swiotlb_dma_supported(), which
> > returns false if io_tlb_end goes beyond the device mask. So we just
> > need to ensure that io_tlb is allocated within ZONE_DMA.
>
> Makes sense for dma_set_mask. Why do you do the same thing for
> coherent_mask? Shouldn't that check against ZONE_DMA instead?

It depends on the meaning of coherent_dma_mask. As I understand it,
that's the dma mask used by dma_alloc_coherent() to return a memory
buffer that the device is able to access. I don't see it as much
different from the dma_mask used by the streaming API. I guess some
hardware could have different masks here if it has cache coherency only
for certain memory ranges (and the coherent_dma_mask would be smaller
than dma_mask).

> > > For dma_set_coherent_mask(), we also have to fail any call that
> > > tries to set a mask larger than what the device hardware can do.
> > > Unlike that, dma_set_mask() can succeed with any mask, we just
> > > have to enable swiotlb if the mask that the driver wants is
> > > larger than what the hardware can do.
> >
> > Currently we can't satisfy any arbitrarily small dma mask even with
> > swiotlb, since the bounce buffer is only guaranteed to be in
> > ZONE_DMA.
> > Swiotlb allows for smaller masks, but we need to reserve the io_tlb
> > buffer early during boot and at smaller addresses. For example,
> > swiotlb_alloc_coherent() first tries __get_free_pages(GFP_DMA) and,
> > if the coherent_dma_mask isn't matched, it frees the pages and
> > falls back to the io_tlb buffer. However, I don't think it's worth
> > going for masks smaller than 32-bit on arm64.
>
> Is that safe for noncoherent systems? I'd expect the io_tlb buffer
> to be cached there, which means we can't use it for coherent
> allocations.

For non-coherent systems, the current arm64 approach is to take the
address returned by swiotlb_alloc_coherent() and remap it as Normal
NonCacheable (see the arm64 __dma_alloc_noncoherent()). The CPU should
only access the dma buffer via the non-cacheable mapping.

Of course, another approach would be to change the cacheability
attributes of the io_tlb buffer (or the CMA one), but the current
approach has the advantage that the io_tlb buffer can be used for both
coherent and non-coherent devices.

-- 
Catalin
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html