On Thu, Aug 15, 2019 at 10:53 AM Christoph Hellwig <hch@xxxxxx> wrote:
>
> On Thu, Aug 15, 2019 at 06:54:39AM -0700, Rob Clark wrote:
> > On Wed, Aug 14, 2019 at 11:51 PM Christoph Hellwig <hch@xxxxxx> wrote:
> > >
> > > As said before I don't think these low-level helpers are the
> > > right API to export, but even if they were you'd just cover a tiny
> > > subset of the architectures.
> >
> > Are you thinking instead something like:
> >
> > void dma_sync_sg_for_{cpu,device}(struct device *dev, struct scatterlist *sgl,
> >                                   int nents, enum dma_data_direction dir)
> > {
> >         for_each_sg(sgl, sg, nents, i) {
> >                 arch_sync_dma_for_..(dev, sg_phys(sg), sg->length, dir);
> >         }
> > }
> > EXPORT_SYMBOL_GPL(dma_sync_sg_for_..)
> >
> > or did you have something else in mind?
>
> No. We really need an interface that says please give me uncached
> memory (for some definition of uncached that includes what graphics
> drivers call write combine), and then let the architecture do the right
> thing. Basically dma_alloc_coherent with DMA_ATTR_NO_KERNEL_MAPPING
> is superficially close to what you want, except that the way the drm
> drivers work you can't actually use it.

I don't disagree about needing an API to get uncached memory (or
ideally just something outside of the linear map). But I think this
is a separate problem.

What I was hoping for, for v5.4, is a way to stop abusing dma_map/sync
for cache ops to get rid of the hack I had to make for v5.3. And also
to fix vgem on non-x86. (Unfortunately changing vgem to use cached
mappings breaks x86 CI, but fixes CI on arm/arm64..) We can do that
without any changes in allocation. There is still the possibility for
problems due to cached aliases, but that has been a problem this whole
time, it isn't something new.

BR,
-R

> The reason for that is that, if we can, we really need to not create
> another uncachable alias, but instead change the page attributes in place.
> On x86 we can and must do that for example, and based on the
> conversation with Will arm64 could do that fairly easily. arm32 can
> right now only do that for CMA, though.
>
> The big question is what API do we want. I had a pretty similar
> discussion with Christian on doing such an allocation for amdgpu,
> where the device normally is cache coherent, but they actually want
> to turn it into non-coherent by using PCIe unsnooped transactions.
>
> Here is my high-level plan, which still has a few loose ends:
>
>  (1) provide a new API:
>
>         struct page *dma_alloc_pages(struct device *dev, unsigned nr_pages,
>                         gfp_t gfp, unsigned long flags);
>         void dma_free_pages(struct device *dev, unsigned nr_pages,
>                         unsigned long flags);
>
>      These give you back page backed memory that is guaranteed to be
>      addressable by the device (no swiotlb or similar). The memory can
>      then be mapped using dma_map*, including unmap and dma_sync to
>      bounce ownership around. This is the replacement for the current
>      dma_alloc_attrs with DMA_ATTR_NON_CONSISTENT API, which is rather
>      badly defined.
>
>  (2) Add support for DMA_ATTR_NO_KERNEL_MAPPING to this new API instead
>      of dma_alloc_attrs. The initial difference with that flag is just
>      that we allow highmem, but in the future we could also unmap this
>      memory from the kernel linear mapping entirely on architectures
>      where we can easily do that.
>
>  (3) Add a dma_pages_map/dma_pages_unmap or similar API that allows you
>      to get a kernel mapping for parts or all of a
>      DMA_ATTR_NO_KERNEL_MAPPING allocation. This is to replace things
>      like your open-coded vmap in msm (or similarly elsewhere in dma-buf
>      providers).
>
>  (4) Add support for a DMA_ATTR_UNCACHABLE flag (or similar) to the new
>      API, that maps the pages as uncachable iff they have a kernel
>      mapping, including invalidating the caches at time of this page
>      attribute change (or creation of a new mapping). This API will fail
>      if the architecture does not allow in-place remapping.
>      Note that for arm32 we could always dip into the CMA pool if one
>      is present to not fail. We'll also need some helper to map from
>      the DMA_ATTR_* flags to a pgprot for mapping the page to userspace.
>      There are also a few other weird bits here, e.g. on architectures
>      like mips that use an uncached segment we'll have to fail use with
>      the plain DMA_ATTR_UNCACHABLE flag, but it could be supported with
>      DMA_ATTR_UNCACHABLE | DMA_ATTR_NO_KERNEL_MAPPING.
>
> I was hoping to get most of this done for this merge window, but I'm
> probably lucky if I get at least parts done. Too much distraction.
>
> > Hmm, not entirely sure why.. you should be on the cc list for each
> > individual patch.
>
> They finally made it, although even with the delay they only ended up
> in the spam mailbox. I still can't see them on the various mailing
> lists.
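
As a rough illustration of the per-entry loop in the quoted
dma_sync_sg_for_{cpu,device} sketch, here is a minimal userspace model in
plain C. It is not kernel code: every mock_* name is made up for this
example, the array of entries stands in for a scatterlist, and the hook
only counts what a real arch-level cache maintenance call would be asked
to do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for enum dma_data_direction. */
enum mock_dir { MOCK_TO_DEVICE, MOCK_FROM_DEVICE };

/* Hypothetical stand-in for one scatterlist entry. */
struct mock_sg_entry {
	uint64_t phys;   /* physical address of the segment */
	size_t length;   /* segment length in bytes */
};

/* Records what the mock "architecture" hook was asked to sync. */
struct mock_sync_stats {
	unsigned int calls;
	size_t bytes;
};

/* Stand-in for the arch_sync_dma_for_..() call: it only does bookkeeping. */
static void mock_arch_sync(struct mock_sync_stats *st, uint64_t phys,
			   size_t len, enum mock_dir dir)
{
	(void)phys;
	(void)dir;
	st->calls++;
	st->bytes += len;
}

/* Walk every entry and hand its address range to the arch hook. */
static void mock_sync_sg(struct mock_sync_stats *st,
			 const struct mock_sg_entry *sgl, int nents,
			 enum mock_dir dir)
{
	assert(st != NULL);
	assert(sgl != NULL || nents == 0);

	for (int i = 0; i < nents; i++)
		mock_arch_sync(st, sgl[i].phys, sgl[i].length, dir);
}
```

Calling mock_sync_sg() on three entries of 4096, 8192 and 512 bytes leaves
calls at 3 and bytes at 12800. The model covers only the sync side; it says
nothing about how the pages are allocated or which aliases exist.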