On Wed, Apr 25, 2018 at 8:13 AM, Daniel Vetter <daniel@xxxxxxxx> wrote: > On Wed, Apr 25, 2018 at 7:48 AM, Christoph Hellwig <hch@xxxxxxxxxxxxx> wrote: >> On Tue, Apr 24, 2018 at 09:32:20PM +0200, Daniel Vetter wrote: >>> Out of curiosity, how much virtual flushing stuff is there still out >>> there? At least in drm we've pretty much ignore this, and seem to be >>> getting away without a huge uproar (at least from driver developers >>> and users, core folks are less amused about that). >> >> As I've just been wading through the code, the following architectures >> have non-coherent dma that flushes by virtual address for at least some >> platforms: >> >> - arm [1], arm64, hexagon, nds32, nios2, parisc, sh, xtensa, mips, >> powerpc >> >> These have non-coherent dma ops that flush by physical address: >> >> - arc, arm [1], c6x, m68k, microblaze, openrisc, sparc >> >> And these do not have non-coherent dma ops at all: >> >> - alpha, h8300, riscv, unicore32, x86 >> >> [1] arm ѕeems to do both virtually and physically based ops, further >> audit needed. >> >> Note that using virtual addresses in the cache flushing interface >> doesn't mean that the cache actually is virtually indexed, but it at >> least allows for the possibility. >> >>> > I think the most important thing about such a buffer object is that >>> > it can distinguish the underlying mapping types. While >>> > dma_alloc_coherent, dma_alloc_attrs with DMA_ATTR_NON_CONSISTENT, >>> > dma_map_page/dma_map_single/dma_map_sg and dma_map_resource all give >>> > back a dma_addr_t they are in now way interchangable. And trying to >>> > stuff them all into a structure like struct scatterlist that has >>> > no indication what kind of mapping you are dealing with is just >>> > asking for trouble. >>> >>> Well the idea was to have 1 interface to allow all drivers to share >>> buffers with anything else, no matter how exactly they're allocated. >> >> Isn't that interface supposed to be dmabuf? Currently dma_map leaks >> a scatterlist through the sg_table in dma_buf_map_attachment / >> ->map_dma_buf, but looking at a few of the callers it seems like they >> really do not even want a scatterlist to start with, but check that >> is contains a physically contiguous range first. So kicking the >> scatterlist our there will probably improve the interface in general. > > I think by number most drm drivers require contiguous memory (or an > iommu that makes it look contiguous). But there's plenty others who > have another set of pagetables on the gpu itself and can > scatter-gather. Usually it's the former for display/video blocks, and > the latter for rendering. For more fun: https://www.spinics.net/lists/dri-devel/msg173630.html Yeah, sometimes we want to disable the iommu because the on-gpu pagetables are faster ... -Daniel >>> dma-buf has all the functions for flushing, so you can have coherent >>> mappings, non-coherent mappings and pretty much anything else. Or well >>> could, because in practice people hack up layering violations until it >>> works for the 2-3 drivers they care about. On top of that there's the >>> small issue that x86 insists that dma is coherent (and that's true for >>> most devices, including v4l drivers you might want to share stuff >>> with), and gpus really, really really do want to make almost >>> everything incoherent. >> >> How do discrete GPUs manage to be incoherent when attached over PCIe? > > It has a non-coherent transaction mode (which the chipset can opt to > not implement and still flush), to make sure the AGP horror show > doesn't happen again and GPU folks are happy with PCIe. That's at > least my understanding from digging around in amd the last time we had > coherency issues between intel and amd gpus. GPUs have some bits > somewhere (in the pagetables, or in the buffer object description > table created by userspace) to control that stuff. > > For anything on the SoC it's presented as pci device, but that's > extremely fake, and we can definitely do non-snooped transactions on > drm/i915. Again, controlled by a mix of pagetables and > userspace-provided buffer object description tables. > -Daniel > -- > Daniel Vetter > Software Engineer, Intel Corporation > +41 (0) 79 365 57 48 - http://blog.ffwll.ch -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch