On 16.01.19 at 08:09, Thomas Hellstrom wrote:
> On Tue, 2019-01-15 at 21:58 +0100, hch@xxxxxx wrote:
>> On Tue, Jan 15, 2019 at 07:13:11PM +0000, Koenig, Christian wrote:
>>> Thomas is correct that the interface you propose here doesn't work
>>> at all for GPUs.
>>>
>>> The kernel driver is not informed of flush/sync, but rather just
>>> sets up coherent mappings between system memory and devices.
>>>
>>> In other words you have an array of struct pages and need to map
>>> that to a specific device and so create dma_addresses for the
>>> mappings.
>> If you want a coherent mapping you need to use dma_alloc_coherent
>> and dma_mmap_coherent and you are done, that is not the problem.
>> That actually is one of the vmwgfx modes, so I don't understand what
>> problem we are trying to solve if you don't actually want a
>> non-coherent mapping.
> For vmwgfx, not making dma_alloc_coherent default has a couple of
> reasons:
> 1) Memory is associated with a struct device. It has not been clear
> that it is exportable to other devices.
> 2) There seem to be restrictions in the system pages allowable. GPUs
> generally prefer highmem pages but dma_alloc_coherent returns a
> virtual address implying GFP_KERNEL? While not used by vmwgfx, TTM
> typically prefers HIGHMEM pages to facilitate caching mode switching
> without having to touch the kernel map.
> 3) Historically we had APIs to allow coherent access to user-space
> defined pages. That has gone away now, but the infrastructure was
> built around it.
>
> dma_mmap_coherent isn't used because as the data moves between system
> memory, swap and VRAM, PTEs of user-space mappings are adjusted
> accordingly, meaning user-space doesn't have to unmap when an
> operation is initiated that might mean the data is moved.

To summarize once more: We have an array of struct pages and want to
coherently map that to a device.

If that is not possible for whatever reason, we want to get an error
code or even not load the driver from the beginning.

>
>> Although last time I had that discussion with Daniel Vetter
>> I was under the impression that GPUs really wanted non-coherent
>> mappings.
> Intel historically has done things a bit differently. And it's also
> possible that embedded platforms and ARM prefer this mode of
> operation, but I haven't caught up on that discussion.
>
>> But if you want a coherent mapping you can't go to a struct page,
>> because on many systems you can't just map arbitrary memory as
>> uncacheable. It might either come from very special limited pools,
>> or might need other magic applied to it so that it is not visible
>> in the normal direct mapping, or at least not accessed through it.
>
> The TTM subsystem has been relied on to provide coherent memory with
> the option to switch caching mode of pages. But only on selected and
> well tested platforms. On other platforms we simply do not load, and
> that's fine for now.
>
> But as mentioned multiple times, to make GPU drivers more compliant,
> we'd really want that
>
> bool dma_streaming_is_coherent(const struct device *)
>
> API to help us decide when to load or not.

Yes, please.

Christian.

>
> Thanks,
> Thomas
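
The load-time decision discussed in this thread can be illustrated with
a minimal userspace sketch in C. It is not kernel code and does not
claim that such an API exists: struct mock_device,
mock_dma_streaming_is_coherent() and mock_gpu_probe() are hypothetical
stand-ins, and the coherency property is simply stored as a flag in the
mock structure. The only idea taken from the thread is that a driver
would query a predicate with the proposed semantics and return an error
instead of loading when streaming DMA is not coherent.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for struct device; the real kernel type differs. */
struct mock_device {
	const char *name;
	bool streaming_coherent;	/* hypothetical platform property */
};

/*
 * Stand-in for the helper proposed in the thread. Assumption: it reports
 * whether streaming DMA mappings for this device are cache coherent.
 */
static bool mock_dma_streaming_is_coherent(const struct mock_device *dev)
{
	return dev->streaming_coherent;
}

/*
 * Hypothetical probe routine: refuse to load where coherent streaming
 * mappings of arbitrary pages cannot be provided.
 */
static int mock_gpu_probe(const struct mock_device *dev)
{
	assert(dev != NULL);

	if (!mock_dma_streaming_is_coherent(dev))
		return -ENODEV;	/* driver does not load on this platform */

	return 0;		/* safe to map struct pages coherently */
}
```

With this shape the check happens once at probe time, so a driver that
cannot get coherent mappings fails early with an error code rather than
misbehaving later when pages are mapped.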