On Fri, Jul 23, 2021 at 10:02:39AM +0200, Christian König wrote:
> On 23.07.21 at 09:36, Daniel Vetter wrote:
> > On Thu, Jul 22, 2021 at 08:40:56PM +0200, Thomas Zimmermann wrote:
> > > Hi
> > >
> > > On 13.07.21 at 22:51, Daniel Vetter wrote:
> > > [SNIP]
> > > > +#ifdef CONFIG_X86
> > > > +	if (shmem->map_wc)
> > > > +		set_pages_array_wc(pages, obj->size >> PAGE_SHIFT);
> > > > +#endif
> > > I cannot comment much on the technical details of the caching of various
> > > architectures. If this patch goes in, there should be a longer comment that
> > > reflects the discussion in this thread. It's apparently a workaround.
> > >
> > > I think the call itself should be hidden behind a DRM API, which depends on
> > > CONFIG_X86. Something simple like
> > >
> > > #ifdef CONFIG_X86
> > > drm_set_pages_array_wc()
> > > {
> > > 	set_pages_array_wc();
> > > }
> > > #else
> > > drm_set_pages_array_wc()
> > > {
> > > }
> > > #endif
> > >
> > > Maybe in drm_cache.h?
> > We do have a bunch of this in drm_cache.h already, and architecture
> > maintainers hate us for it.
>
> Yeah, for good reasons :)
>
> > The real fix is to get at the architecture-specific wc allocator, which is
> > currently not something that's exposed, but hidden within the dma api. I
> > think having this stick out like this is better than hiding it behind fake
> > generic code (like we do with drm_clflush, which de facto also only really
> > works on x86).
>
> The DMA API also doesn't really touch that stuff as far as I know.
>
> What we rather do on other architectures is to set the appropriate caching
> flags on the CPU mappings, see function ttm_prot_from_caching().

This alone doesn't do cache flushes. And at least on some arm cpus having
inconsistent mappings can lead to interconnect hangs, so you have to at
least punch out the kernel linear map. Which on some arms isn't possible
(because the kernel map is a special linear map and not done with
pagetables).
Which means you need to carve this memory out at boot and treat it as
GFP_HIGHMEM. Afaik dma-api has that allocator somewhere which dtrt for
dma_alloc_coherent. Also shmem helpers already set the caching pgprot.

> > Also note that ttm has the exact same ifdef in its page allocator, but it
> > does fall back to using dma_alloc_coherent on other platforms.
>
> This works surprisingly well on non x86 architectures as well. We just don't
> necessarily update the kernel mappings everywhere which limits the kmap usage.
>
> In other words radeon and nouveau still work on PowerPC AGP systems as far
> as I know for example.

The thing is, on most cpus you get away with just pgprot set to wc, and on
many others it's only an issue while there's still some cpu dirt hanging
around because they don't prefetch badly enough. It's very few where it's a
persistent problem.

Really the only reason I've even caught this was because some of the
i915+vgem buffer sharing tests we have are very nasty and intentionally
try to provoke the worst case :-)

Anyway, since you're looking, can you pls review this and the previous
patch for shmem helpers?

The first one to make VM_PFNMAP standard for all dma-buf isn't ready yet,
because I still need to audit all the drivers. And at least i915 dma-buf
mmap is still using gup-able memory too. So more work to do here.
-Daniel

>
> Christian.
>
> > -Daniel
> >
> > > Best regards
> > > Thomas
> > >
> > > > +
> > > >  	shmem->pages = pages;
> > > >  	return 0;
> > > > @@ -203,6 +212,11 @@ static void drm_gem_shmem_put_pages_locked(struct drm_gem_shmem_object *shmem)
> > > >  	if (--shmem->pages_use_count > 0)
> > > >  		return;
> > > > +#ifdef CONFIG_X86
> > > > +	if (shmem->map_wc)
> > > > +		set_pages_array_wb(shmem->pages, obj->size >> PAGE_SHIFT);
> > > > +#endif
> > > > +
> > > >  	drm_gem_put_pages(obj, shmem->pages,
> > > >  			  shmem->pages_mark_dirty_on_put,
> > > >  			  shmem->pages_mark_accessed_on_put);
> > > >
> > > --
> > > Thomas Zimmermann
> > > Graphics Driver Developer
> > > SUSE Software Solutions Germany GmbH
> > > Maxfeldstr. 5, 90409 Nürnberg, Germany
> > > (HRB 36809, AG Nürnberg)
> > > Geschäftsführer: Felix Imendörffer
> > >
> >
>

--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
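
For anyone following the thread without the full patch at hand, the get/put
pairing in the quoted hunks can be sketched in plain userspace C. This is
only an illustration under stated assumptions, not the kernel code: struct
page, struct mock_shmem, mock_set_pages_array(), mock_get_pages(),
mock_put_pages() and MOCK_CONFIG_X86 are all hypothetical stand-ins, and the
"caching" field merely records what set_pages_array_wc()/set_pages_array_wb()
would change on real x86 hardware. The point it tries to show is that the
pages go write-combined on the first get and back to write-back only on the
last put.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumption: pretend we build for x86. Remove to get the no-op path. */
#define MOCK_CONFIG_X86 1

enum mock_caching { MOCK_WB, MOCK_WC };

/* Hypothetical stand-in for the kernel's struct page. */
struct page {
	enum mock_caching caching;
};

/* Hypothetical stand-in for the relevant shmem object fields. */
struct mock_shmem {
	struct page **pages;
	size_t npages;
	unsigned int pages_use_count;
	bool map_wc;
};

#ifdef MOCK_CONFIG_X86
/* Mock of the x86 helpers: only records the requested attribute. */
void mock_set_pages_array(struct page **pages, size_t n, enum mock_caching c)
{
	for (size_t i = 0; i < n; i++)
		pages[i]->caching = c;
}
#endif

/* First user switches the pages to WC; later users only bump the count. */
int mock_get_pages(struct mock_shmem *shmem, struct page **pages, size_t npages)
{
	if (shmem->pages_use_count++ > 0)
		return 0;

#ifdef MOCK_CONFIG_X86
	if (shmem->map_wc)
		mock_set_pages_array(pages, npages, MOCK_WC);
#endif

	shmem->pages = pages;
	shmem->npages = npages;
	return 0;
}

/* Last user restores WB before the pages are handed back. */
void mock_put_pages(struct mock_shmem *shmem)
{
	assert(shmem->pages_use_count > 0);
	if (--shmem->pages_use_count > 0)
		return;

#ifdef MOCK_CONFIG_X86
	if (shmem->map_wc)
		mock_set_pages_array(shmem->pages, shmem->npages, MOCK_WB);
#endif

	shmem->pages = NULL;
	shmem->npages = 0;
}
```

Calling mock_get_pages() twice and mock_put_pages() once leaves the mock
pages marked WC; only the second put flips them back to WB. Without
MOCK_CONFIG_X86 both calls degrade to plain refcounting, which mirrors the
"sticks out as an x86-only ifdef" shape discussed above rather than a generic
wrapper.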