On Wed, Jul 14, 2021 at 01:54:50PM +0200, Christian König wrote: > Am 13.07.21 um 22:51 schrieb Daniel Vetter: > > intel-gfx-ci realized that something is not quite coherent anymore on > > some platforms for our i915+vgem tests, when I tried to switch vgem > > over to shmem helpers. > > > > After lots of head-scratching I realized that I've removed calls to > > drm_clflush. And we need those. To make this a bit cleaner use the > > same page allocation tooling as ttm, which does internally clflush > > (and more, as neeeded on any platform instead of just the intel x86 > > cpus i915 can be combined with). > > > > Unfortunately this doesn't exist on arm, or as a generic feature. For > > that I think only the dma-api can get at wc memory reliably, so maybe > > we'd need some kind of GFP_WC flag to do this properly. > > The problem is that this stuff is extremely architecture specific. So GFP_WC > and GFP_UNCACHED are really what we should aim for in the long term. > > And as far as I know we have at least the following possibilities how it is > implemented: > > * A fixed amount of registers which tells the CPU the caching behavior for a > memory region, e.g. MTRR. > * Some bits of the memory pointers used, e.g. you see the same memory at > different locations with different caching attributes. > * Some bits in the CPUs page table. > * Some bits in a separate page table. > > On top of that there is the PCIe specification which defines non-cache > snooping access as an extension. Yeah dma-buf is extremely ill-defined even on x86 if you combine these all. We just play a game of whack-a-mole with the cacheline dirt until it's gone. That's the other piece here, how do you even make sure that the page is properly flushed and ready for wc access: - easy case is x86 with clflush available pretty much everywhere (since 10+ years at least) - next are cpus which have some cache flush instructions, but it's highly cpu model specific - next up is the same, but you absolutely have to make sure there's no other mapping around anymore or the coherency fabric just dies - and I'm pretty sure there's worse stuff where you defacto can only allocate wc memory that's set aside at boot-up and that's all you ever get. Cheers, Daniel > Mixing that with the CPU caching behavior gets you some really nice ways to > break a driver. In general x86 seems to be rather graceful, but arm and > PowerPC are easily pissed if you mess that up. > > > Signed-off-by: Daniel Vetter <daniel.vetter@xxxxxxxxx> > > Cc: Christian König <christian.koenig@xxxxxxx> > > Cc: "Thomas Hellström" <thomas.hellstrom@xxxxxxxxxxxxxxx> > > Cc: Maarten Lankhorst <maarten.lankhorst@xxxxxxxxxxxxxxx> > > Cc: Maxime Ripard <mripard@xxxxxxxxxx> > > Cc: Thomas Zimmermann <tzimmermann@xxxxxxx> > > Cc: David Airlie <airlied@xxxxxxxx> > > Cc: Daniel Vetter <daniel@xxxxxxxx> > > Acked-by: Christian könig <christian.koenig@xxxxxxx> > > Regards, > Christian. > > > --- > > drivers/gpu/drm/drm_gem_shmem_helper.c | 14 ++++++++++++++ > > 1 file changed, 14 insertions(+) > > > > diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c > > index 296ab1b7c07f..657d2490aaa5 100644 > > --- a/drivers/gpu/drm/drm_gem_shmem_helper.c > > +++ b/drivers/gpu/drm/drm_gem_shmem_helper.c > > @@ -10,6 +10,10 @@ > > #include <linux/slab.h> > > #include <linux/vmalloc.h> > > +#ifdef CONFIG_X86 > > +#include <asm/set_memory.h> > > +#endif > > + > > #include <drm/drm.h> > > #include <drm/drm_device.h> > > #include <drm/drm_drv.h> > > @@ -162,6 +166,11 @@ static int drm_gem_shmem_get_pages_locked(struct drm_gem_shmem_object *shmem) > > return PTR_ERR(pages); > > } > > +#ifdef CONFIG_X86 > > + if (shmem->map_wc) > > + set_pages_array_wc(pages, obj->size >> PAGE_SHIFT); > > +#endif > > + > > shmem->pages = pages; > > return 0; > > @@ -203,6 +212,11 @@ static void drm_gem_shmem_put_pages_locked(struct drm_gem_shmem_object *shmem) > > if (--shmem->pages_use_count > 0) > > return; > > +#ifdef CONFIG_X86 > > + if (shmem->map_wc) > > + set_pages_array_wb(shmem->pages, obj->size >> PAGE_SHIFT); > > +#endif > > + > > drm_gem_put_pages(obj, shmem->pages, > > shmem->pages_mark_dirty_on_put, > > shmem->pages_mark_accessed_on_put); > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch