Am 14.07.21 um 14:48 schrieb Daniel Vetter:
On Wed, Jul 14, 2021 at 01:54:50PM +0200, Christian König wrote:
Am 13.07.21 um 22:51 schrieb Daniel Vetter:
intel-gfx-ci realized that something is not quite coherent anymore on
some platforms for our i915+vgem tests, when I tried to switch vgem
over to shmem helpers.
After lots of head-scratching I realized that I've removed calls to
drm_clflush. And we need those. To make this a bit cleaner use the
same page allocation tooling as ttm, which does internally clflush
(and more, as neeeded on any platform instead of just the intel x86
cpus i915 can be combined with).
Unfortunately this doesn't exist on arm, or as a generic feature. For
that I think only the dma-api can get at wc memory reliably, so maybe
we'd need some kind of GFP_WC flag to do this properly.
The problem is that this stuff is extremely architecture specific. So GFP_WC
and GFP_UNCACHED are really what we should aim for in the long term.
And as far as I know we have at least the following possibilities how it is
implemented:
* A fixed amount of registers which tells the CPU the caching behavior for a
memory region, e.g. MTRR.
* Some bits of the memory pointers used, e.g. you see the same memory at
different locations with different caching attributes.
* Some bits in the CPUs page table.
* Some bits in a separate page table.
On top of that there is the PCIe specification which defines non-cache
snooping access as an extension.
Yeah dma-buf is extremely ill-defined even on x86 if you combine these
all. We just play a game of whack-a-mole with the cacheline dirt until
it's gone.
That's the other piece here, how do you even make sure that the page is
properly flushed and ready for wc access:
- easy case is x86 with clflush available pretty much everywhere (since
10+ years at least)
- next are cpus which have some cache flush instructions, but it's highly
cpu model specific
- next up is the same, but you absolutely have to make sure there's no
other mapping around anymore or the coherency fabric just dies
- and I'm pretty sure there's worse stuff where you defacto can only
allocate wc memory that's set aside at boot-up and that's all you ever
get.
Well long story short you don't make sure that the page is flushed at all.
What you do is to allocate the page as WC in the first place, if you
fail to do this you can't use it.
The whole idea TTM try to sell until a while ago that you can actually
change that on the fly only works on x86 and even there only very very
limited.
Cheers,
Christian.
Cheers, Daniel
Mixing that with the CPU caching behavior gets you some really nice ways to
break a driver. In general x86 seems to be rather graceful, but arm and
PowerPC are easily pissed if you mess that up.
Signed-off-by: Daniel Vetter <daniel.vetter@xxxxxxxxx>
Cc: Christian König <christian.koenig@xxxxxxx>
Cc: "Thomas Hellström" <thomas.hellstrom@xxxxxxxxxxxxxxx>
Cc: Maarten Lankhorst <maarten.lankhorst@xxxxxxxxxxxxxxx>
Cc: Maxime Ripard <mripard@xxxxxxxxxx>
Cc: Thomas Zimmermann <tzimmermann@xxxxxxx>
Cc: David Airlie <airlied@xxxxxxxx>
Cc: Daniel Vetter <daniel@xxxxxxxx>
Acked-by: Christian könig <christian.koenig@xxxxxxx>
Regards,
Christian.
---
drivers/gpu/drm/drm_gem_shmem_helper.c | 14 ++++++++++++++
1 file changed, 14 insertions(+)
diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c
index 296ab1b7c07f..657d2490aaa5 100644
--- a/drivers/gpu/drm/drm_gem_shmem_helper.c
+++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
@@ -10,6 +10,10 @@
#include <linux/slab.h>
#include <linux/vmalloc.h>
+#ifdef CONFIG_X86
+#include <asm/set_memory.h>
+#endif
+
#include <drm/drm.h>
#include <drm/drm_device.h>
#include <drm/drm_drv.h>
@@ -162,6 +166,11 @@ static int drm_gem_shmem_get_pages_locked(struct drm_gem_shmem_object *shmem)
return PTR_ERR(pages);
}
+#ifdef CONFIG_X86
+ if (shmem->map_wc)
+ set_pages_array_wc(pages, obj->size >> PAGE_SHIFT);
+#endif
+
shmem->pages = pages;
return 0;
@@ -203,6 +212,11 @@ static void drm_gem_shmem_put_pages_locked(struct drm_gem_shmem_object *shmem)
if (--shmem->pages_use_count > 0)
return;
+#ifdef CONFIG_X86
+ if (shmem->map_wc)
+ set_pages_array_wb(shmem->pages, obj->size >> PAGE_SHIFT);
+#endif
+
drm_gem_put_pages(obj, shmem->pages,
shmem->pages_mark_dirty_on_put,
shmem->pages_mark_accessed_on_put);