On 6/1/21 2:27 PM, Jani Nikula wrote:
On Tue, 01 Jun 2021, Thomas Hellström <thomas.hellstrom@xxxxxxxxxxxxxxx> wrote:
Reading out of write-combining mapped memory is typically very slow
since the CPU doesn't prefetch. However, some architectures have special
instructions that make such reads fast.
So add a best-effort memcpy_from_wc taking dma-buf-map pointer
arguments that attempts to use a fast prefetching memcpy and
otherwise falls back to ordinary memcpy calls, taking the iomem tagging
into account.
The code is largely copied from i915_memcpy_from_wc.
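As an illustration only, the best-effort strategy described above (try a
fast path, otherwise fall back to an ordinary copy chosen by the iomem
tag) could be sketched in plain userspace C roughly as follows. This is
not the code from the patch: struct buf_map, fast_copy_from_wc(),
memcpy_from_wc_sketch() and the 16-byte alignment rule are all
hypothetical stand-ins, and a plain memcpy() takes the place of the
arch-specific prefetching copy.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a dma-buf-map style pointer: an address
 * plus a tag saying whether it refers to I/O memory. */
struct buf_map {
	const void *vaddr;
	bool is_iomem;
};

/* Which path the copy took; returned only so the sketch is testable. */
enum copy_path { COPY_FAST, COPY_FALLBACK_IO, COPY_FALLBACK_MEM };

/* Hypothetical fast path. A real one would use an arch-specific
 * prefetching/streaming load; memcpy() stands in for it here. The
 * 16-byte alignment requirement is an assumption of this sketch. */
static bool fast_copy_from_wc(void *dst, const void *src, size_t len)
{
	if (((uintptr_t)dst | (uintptr_t)src | len) & 15)
		return false;	/* constraints not met, caller falls back */
	memcpy(dst, src, len);
	return true;
}

/* Fallback for I/O memory: a byte-wise volatile read loop stands in
 * for an I/O accessor based copy. */
static void copy_from_io(void *dst, const volatile void *src, size_t len)
{
	unsigned char *d = dst;
	const volatile unsigned char *s = src;

	while (len--)
		*d++ = *s++;
}

/* Best effort: try the fast path first, otherwise pick an ordinary
 * copy based on the iomem tag. dst is assumed to be system memory. */
enum copy_path memcpy_from_wc_sketch(void *dst, const struct buf_map *src,
				     size_t len)
{
	assert(dst && src && src->vaddr);

	if (fast_copy_from_wc(dst, src->vaddr, len))
		return COPY_FAST;

	if (src->is_iomem) {
		copy_from_io(dst, src->vaddr, len);
		return COPY_FALLBACK_IO;
	}

	memcpy(dst, src->vaddr, len);
	return COPY_FALLBACK_MEM;
}
```

In this sketch the caller never needs to know whether the fast path was
available; the return value exists only to make the chosen path visible
when experimenting.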
Cc: Daniel Vetter <daniel@xxxxxxxx>
Cc: Christian König <christian.koenig@xxxxxxx>
Suggested-by: Daniel Vetter <daniel@xxxxxxxx>
Signed-off-by: Thomas Hellström <thomas.hellstrom@xxxxxxxxxxxxxxx>
Acked-by: Christian König <christian.koenig@xxxxxxx>
Acked-by: Daniel Vetter <daniel@xxxxxxxx>
---
v7:
- Perform a memcpy even if warning with in_interrupt(). Suggested by
Christian König.
- Fix compilation failure on !X86 (Reported by kernel test robot
lkp@xxxxxxxxx)
v8:
- Skip kerneldoc for drm_memcpy_init_early()
- Export drm_memcpy_from_wc() also for non-x86.
---
Documentation/gpu/drm-mm.rst | 2 +-
drivers/gpu/drm/drm_cache.c | 148 +++++++++++++++++++++++++++++++++++
drivers/gpu/drm/drm_drv.c | 2 +
include/drm/drm_cache.h | 7 ++
4 files changed, 158 insertions(+), 1 deletion(-)
diff --git a/Documentation/gpu/drm-mm.rst b/Documentation/gpu/drm-mm.rst
index 21be6deadc12..c66058c5bce7 100644
--- a/Documentation/gpu/drm-mm.rst
+++ b/Documentation/gpu/drm-mm.rst
@@ -469,7 +469,7 @@ DRM MM Range Allocator Function References
.. kernel-doc:: drivers/gpu/drm/drm_mm.c
:export:
-DRM Cache Handling
+DRM Cache Handling and Fast WC memcpy()
==================
The title underline needs to be as long as the title.
BR,
Jani.
Thanks, Jani.
I think Daniel was trying to point this out to me as well with limited
success. It's fixed now.
/Thomas