We found a regression in v5.10 on a real-time server using the RT kernel
and the mgag200 driver. It is a highly specialized workload, with a
<10us latency expectation on an isolated core. Since v5.10, the
real-time tasks miss their <10us latency target whenever something is
printed on the screen (fbcon or printk).

The regression has been bisected to 2 commits:
0b34d58b6c32 ("drm/mgag200: Enable caching for SHMEM pages")
4862ffaec523 ("drm/mgag200: Move vmap out of commit tail")

The first one changed the system memory framebuffer from Write-Combine
to the default caching.
Before the second commit, the mgag200 driver used to unmap the
framebuffer after each frame, which implicitly does a cache flush.
Both regressions are fixed by the following patch, which forces a cache
flush after each frame, reverting to almost v5.9 behavior.
This is necessary only if you have strong real-time constraints, so I
put the cache flush under the CONFIG_PREEMPT_RT config flag.
Also, clflush is only available on x86 (and this issue has only been
reproduced on x86_64), so it's also under the CONFIG_X86 config flag.

Fixes: 0b34d58b6c32 ("drm/mgag200: Enable caching for SHMEM pages")
Fixes: 4862ffaec523 ("drm/mgag200: Move vmap out of commit tail")
Signed-off-by: Jocelyn Falempe <jfalempe@xxxxxxxxxx>
---
 drivers/gpu/drm/mgag200/mgag200_mode.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/mgag200/mgag200_mode.c b/drivers/gpu/drm/mgag200/mgag200_mode.c
index af3ce5a6a636..11660cd29cea 100644
--- a/drivers/gpu/drm/mgag200/mgag200_mode.c
+++ b/drivers/gpu/drm/mgag200/mgag200_mode.c
@@ -13,6 +13,7 @@
 
 #include <drm/drm_atomic.h>
 #include <drm/drm_atomic_helper.h>
+#include <drm/drm_cache.h>
 #include <drm/drm_damage_helper.h>
 #include <drm/drm_format_helper.h>
 #include <drm/drm_fourcc.h>
@@ -436,6 +437,10 @@ static void mgag200_handle_damage(struct mga_device *mdev, const struct iosys_ma
 
 	iosys_map_incr(&dst, drm_fb_clip_offset(fb->pitches[0], fb->format, clip));
 	drm_fb_memcpy(&dst, fb->pitches, vmap, fb, clip);
+	/* On RT systems, flushing the cache reduces the latency for other RT tasks */
+#if defined(CONFIG_X86) && defined(CONFIG_PREEMPT_RT)
+	drm_clflush_virt_range(vmap, fb->height * fb->pitches[0]);
+#endif
 }
 
 /*

base-commit: 2dde18cd1d8fac735875f2e4987f11817cc0bc2c
-- 
2.41.0