On Tue, Nov 12, 2024 at 04:28:28PM +0000, Matthew Auld wrote: > Starting from LNL, CCS has moved over to flat CCS model where there is > now dedicated memory reserved for storing compression state. On > platforms like LNL this reserved memory lives inside graphics stolen > memory, which is not treated like normal RAM and is therefore skipped by > the core kernel when creating the hibernation image. Currently if > something was compressed and we enter hibernation all the corresponding > CCS state is lost on such HW, resulting in corrupted memory. To fix this > evict user buffers from TT -> SYSTEM to ensure we take a snapshot of the > raw CCS state when entering hibernation, where upon resuming we can > restore the raw CCS state back when next validating the buffer. This has > been confirmed to fix display corruption on LNL when coming back from > hibernation. > > Fixes: cbdc52c11c9b ("drm/xe/xe2: Support flat ccs") > Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/3409 > Signed-off-by: Matthew Auld <matthew.auld@xxxxxxxxx> > Cc: Matthew Brost <matthew.brost@xxxxxxxxx> > Cc: <stable@xxxxxxxxxxxxxxx> # v6.8+ > --- > drivers/gpu/drm/xe/xe_bo_evict.c | 13 ++++++++++++- > 1 file changed, 12 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/xe/xe_bo_evict.c b/drivers/gpu/drm/xe/xe_bo_evict.c > index b01bc20eb90b..8fb2be061003 100644 > --- a/drivers/gpu/drm/xe/xe_bo_evict.c > +++ b/drivers/gpu/drm/xe/xe_bo_evict.c > @@ -35,10 +35,21 @@ int xe_bo_evict_all(struct xe_device *xe) > int ret; > > /* User memory */ > - for (mem_type = XE_PL_VRAM0; mem_type <= XE_PL_VRAM1; ++mem_type) { > + for (mem_type = XE_PL_TT; mem_type <= XE_PL_VRAM1; ++mem_type) { > struct ttm_resource_manager *man = > ttm_manager_type(bdev, mem_type); > > + /* > + * On igpu platforms with flat CCS we need to ensure we save and restore any CCS > + * state since this state lives inside graphics stolen memory which doesn't survive > + * hibernation. > + * > + * This can be further improved by only evicting objects that we know have actually > + * used a compression enabled PAT index. yeap, but for now let's keep it simple... Reviewed-by: Rodrigo Vivi <rodrigo.vivi@xxxxxxxxx> > + */ > + if (mem_type == XE_PL_TT && (IS_DGFX(xe) || !xe_device_has_flat_ccs(xe))) > + continue; > + > if (man) { > ret = ttm_resource_manager_evict_all(bdev, man); > if (ret) > -- > 2.47.0 >