Re: [PATCH] drm/i915/ehl: unconditionally flush the pages on acquire

Daniel Vetter <daniel@xxxxxxxx> · Fri, 9 Jul 2021 18:13:12 +0200

On Fri, Jul 9, 2021 at 5:19 PM Matthew Auld <matthew.auld@xxxxxxxxx> wrote:
>
> EHL and JSL add the 'Bypass LLC' MOCS entry, which should make it
> possible for userspace to bypass the GTT caching bits set by the kernel,
> as per the given object cache_level. This is troublesome since the heavy
> flush we apply when first acquiring the pages is skipped if the kernel
> thinks the object is coherent with the GPU. As a result it might be
> possible to bypass the cache and read the contents of the page directly,
> which could be stale data. If it's just a case of userspace shooting
> themselves in the foot then so be it, but since i915 takes the stance of
> always zeroing memory before handing it to userspace, we need to prevent
> this.
>
> BSpec: 34007
> References: 046091758b50 ("Revert "drm/i915/ehl: Update MOCS table for EHL"")
> Signed-off-by: Matthew Auld <matthew.auld@xxxxxxxxx>
> Cc: Tejas Upadhyay <tejaskumarx.surendrakumar.upadhyay@xxxxxxxxx>
> Cc: Francisco Jerez <francisco.jerez.plata@xxxxxxxxx>
> Cc: Lucas De Marchi <lucas.demarchi@xxxxxxxxx>
> Cc: Jon Bloomfield <jon.bloomfield@xxxxxxxxx>
> Cc: Chris Wilson <chris.p.wilson@xxxxxxxxx>
> Cc: Matt Roper <matthew.d.roper@xxxxxxxxx>
> ---
>  drivers/gpu/drm/i915/gem/i915_gem_shmem.c | 29 +++++++++++++++++++++--
>  1 file changed, 27 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
> index 6a04cce188fc..7e9ec68cce9e 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
> @@ -298,11 +298,12 @@ __i915_gem_object_release_shmem(struct drm_i915_gem_object *obj,
>
>  void i915_gem_object_put_pages_shmem(struct drm_i915_gem_object *obj, struct sg_table *pages)
>  {
> +       struct drm_i915_private *i915 = to_i915(obj->base.dev);
>         struct sgt_iter sgt_iter;
>         struct pagevec pvec;
>         struct page *page;
>
> -       GEM_WARN_ON(IS_DGFX(to_i915(obj->base.dev)));
> +       GEM_WARN_ON(IS_DGFX(i915));
>         __i915_gem_object_release_shmem(obj, pages, true);
>
>         i915_gem_gtt_finish_pages(obj, pages);
> @@ -325,7 +326,12 @@ void i915_gem_object_put_pages_shmem(struct drm_i915_gem_object *obj, struct sg_
>         }
>         if (pagevec_count(&pvec))
>                 check_release_pagevec(&pvec);
> -       obj->mm.dirty = false;
> +
> +       /* See the comment in shmem_object_init() for why we need this */
> +       if (IS_JSL_EHL(i915) && obj->flags & I915_BO_ALLOC_USER)
> +               obj->mm.dirty = true;
> +       else
> +               obj->mm.dirty = false;
>
>         sg_free_table(pages);
>         kfree(pages);
> @@ -539,6 +545,25 @@ static int shmem_object_init(struct intel_memory_region *mem,
>
>         i915_gem_object_set_cache_coherency(obj, cache_level);
>
> +       /*
> +        * EHL and JSL add the 'Bypass LLC' MOCS entry, which should make it
> +        * possible for userspace to bypass the GTT caching bits set by the
> +        * kernel, as per the given object cache_level. This is troublesome
> +        * since the heavy flush we apply when first gathering the pages is
> +        * skipped if the kernel thinks the object is coherent with the GPU. As
> +        * a result it might be possible to bypass the cache and read the
> +        * contents of the page directly, which could be stale data. If it's
> +        * just a case of userspace shooting themselves in the foot then so be
> +        * it, but since i915 takes the stance of always zeroing memory before
> +        * handing it to userspace, we need to prevent this.
> +        *
> +        * By setting cache_dirty here we make the clflush when first acquiring
> +        * the pages unconditional on such platforms. We also set this again in
> +        * put_pages().
> +        */
> +       if (IS_JSL_EHL(i915) && flags & I915_BO_ALLOC_USER)
> +               obj->cache_dirty = true;

I don't think this is enough, because every time we drop our pages
shmem could move them around or swap them out, and we get fresh ones.
So we need to re-force this every time we grab new pages.

Also there's already a pile of other cases (well not WB coherency
mode) where userspace can be clever and bypass the coherency if we
don't clflush first. I think it'd be really good to have all that in
one places as much as possible.

Finally this is extremely tricky code, and obj->cache_dirty and
related stuff isn't really documented. kerneldoc for all that would be
really good.
-Daniel

> +
>         i915_gem_object_init_memory_region(obj, mem);
>
>         return 0;
> --
> 2.26.3
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx