On Thu, Oct 13, 2016 at 02:17:52PM +0300, Joonas Lahtinen wrote:
> On pe, 2016-10-07 at 10:46 +0100, Chris Wilson wrote:
> > +/* Per-page copy function for the shmem pwrite fastpath.
> > + * Flushes invalid cachelines before writing to the target if
> > + * needs_clflush_before is set and flushes out any written cachelines after
> > + * writing if needs_clflush is set.
> > + */
> >  static int
> > -i915_gem_shmem_pwrite(struct drm_device *dev,
> > -		      struct drm_i915_gem_object *obj,
> > -		      struct drm_i915_gem_pwrite *args,
> > -		      struct drm_file *file)
> > +shmem_pwrite(struct page *page, int offset, int len, char __user *user_data,
> > +	     bool page_do_bit17_swizzling,
> > +	     bool needs_clflush_before,
> > +	     bool needs_clflush_after)
>
> I remember having complaints of two bool arguments in same func. Could
> these three be fixed while mangling the code otherwise too? Or as a
> follow-up.

They are three different evaluations on each loop. It looks to be a bit
of a hassle to amalgamate them into one in the caller. Though

	(page_to_phys(page) & BIT(17)) | ((offset | length) & 63) | (needs_clflush & 2)

almost works (just need to choose another bit for needs_clflush). I'm
not convinced you'd like the unpacking in shmem_pwrite() ;)

Something like:

static int
shmem_pwrite_slow(struct page *page, int offset, int length,
		  char __user *user_data,
		  unsigned flags)
{
	char *vaddr;
	int ret;

	vaddr = kmap(page);

	if (unlikely(flags & (BIT(17) | 63)))
		shmem_clflush_swizzled_range(vaddr + offset, length, flags);

	if (flags & BIT(17))
		ret = __copy_from_user_swizzled(vaddr, offset, user_data, length);
	else
		ret = __copy_from_user(vaddr + offset, user_data, length);

	if (flags & BIT(16))
		shmem_clflush_swizzled_range(vaddr + offset, length, flags);

	kunmap(page);

	return ret ? -EFAULT : 0;
}

> > +static int
> > +i915_gem_shmem_pwrite(struct drm_i915_gem_object *obj,
> > +		       const struct drm_i915_gem_pwrite *args)
> > +{
> > +	void __user *user_data;
> > +	u64 remain;
> > +	unsigned int obj_do_bit17_swizzling;
> > +	unsigned int partial_cacheline_write;
>
> partial_cacheline_mask might be more descriptive?

To me that says we want to mask off the partial cacheline, or use it
constructively. Imho, len & partial_cacheline_write reads more clearly
as a boolean question than len & partial_cacheline_mask.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
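
[For illustration only: the caller-side packing hinted at above might look
roughly like the sketch below. This is not part of the patch; the helper
name (shmem_pwrite_flags) and the exact bit layout (BIT(17) for swizzling,
the low 6 bits for a partial-cacheline write, BIT(16) for clflush-after)
are assumptions chosen to line up with the shmem_pwrite_slow() sketch,
using the kernel's page_to_phys() and BIT() helpers.]

/*
 * Illustrative sketch only: fold the three per-page conditions into a
 * single flags word so shmem_pwrite_slow() can test them as above.
 */
static unsigned
shmem_pwrite_flags(struct page *page, int offset, int length,
		   bool obj_do_bit17_swizzling,
		   bool needs_clflush_before,
		   bool needs_clflush_after)
{
	unsigned flags = 0;

	/* Per-page bit17 swizzling depends on the page's physical address. */
	if (obj_do_bit17_swizzling)
		flags |= page_to_phys(page) & BIT(17);

	/* Low bits mark a write that does not cover whole cachelines. */
	if (needs_clflush_before)
		flags |= (offset | length) & 63;

	/* A separate bit for flushing the written range afterwards. */
	if (needs_clflush_after)
		flags |= BIT(16);

	return flags;
}

[The per-page loop would then pass the result as the flags argument of
shmem_pwrite_slow(page, offset, length, user_data, flags).]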