On Thu, Oct 13, 2016 at 02:17:52PM +0300, Joonas Lahtinen wrote:
> On pe, 2016-10-07 at 10:46 +0100, Chris Wilson wrote:
> > +/* Per-page copy function for the shmem pwrite fastpath.
> > + * Flushes invalid cachelines before writing to the target if
> > + * needs_clflush_before is set and flushes out any written cachelines after
> > + * writing if needs_clflush is set.
> > + */
> >  static int
> > -i915_gem_shmem_pwrite(struct drm_device *dev,
> > -		      struct drm_i915_gem_object *obj,
> > -		      struct drm_i915_gem_pwrite *args,
> > -		      struct drm_file *file)
> > +shmem_pwrite(struct page *page, int offset, int len, char __user *user_data,
> > +	     bool page_do_bit17_swizzling,
> > +	     bool needs_clflush_before,
> > +	     bool needs_clflush_after)
>
> I remember having complaints of two bool arguments in same func. Could
> these three be fixed while mangling the code otherwise too? Or as a
> follow-up.

They are three different evaluations on each loop. It looks to be a bit
of a hassle to amalgamate them into one in the caller. Though

	(page_to_phys(page) & BIT(17)) | ((offset | length) & 63) | (needs_clflush & 2)

almost works (just need to choose another bit for needs_clflush). I'm
not convinced you'd like the unpacking in shmem_pwrite() ;)

Something like:

static int
shmem_pwrite_slow(struct page *page, int offset, int length,
		  char __user *user_data,
		  unsigned flags)
{
	char *vaddr;
	int ret;

	vaddr = kmap(page);

	if (unlikely(flags & (BIT(17) | 63)))
		shmem_clflush_swizzled_range(vaddr + offset, length, flags);

	if (flags & BIT(17))
		ret = __copy_from_user_swizzled(vaddr, offset, user_data, length);
	else
		ret = __copy_from_user(vaddr + offset, user_data, length);

	if (flags & BIT(16))
		shmem_clflush_swizzled_range(vaddr + offset, length, flags);

	kunmap(page);

	return ret ? -EFAULT : 0;
}

> > +static int
> > +i915_gem_shmem_pwrite(struct drm_i915_gem_object *obj,
> > +		       const struct drm_i915_gem_pwrite *args)
> > +{
> > +	void __user *user_data;
> > +	u64 remain;
> > +	unsigned int obj_do_bit17_swizzling;
> > +	unsigned int partial_cacheline_write;
>
> partial_cacheline_mask might be more descriptive?

To me that says we want to mask off the partial cacheline, or use it
constructively. Imho, len & partial_cacheline_write reads more clearly
as a boolean question than len & partial_cacheline_mask.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
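
[For illustration only: the caller-side packing hinted at above might look
roughly like the sketch below. This is not part of the patch; the helper
name (shmem_pwrite_flags) and the exact bit layout (BIT(17) for swizzling,
the low 6 bits for a partial-cacheline write, BIT(16) for clflush-after)
are assumptions chosen to line up with the shmem_pwrite_slow() sketch,
using the kernel's page_to_phys() and BIT() helpers.]

/*
 * Illustrative sketch only: fold the three per-page conditions into a
 * single flags word so shmem_pwrite_slow() can test them as above.
 */
static unsigned
shmem_pwrite_flags(struct page *page, int offset, int length,
		   bool obj_do_bit17_swizzling,
		   bool needs_clflush_before,
		   bool needs_clflush_after)
{
	unsigned flags = 0;

	/* Per-page bit17 swizzling depends on the page's physical address. */
	if (obj_do_bit17_swizzling)
		flags |= page_to_phys(page) & BIT(17);

	/* Low bits mark a write that does not cover whole cachelines. */
	if (needs_clflush_before)
		flags |= (offset | length) & 63;

	/* A separate bit for flushing the written range afterwards. */
	if (needs_clflush_after)
		flags |= BIT(16);

	return flags;
}

[The per-page loop would then pass the result as the flags argument of
shmem_pwrite_slow(page, offset, length, user_data, flags).]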