On Wed, Apr 29, 2015 at 03:01:58PM +0530, ankitprasad.r.sharma@xxxxxxxxx wrote: > From: Ankitprasad Sharma <ankitprasad.r.sharma@xxxxxxxxx> > > This patch adds support for extending the pread/pwrite functionality > for objects not backed by shmem. The access will be made through > gtt interface. > This will cover prime objects as well as stolen memory backed objects > but for userptr objects it is still forbidden. > > testcase: igt/gem_create_stolen > > Signed-off-by: Ankitprasad Sharma <ankitprasad.r.sharma@xxxxxxxxx> > --- > drivers/gpu/drm/i915/i915_drv.h | 7 +++ > drivers/gpu/drm/i915/i915_gem.c | 125 +++++++++++++++++++++++++++++++++++++--- > 2 files changed, 123 insertions(+), 9 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h > index 21a2b1f..a568cd1 100644 > --- a/drivers/gpu/drm/i915/i915_drv.h > +++ b/drivers/gpu/drm/i915/i915_drv.h > @@ -3270,4 +3270,11 @@ static inline void i915_trace_irq_get(struct intel_engine_cs *ring, > i915_gem_request_assign(&ring->trace_irq_req, req); > } > > +/* Checking if pread/pwrite is allowed for the object */ > +inline static bool i915_gem_obj_is_prw_allowed(struct drm_i915_gem_object *obj) > +{ > + /* pread/pwrite is forbidden for userptrs */ > + return !obj->userptr.mm; > +} Not quite. Just wasn't implemented (as there wasn't much point since we couldn't provide other CPU access) but there is no reason to arbitrary limit the API - especially as you are in the process of removing that arbitrary limitation! > + > #endif > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c > index 81c5381..3491bd3 100644 > --- a/drivers/gpu/drm/i915/i915_gem.c > +++ b/drivers/gpu/drm/i915/i915_gem.c > @@ -631,6 +631,96 @@ shmem_pread_slow(struct page *page, int shmem_page_offset, int page_length, > return ret ? - EFAULT : 0; > } > > +static inline int > +slow_user_access(struct io_mapping *mapping, > + loff_t page_base, int page_offset, > + char __user *user_data, > + int length, bool write) > +{ > + void __iomem *vaddr_inatomic; > + void *vaddr; > + unsigned long unwritten; > + > + vaddr_inatomic = io_mapping_map_wc(mapping, page_base); > + /* We can use the cpu mem copy function because this is X86. */ > + vaddr = (void __force *)vaddr_inatomic + page_offset; > + if (write) > + unwritten = __copy_from_user(vaddr, user_data, length); > + else > + unwritten = __copy_to_user(user_data, vaddr, length); > + > + io_mapping_unmap(vaddr_inatomic); > + return unwritten; > +} > + > +static int > +i915_gem_gtt_pread_pwrite(struct drm_device *dev, > + struct drm_i915_gem_object *obj, uint64_t size, > + uint64_t data_offset, uint64_t data_ptr, bool write) > +{ > + struct drm_i915_private *dev_priv = dev->dev_private; > + char __user *user_data; > + ssize_t remain; > + loff_t offset, page_base; > + int page_offset, page_length, ret = 0; > + > + ret = i915_gem_obj_ggtt_pin(obj, 0, PIN_MAPPABLE); > + if (ret) > + goto out; > + > + ret = i915_gem_object_set_to_gtt_domain(obj, write); > + if (ret) > + goto out_unpin; > + > + ret = i915_gem_object_put_fence(obj); > + if (ret) > + goto out_unpin; > + > + user_data = to_user_ptr(data_ptr); > + remain = size; > + > + offset = i915_gem_obj_ggtt_offset(obj) + data_offset; > + > + if (write) > + intel_fb_obj_invalidate(obj, NULL, ORIGIN_GTT); > + > + while (remain > 0) { > + /* Operation in this page > + * > + * page_base = page offset within aperture > + * page_offset = offset within page > + * page_length = bytes to copy for this page > + */ > + page_base = offset & PAGE_MASK; > + page_offset = offset_in_page(offset); > + page_length = remain; > + if ((page_offset + remain) > PAGE_SIZE) > + page_length = PAGE_SIZE - page_offset; > + > + /* This is a slow read/write as it tries to read from > + * and write to user memory which may result into page > + * faults > + */ Hmm, I am in the process of rewriting this function (GTT pwrite) to reduce aperture pressure (which will also be important here). However for your use case, we do need to drop the locks around slow_user_access() so that we can use non-atomic copy functions and *pagefault*. We do need the full level of complexity like i915_gem_shmem_pread() -Chris -- Chris Wilson, Intel Open Source Technology Centre _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx http://lists.freedesktop.org/mailman/listinfo/intel-gfx