On Sun, May 04, 2014 at 04:48:24PM +0530, akash.goel@xxxxxxxxx wrote:
> From: Akash Goel <akash.goel@xxxxxxxxx>
>
> This patch could help to reduce the time, 'struct_mutex' is kept
> locked during either the exec-buffer path or Page fault
> handling path as now the backing pages are requested from shmem layer
> without holding the 'struct_mutex'.

I'm not keen on having an extra pass inside execbuffer for what should be
relatively rare, but in discussion the idea of adding a flag to create2
(I915_CREATE_POPULATE) should cover the usecase nicely (having to pay the
shmemfs allocation+zero price without holding struct_mutex).

> +void
> +i915_gem_object_shmem_preallocate(struct drm_i915_gem_object *obj)

This is unlocked. We have to be loud and clear about that:

static void __i915_gem_object_shmemfs_populate(obj)

> +{
> +	int page_count, i;
> +	struct address_space *mapping;
> +	struct page *page;
> +	gfp_t gfp;
> +
> +	if (obj->pages)
> +		return;

if (READ_ONCE(obj->pages))
	return;

> +	if (obj->madv != I915_MADV_WILLNEED) {
> +		DRM_ERROR("Attempting to preallocate a purgeable object\n");

Ideally if (READ_ONCE(obj->madv)... However, that requires a patch to move
madv to unsigned obj->flags.

A user controllable *ERROR*? No, just make it debug.

> +		return;
> +	}
> +
> +	if (obj->base.filp) {
> +		int ret;
> +		struct inode *inode = file_inode(obj->base.filp);
> +		struct shmem_inode_info *info = SHMEM_I(inode);
> +		if (!inode)
> +			return;

Assume it exists. If it doesn't it is an internal bug so we should just
GPF out of here.

> +		/* The alloced field stores how many data pages are allocated
> +		 * to the file. If already shmem space has been allocated for
> +		 * the object, then we can simply return */
> +		spin_lock(&info->lock);
> +		ret = info->alloced;
> +		spin_unlock(&info->lock);
> +		if (ret > 0) {
> +			DRM_DEBUG_DRIVER("Already shmem space alloced for obj %p, %d pages\n",
> +					 obj, ret);
> +			return;
> +		}

I'm not convinced this is of much use though. If the page is already
allocated the shmem_read_mapping_page_gfp should be quick. Given the
suggestion that we move this to create, it is clear that this won't be an
issue.

> +	} else
> +		return;
> +
> +	BUG_ON(obj->pages_pin_count);

Requires struct_mutex; ignore.

> +
> +	/* Assert that the object is not currently in any GPU domain. As it
> +	 * wasn't in the GTT, there shouldn't be any way it could have been in
> +	 * a GPU cache
> +	 */
> +	BUG_ON(obj->base.read_domains & I915_GEM_GPU_DOMAINS);
> +	BUG_ON(obj->base.write_domain & I915_GEM_GPU_DOMAINS);

Requires struct_mutex; ignore.

> +	trace_i915_gem_obj_prealloc_start(obj, obj->base.size);

You only need to pass obj to the tracer, it can work out the size itself.

> +	page_count = obj->base.size / PAGE_SIZE;
> +
> +	/* Get the list of pages out of our struct file
> +	 * Fail silently without starting the shrinker
> +	 */
> +	mapping = file_inode(obj->base.filp)->i_mapping;
> +	gfp = mapping_gfp_mask(mapping);
> +	gfp |= __GFP_NORETRY | __GFP_NOWARN | __GFP_NO_KSWAPD;
> +	gfp &= ~(__GFP_IO | __GFP_WAIT);

Interesting. Disabling shrinker/oom for these pages - we will hit it later
of course. But for the purposes of a quick preallocation, seems fine.
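To be concrete, the shape I have in mind is roughly the below - a completely
untested sketch, where the READ_ONCE(obj->madv) check presumes the separate
patch to move madv into obj->flags, and the body is otherwise just the
patch's own code with the locked-only checks and asserts dropped:

static void
__i915_gem_object_shmemfs_populate(struct drm_i915_gem_object *obj)
{
	struct address_space *mapping;
	gfp_t gfp;
	int i;

	/* Unlocked, purely advisory early-outs - a stale read here only
	 * costs us some wasted work, it cannot corrupt state.
	 */
	if (READ_ONCE(obj->pages))
		return;

	if (READ_ONCE(obj->madv) != I915_MADV_WILLNEED) {
		DRM_DEBUG_DRIVER("preallocating a purgeable object\n");
		return;
	}

	/* Opportunistic allocation only: no IO, no waiting, no shrinker */
	mapping = file_inode(obj->base.filp)->i_mapping;
	gfp = mapping_gfp_mask(mapping);
	gfp |= __GFP_NORETRY | __GFP_NOWARN | __GFP_NO_KSWAPD;
	gfp &= ~(__GFP_IO | __GFP_WAIT);

	for (i = 0; i < obj->base.size >> PAGE_SHIFT; i++) {
		struct page *page;

		page = shmem_read_mapping_page_gfp(mapping, i, gfp);
		if (IS_ERR(page))
			break;

		/* Drop the extra reference, get_pages_gtt grabs its own */
		page_cache_release(page);
	}
}

Then create2 with I915_CREATE_POPULATE would just call this right after the
object is created, before anybody needs struct_mutex.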
> +	for (i = 0; i < page_count; i++) {
> +		page = shmem_read_mapping_page_gfp(mapping, i, gfp);
> +		if (IS_ERR(page)) {
> +			DRM_DEBUG_DRIVER("Failure for obj(%p), size(%x) at page(%d)\n",
> +					 obj, obj->base.size, i);
> +			break;
> +		}
> +		/* Decrement the extra ref count on the returned page,
> +		   otherwise when 'get_pages_gtt' will be called later on
> +		   in the regular path, it will also increment the ref count,
> +		   which will disturb the ref count management */
> +		page_cache_release(page);
> +	}
> +
> +	trace_i915_gem_obj_prealloc_end(obj);
> +}

-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre