One thing I've forgotten: The Threading in your patch-bomb seems busted, the patches are somehow not linked up to the cover letter. If you use a recent git and format-patch/send-email with default settings it should all work out ok though. With proper in-reply linking patch bomb threads tend to get splattered all over the place in mail clients making it hard to keep an overview of the discussions. -Daniel On Thu, Jan 9, 2014 at 8:27 AM, Daniel Vetter <daniel@xxxxxxxx> wrote: > On Thu, Jan 09, 2014 at 11:01:02AM +0530, akash.goel@xxxxxxxxx wrote: >> From: Akash Goel <akash.goel@xxxxxxxxx> >> >> On VLV, 64MB of system memory was being reserved for stolen >> area, but ~8MB of it was being utilized. >> Increased the utilization of Stolen area by allocating >> User created Frame buffers(only X Tiled) from it. >> User Frame buffers are suitable for allocation from stolen area, >> as its very unlikely that they are not accessed from CPU side. >> And its even more unlikely that the Tiled(X) buffers will be >> accessed directly from the CPU side. And any allocation >> from stolen area is not directly CPU accessible, but >> accessible only through the aperture space. >> With 1080p sized frame buffers (32 bpp), the allocation of 6-7 >> frame buffers itself gives almost the full utilization of >> stolen area. >> >> Signed-off-by: Akash Goel <akash.goel@xxxxxxxxx> >> --- >> drivers/gpu/drm/i915/i915_drv.h | 1 + >> drivers/gpu/drm/i915/i915_gem.c | 21 ++++++++ >> drivers/gpu/drm/i915/i915_gem_stolen.c | 93 ++++++++++++++++++++++++++++++++++ >> 3 files changed, 115 insertions(+) >> >> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h >> index 1bcc543..db29537 100644 >> --- a/drivers/gpu/drm/i915/i915_drv.h >> +++ b/drivers/gpu/drm/i915/i915_drv.h >> @@ -2376,6 +2376,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev, >> u32 stolen_offset, >> u32 gtt_offset, >> u32 size); >> +void i915_gem_object_move_to_stolen(struct drm_i915_gem_object *obj); >> void i915_gem_object_release_stolen(struct drm_i915_gem_object *obj); >> >> /* i915_gem_tiling.c */ >> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c >> index 656406d..24f93ef 100644 >> --- a/drivers/gpu/drm/i915/i915_gem.c >> +++ b/drivers/gpu/drm/i915/i915_gem.c >> @@ -3915,6 +3915,27 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj, >> } >> } >> >> + /* Try to allocate the physical space for the GEM object, >> + * representing the User frame buffer, from the stolen area. >> + * But if there is no sufficient free space left in stolen >> + * area, will fallback to shmem. >> + */ >> + if (obj->user_fb == 1) { >> + if (obj->pages == NULL) { >> + if (obj->tiling_mode == I915_TILING_X) { >> + /* Tiled(X) Scanout buffers are more suitable >> + * for allocation from stolen area, as its very >> + * unlikely that they will be accessed directly >> + * from the CPU side and any allocation from >> + * stolen area is not directly CPU accessible, >> + * but accessible only through the aperture >> + * space. >> + */ >> + i915_gem_object_move_to_stolen(obj); >> + } >> + } >> + } > > Neat hack ;-) But I think for upstream we need to address a few more > concerns. Random comments: > > - The vlv patch subject is a bit misleading here since this is about > stolen memory usage in general across all platforms. > > - object_pin is the wrong place to change the backing storage since > someone else could already have pinned the it (e.g. through dma-buf). We > need to do this earlier before calling down into ->get_pages. > > - If we allow opportunistical placement of objects into stolen memory I > think we should also fully support purgeable objects so that we can > dynamically scale to workloads. Or at least to be able to kick out stuff > in case the kernel needs the contiguous memory for something more > important. So if we couldn't find a suitable stolen block we'd scan > through all objects with stolen memory backing and check whether any > purgeable objects could be kicked out. > > - For your usecase, have you tried to simply reduce the stolen area as > much as possible? Our friendly competition on ARM SoCs seems to have > mostly moved away from gfx reserved chunks towards more flexible > approaches like CMA. Giving stolen back to the linux page allocator > should be possible, but I haven't really looked into that yet all that > much ... > > - For upstream I think changing the personality of buffer objects behind > userspace's back is a bit too frisky wrt breaking something. I prefer if > userspace opts-in explicitly by passing a flag to the create ioctl > stating that stolen memory is allowed/preferred for this allocation. > Chris Wilson posted an rfc patch a while back to add a create2 ioctl > (which at the same time also allows us to set the caching, tiling and > any other object parameters). > > - We've had some "fun" last time around we've tried to use stolen more > seriously with scanout buffers being strangely offset/corrupted. > Hopefully this is fixed by your sg->offset patch, but I can't find the > reports nor recall the details right now. > > Cheers, Daniel >> + >> if (!i915_gem_obj_bound(obj, vm)) { >> ret = i915_gem_object_bind_to_vm(obj, vm, alignment, >> map_and_fenceable, >> diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c >> index 5cf97d6..29c22f9 100644 >> --- a/drivers/gpu/drm/i915/i915_gem_stolen.c >> +++ b/drivers/gpu/drm/i915/i915_gem_stolen.c >> @@ -29,6 +29,7 @@ >> #include <drm/drmP.h> >> #include <drm/i915_drm.h> >> #include "i915_drv.h" >> +#include <linux/shmem_fs.h> >> >> /* >> * The BIOS typically reserves some of the system's memory for the exclusive >> @@ -370,6 +371,98 @@ i915_gem_object_create_stolen(struct drm_device *dev, u32 size) >> return NULL; >> } >> >> +void >> +i915_gem_object_move_to_stolen(struct drm_i915_gem_object *obj) >> +{ >> + struct drm_device *dev = obj->base.dev; >> + struct drm_i915_private *dev_priv = dev->dev_private; >> + struct drm_mm_node *stolen; >> + u32 size = obj->base.size; >> + int ret = 0; >> + >> + if (!IS_VALLEYVIEW(dev)) { >> + return; >> + } >> + >> + if (obj->stolen) { >> + BUG_ON(obj->pages == NULL); >> + return; >> + } >> + >> + if (!drm_mm_initialized(&dev_priv->mm.stolen)) >> + return; >> + >> + if (size == 0) >> + return; >> + >> + /* Check if already shmem space has been allocated for the object >> + * or not. We cannot rely upon on the value of 'pages' field for this. >> + * As even though if the 'pages' field is NULL, it does not actually >> + * indicate that the backing physical space (shmem) is currently not >> + * reserved for the object, as the object may not get purged/truncated >> + * on the calll to 'put_pages_gtt'. >> + */ >> + if (obj->base.filp) { >> + struct inode *inode = file_inode(obj->base.filp); >> + struct shmem_inode_info *info = SHMEM_I(inode); >> + if (!inode) >> + return; >> + spin_lock(&info->lock); >> + /* The alloced field stores how many data pages are >> + * allocated to the file. >> + */ >> + ret = info->alloced; >> + spin_unlock(&info->lock); >> + if (ret > 0) { >> + DRM_DEBUG_DRIVER( >> + "Already shmem space alloced, %d pges\n", ret); >> + return; >> + } >> + } >> + >> + stolen = kzalloc(sizeof(*stolen), GFP_KERNEL); >> + if (!stolen) >> + return; >> + >> + ret = drm_mm_insert_node(&dev_priv->mm.stolen, stolen, size, >> + 4096, DRM_MM_SEARCH_DEFAULT); >> + if (ret) { >> + kfree(stolen); >> + DRM_DEBUG_DRIVER("ran out of stolen space\n"); >> + return; >> + } >> + >> + /* Set up the object to use the stolen memory, >> + * backing store no longer managed by shmem layer */ >> + drm_gem_object_release(&(obj->base)); >> + obj->base.filp = NULL; >> + obj->ops = &i915_gem_object_stolen_ops; >> + >> + obj->pages = i915_pages_create_for_stolen(dev, >> + stolen->start, stolen->size); >> + if (obj->pages == NULL) >> + goto cleanup; >> + >> + i915_gem_object_pin_pages(obj); >> + list_add_tail(&obj->global_list, &dev_priv->mm.unbound_list); >> + obj->has_dma_mapping = true; >> + obj->stolen = stolen; >> + >> + DRM_DEBUG_DRIVER("Obj moved to stolen, ptr = %p, size = %x\n", >> + obj, size); >> + >> + obj->base.read_domains = I915_GEM_DOMAIN_CPU | I915_GEM_DOMAIN_GTT; >> + obj->cache_level = HAS_LLC(dev) ? I915_CACHE_LLC : I915_CACHE_NONE; >> + >> + /* No zeroing-out of buffers allocated from stolen area */ >> + return; >> + >> +cleanup: >> + drm_mm_remove_node(stolen); >> + kfree(stolen); >> + return; >> +} >> + >> struct drm_i915_gem_object * >> i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev, >> u32 stolen_offset, >> -- >> 1.8.5.2 >> >> _______________________________________________ >> Intel-gfx mailing list >> Intel-gfx@xxxxxxxxxxxxxxxxxxxxx >> http://lists.freedesktop.org/mailman/listinfo/intel-gfx > > -- > Daniel Vetter > Software Engineer, Intel Corporation > +41 (0) 79 365 57 48 - http://blog.ffwll.ch -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx http://lists.freedesktop.org/mailman/listinfo/intel-gfx