> -----Original Message-----
> From: Nguyen, Michael H
> Sent: Wednesday, December 10, 2014 4:34 PM
> To: Bloomfield, Jon
> Cc: Daniel Vetter; intel-gfx@xxxxxxxxxxxxxxxxxxxxx
> Subject: Re: [PATCH v6 1/5] drm/i915: Implement a framework for
> batch buffer pools
>
> Hi Jon,
>
> On 12/10/2014 03:06 AM, Bloomfield, Jon wrote:
> >> -----Original Message-----
> >> From: Intel-gfx [mailto:intel-gfx-bounces@xxxxxxxxxxxxxxxxxxxxx] On
> >> Behalf Of Daniel Vetter
> >> Sent: Tuesday, December 09, 2014 1:18 PM
> >> To: Nguyen, Michael H
> >> Cc: Brad Volkin; intel-gfx@xxxxxxxxxxxxxxxxxxxxx
> >> Subject: Re: [PATCH v6 1/5] drm/i915: Implement a
> >> framework for batch buffer pools
> >>
> >> On Mon, Dec 08, 2014 at 02:33:46PM -0800, michael.h.nguyen@xxxxxxxxx
> >> wrote:
> >>> From: Brad Volkin <bradley.d.volkin@xxxxxxxxx>
> >>>
> >>> This adds a small module for managing a pool of batch buffers.
> >>> The only current use case is for the command parser, as described in
> >>> the kerneldoc in the patch. The code is simple, but separating it
> >>> out makes it easier to change the underlying algorithms and to
> >>> extend to future use cases should they arise.
> >>>
> >>> The interface is simple: init to create an empty pool, fini to clean
> >>> it up, get to obtain a new buffer. Note that all buffers are
> >>> expected to be inactive before cleaning up the pool.
> >>>
> >>> Locking is currently based on the caller holding the struct_mutex.
> >>> We already do that in the places where we will use the batch pool
> >>> for the command parser.
> >>>
> >>> v2:
> >>> - s/BUG_ON/WARN_ON/ for locking assertions
> >>> - Remove the cap on pool size
> >>> - Switch from alloc/free to init/fini
> >>>
> >>> v3:
> >>> - Idiomatic looping structure in _fini
> >>> - Correct handling of purged objects
> >>> - Don't return a buffer that's too much larger than needed
> >>>
> >>> v4:
> >>> - Rebased to latest -nightly
> >>>
> >>> v5:
> >>> - Remove _put() function and clean up comments to match
> >>>
> >>> v6:
> >>> - Move purged check inside the loop (danvet, from v4 1/7 feedback)
> >>>
> >>> v7:
> >>> - Use single list instead of two. (Chris W)
> >>> - s/active_list/cache_list
> >>> - Squashed in debug patches (Chris W)
> >>> drm/i915: Add a batch pool debugfs file
> >>>
> >>> It provides some useful information about the buffers in
> >>> the global command parser batch pool.
> >>>
> >>> v2: rebase on global pool instead of per-ring pools
> >>> v3: rebase
> >>>
> >>> drm/i915: Add batch pool details to i915_gem_objects debugfs
> >>>
> >>> To better account for the potentially large memory consumption
> >>> of the batch pool.
> >>>
> >>> Issue: VIZ-4719
> >>> Signed-off-by: Brad Volkin <bradley.d.volkin@xxxxxxxxx>
> >>> ---
> >>> Documentation/DocBook/drm.tmpl | 5 ++
> >>> drivers/gpu/drm/i915/Makefile | 1 +
> >>> drivers/gpu/drm/i915/i915_debugfs.c | 71 ++++++++++++++--
> >>> drivers/gpu/drm/i915/i915_drv.h | 21 +++++
> >>> drivers/gpu/drm/i915/i915_gem.c | 1 +
> >>> drivers/gpu/drm/i915/i915_gem_batch_pool.c | 132
> >>> +++++++++++++++++++++++++++++
> >>> 6 files changed, 222 insertions(+), 9 deletions(-) create mode
> >>> 100644 drivers/gpu/drm/i915/i915_gem_batch_pool.c
> >>>
> >>> diff --git a/Documentation/DocBook/drm.tmpl
> >>> b/Documentation/DocBook/drm.tmpl index 85287cb..022923a 100644
> >>> --- a/Documentation/DocBook/drm.tmpl
> >>> +++ b/Documentation/DocBook/drm.tmpl
> >>> @@ -4029,6 +4029,11 @@ int num_ioctls;</synopsis>
> >>> !Idrivers/gpu/drm/i915/i915_cmd_parser.c
> >>> </sect2>
> >>> <sect2>
> >>> + <title>Batchbuffer Pools</title>
> >>> +!Pdrivers/gpu/drm/i915/i915_gem_batch_pool.c batch pool
> >>> +!Idrivers/gpu/drm/i915/i915_gem_batch_pool.c
> >>> + </sect2>
> >>> + <sect2>
> >>> <title>Logical Rings, Logical Ring Contexts and
> >>> Execlists</title> !Pdrivers/gpu/drm/i915/intel_lrc.c Logical Rings,
> >>> Logical Ring Contexts and Execlists
> >>> !Idrivers/gpu/drm/i915/intel_lrc.c
> >>> diff --git a/drivers/gpu/drm/i915/Makefile
> >>> b/drivers/gpu/drm/i915/Makefile index e4083e4..c8dbb37d 100644
> >>> --- a/drivers/gpu/drm/i915/Makefile
> >>> +++ b/drivers/gpu/drm/i915/Makefile
> >>> @@ -19,6 +19,7 @@ i915-$(CONFIG_DEBUG_FS) += i915_debugfs.o
> >>>
> >>> # GEM code
> >>> i915-y += i915_cmd_parser.o \
> >>> + i915_gem_batch_pool.o \
> >>> i915_gem_context.o \
> >>> i915_gem_render_state.o \
> >>> i915_gem_debug.o \
> >>> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c
> >>> b/drivers/gpu/drm/i915/i915_debugfs.c
> >>> index d0e445e..3c3bf98 100644
> >>> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> >>> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> >>> @@ -359,6 +359,33 @@ static int per_file_stats(int id, void *ptr,
> >>> void
> >> *data)
> >>> return 0;
> >>> }
> >>>
> >>> +#define print_file_stats(m, name, stats) \
> >>> + seq_printf(m, "%s: %u objects, %zu bytes (%zu active, %zu
> >>> +inactive,
> >> %zu global, %zu shared, %zu unbound)\n", \
> >>> + name, \
> >>> + stats.count, \
> >>> + stats.total, \
> >>> + stats.active, \
> >>> + stats.inactive, \
> >>> + stats.global, \
> >>> + stats.shared, \
> >>> + stats.unbound)
> >
> > "print_file_stats" might be better named "print_obj_stats" or similar.
> > As it stands there is nothing in its name to suggest its purpose is to describe
> an object.
> I guess the fnc name 'print_file_stats' was chosen b/c it takes a 'stats'
> parameter of type 'struct file_stats' and prints the members.
> Seems fair to leave it I think as its generic.

Ok, agreed.

> >
> >
> >>> +
> >>> +static void print_batch_pool_stats(struct seq_file *m,
> >>> + struct drm_i915_private *dev_priv) {
> >>> + struct drm_i915_gem_object *obj;
> >>> + struct file_stats stats;
> >>> +
> >>> + memset(&stats, 0, sizeof(stats));
> >>> +
> >>> + list_for_each_entry(obj,
> >>> + &dev_priv->mm.batch_pool.cache_list,
> >>> + batch_pool_list)
> >>> + per_file_stats(0, obj, &stats);
> >
> > Shouldn't we be holding the batch_pool lock here ?
> Think we're okay here. The callers of print_batch_pool_stats() hold
> dev->struct_mutex.
>
> Thanks,
> -Mike

But then, why do we take the batch_pool lock in i915_gem_batch_pool_info() ?
> >
> >
> >>> +
> >>> + print_file_stats(m, "batch pool", stats); }
> >>> +
> >>> #define count_vmas(list, member) do { \
> >>> list_for_each_entry(vma, list, member) { \
> >>> size += i915_gem_obj_ggtt_size(vma->obj); \ @@ -441,6
> >> +468,9 @@
> >>> static int i915_gem_object_info(struct seq_file *m, void* data)
> >>> dev_priv->gtt.mappable_end - dev_priv->gtt.base.start);
> >>>
> >>> seq_putc(m, '\n');
> >>> + print_batch_pool_stats(m, dev_priv);
> >>> +
> >>> + seq_putc(m, '\n');
> >>> list_for_each_entry_reverse(file, &dev->filelist, lhead) {
> >>> struct file_stats stats;
> >>> struct task_struct *task;
> >>> @@ -458,15 +488,7 @@ static int i915_gem_object_info(struct seq_file
> >>> *m,
> >> void* data)
> >>> */
> >>> rcu_read_lock();
> >>> task = pid_task(file->pid, PIDTYPE_PID);
> >>> - seq_printf(m, "%s: %u objects, %zu bytes (%zu active, %zu
> >> inactive, %zu global, %zu shared, %zu unbound)\n",
> >>> - task ? task->comm : "<unknown>",
> >>> - stats.count,
> >>> - stats.total,
> >>> - stats.active,
> >>> - stats.inactive,
> >>> - stats.global,
> >>> - stats.shared,
> >>> - stats.unbound);
> >>> + print_file_stats(m, task ? task->comm : "<unknown>", stats);
> >>> rcu_read_unlock();
> >>> }
> >>>
> >>> @@ -583,6 +605,36 @@ static int i915_gem_pageflip_info(struct
> >>> seq_file
> >> *m, void *data)
> >>> return 0;
> >>> }
> >>>
> >>> +static int i915_gem_batch_pool_info(struct seq_file *m, void *data) {
> >>> + struct drm_info_node *node = m->private;
> >>> + struct drm_device *dev = node->minor->dev;
> >>> + struct drm_i915_private *dev_priv = dev->dev_private;
> >>> + struct drm_i915_gem_object *obj;
> >>> + int count = 0;
> >>> + int ret;
> >>> +
> >>> + ret = mutex_lock_interruptible(&dev->struct_mutex);
> >>> + if (ret)
> >>> + return ret;
> >>> +
> >>> + seq_puts(m, "cache:\n");
> >>> + list_for_each_entry(obj,
> >>> + &dev_priv->mm.batch_pool.cache_list,
> >>> + batch_pool_list) {
> >>> + seq_puts(m, " ");
> >>> + describe_obj(m, obj);
> >>> + seq_putc(m, '\n');
> >>> + count++;
> >>> + }
> >>> +
> >>> + seq_printf(m, "total: %d\n", count);
> >>> +
> >>> + mutex_unlock(&dev->struct_mutex);
> >>> +
> >>> + return 0;
> >>> +}
> >>> +
> >>> static int i915_gem_request_info(struct seq_file *m, void *data) {
> >>> struct drm_info_node *node = m->private; @@ -4324,6 +4376,7 @@
> >>> static const struct drm_info_list i915_debugfs_list[] = {
> >>> {"i915_gem_hws_blt", i915_hws_info, 0, (void *)BCS},
> >>> {"i915_gem_hws_bsd", i915_hws_info, 0, (void *)VCS},
> >>> {"i915_gem_hws_vebox", i915_hws_info, 0, (void *)VECS},
> >>> + {"i915_gem_batch_pool", i915_gem_batch_pool_info, 0},
> >>> {"i915_frequency_info", i915_frequency_info, 0},
> >>> {"i915_drpc_info", i915_drpc_info, 0},
> >>> {"i915_emon_status", i915_emon_status, 0}, diff --git
> >>> a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> >>> index 11e85cb..f3e27e9 100644
> >>> --- a/drivers/gpu/drm/i915/i915_drv.h
> >>> +++ b/drivers/gpu/drm/i915/i915_drv.h
> >>> @@ -1129,6 +1129,11 @@ struct intel_l3_parity {
> >>> int which_slice;
> >>> };
> >>>
> >>> +struct i915_gem_batch_pool {
> >>> + struct drm_device *dev;
> >>> + struct list_head cache_list;
> >>> +};
> >>> +
> >>> struct i915_gem_mm {
> >>> /** Memory allocator for GTT stolen memory */
> >>> struct drm_mm stolen;
> >>> @@ -1142,6 +1147,13 @@ struct i915_gem_mm {
> >>> */
> >>> struct list_head unbound_list;
> >>> +
> >>> + /*
> >>> + * A pool of objects to use as shadow copies of client batch buffers
> >>> + * when the command parser is enabled. Prevents the client from
> >>> + * modifying the batch contents after software parsing.
> >>> + */
> >>> + struct i915_gem_batch_pool batch_pool;
> >>> +
> >>> /** Usable portion of the GTT for GEM */
> >>> unsigned long stolen_base; /* limited to low memory (32-bit) */
> >>>
> >>> @@ -1872,6 +1884,8 @@ struct drm_i915_gem_object {
> >>> /** Used in execbuf to temporarily hold a ref */
> >>> struct list_head obj_exec_link;
> >>>
> >>> + struct list_head batch_pool_list;
> >>> +
> >>> /**
> >>> * This is set if the object is on the active lists (has pending
> >>> * rendering and so a non-zero seqno), and is not set if it is
> >>> on @@ -2876,6 +2890,13 @@ void i915_destroy_error_state(struct
> >> drm_device
> >>> *dev); void i915_get_extra_instdone(struct drm_device *dev,
> >>> uint32_t *instdone); const char *i915_cache_level_str(struct
> >>> drm_i915_private *i915, int type);
> >>>
> >>> +/* i915_gem_batch_pool.c */
> >>> +void i915_gem_batch_pool_init(struct drm_device *dev,
> >>> + struct i915_gem_batch_pool *pool); void
> >>> +i915_gem_batch_pool_fini(struct i915_gem_batch_pool *pool); struct
> >>> +drm_i915_gem_object* i915_gem_batch_pool_get(struct
> >>> +i915_gem_batch_pool *pool, size_t size);
> >>> +
> >>> /* i915_cmd_parser.c */
> >>> int i915_cmd_parser_get_version(void);
> >>> int i915_cmd_parser_init_ring(struct intel_engine_cs *ring); diff
> >>> --git a/drivers/gpu/drm/i915/i915_gem.c
> >>> b/drivers/gpu/drm/i915/i915_gem.c index de241eb..2f14ae1 100644
> >>> --- a/drivers/gpu/drm/i915/i915_gem.c
> >>> +++ b/drivers/gpu/drm/i915/i915_gem.c
> >>> @@ -4384,6 +4384,7 @@ void i915_gem_object_init(struct
> >> drm_i915_gem_object *obj,
> >>> INIT_LIST_HEAD(&obj->ring_list);
> >>> INIT_LIST_HEAD(&obj->obj_exec_link);
> >>> INIT_LIST_HEAD(&obj->vma_list);
> >>> + INIT_LIST_HEAD(&obj->batch_pool_list);
> >>>
> >>> obj->ops = ops;
> >>>
> >>> diff --git a/drivers/gpu/drm/i915/i915_gem_batch_pool.c
> >>> b/drivers/gpu/drm/i915/i915_gem_batch_pool.c
> >>> new file mode 100644
> >>> index 0000000..e9349e3
> >>> --- /dev/null
> >>> +++ b/drivers/gpu/drm/i915/i915_gem_batch_pool.c
> >>> @@ -0,0 +1,132 @@
> >>> +/*
> >>> + * Copyright © 2014 Intel Corporation
> >>> + *
> >>> + * Permission is hereby granted, free of charge, to any person
> >>> +obtaining a
> >>> + * copy of this software and associated documentation files (the
> >>> +"Software"),
> >>> + * to deal in the Software without restriction, including without
> >>> +limitation
> >>> + * the rights to use, copy, modify, merge, publish, distribute,
> >>> +sublicense,
> >>> + * and/or sell copies of the Software, and to permit persons to
> >>> +whom the
> >>> + * Software is furnished to do so, subject to the following conditions:
> >>> + *
> >>> + * The above copyright notice and this permission notice (including
> >>> +the next
> >>> + * paragraph) shall be included in all copies or substantial
> >>> +portions of the
> >>> + * Software.
> >>> + *
> >>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY
> >> KIND,
> >>> +EXPRESS OR
> >>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
> >>> +MERCHANTABILITY,
> >>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN
> >> NO EVENT
> >>> +SHALL
> >>> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
> >> DAMAGES
> >>> +OR OTHER
> >>> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
> >> OTHERWISE,
> >>> +ARISING
> >>> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE
> USE
> >> OR
> >>> +OTHER DEALINGS
> >>> + * IN THE SOFTWARE.
> >>> + *
> >>> + */
> >>> +
> >>> +#include "i915_drv.h"
> >>> +
> >>> +/**
> >>> + * DOC: batch pool
> >>> + *
> >>> + * In order to submit batch buffers as 'secure', the software
> >>> +command parser
> >>> + * must ensure that a batch buffer cannot be modified after parsing.
> >>> +It does
> >>> + * this by copying the user provided batch buffer contents to a
> >>> +kernel owned
> >>> + * buffer from which the hardware will actually execute, and by
> >>> +carefully
> >>> + * managing the address space bindings for such buffers.
> >>> + *
> >>> + * The batch pool framework provides a mechanism for the driver to
> >>> +manage a
> >>> + * set of scratch buffers to use for this purpose. The framework
> >>> +can be
> >>> + * extended to support other uses cases should they arise.
> >>> + */
> >>> +
> >>> +/**
> >>> + * i915_gem_batch_pool_init() - initialize a batch buffer pool
> >>> + * @dev: the drm device
> >>> + * @pool: the batch buffer pool
> >>> + */
> >>> +void i915_gem_batch_pool_init(struct drm_device *dev,
> >>> + struct i915_gem_batch_pool *pool) {
> >>> + pool->dev = dev;
> >>> + INIT_LIST_HEAD(&pool->cache_list);
> >>> +}
> >>> +
> >>> +/**
> >>> + * i915_gem_batch_pool_fini() - clean up a batch buffer pool
> >>> + * @pool: the pool to clean up
> >>> + *
> >>> + * Note: Callers must hold the struct_mutex.
> >>> + */
> >>> +void i915_gem_batch_pool_fini(struct i915_gem_batch_pool *pool) {
> >>> + WARN_ON(!mutex_is_locked(&pool->dev->struct_mutex));
> >>> +
> >>> + while (!list_empty(&pool->cache_list)) {
> >>> + struct drm_i915_gem_object *obj =
> >>> + list_first_entry(&pool->cache_list,
> >>> + struct drm_i915_gem_object,
> >>> + batch_pool_list);
> >>> +
> >>> + WARN_ON(obj->active);
> >>> +
> >>> + list_del_init(&obj->batch_pool_list);
> >>> + drm_gem_object_unreference(&obj->base);
> >>> + }
> >>> +}
> >>> +
> >>> +/**
> >>> + * i915_gem_batch_pool_get() - select a buffer from the pool
> >>> + * @pool: the batch buffer pool
> >>> + * @size: the minimum desired size of the returned buffer
> >>> + *
> >>> + * Finds or allocates a batch buffer in the pool with at least the
> >>> +requested
> >>> + * size. The caller is responsible for any domain, active/inactive,
> >>> +or
> >>> + * purgeability management for the returned buffer.
> >>> + *
> >>> + * Note: Callers must hold the struct_mutex
> >>> + *
> >>> + * Return: the selected batch buffer object */ struct
> >>> +drm_i915_gem_object * i915_gem_batch_pool_get(struct
> >>> +i915_gem_batch_pool *pool,
> >>> + size_t size)
> >>> +{
> >>> + struct drm_i915_gem_object *obj = NULL;
> >>> + struct drm_i915_gem_object *tmp, *next;
> >>> +
> >>> + WARN_ON(!mutex_is_locked(&pool->dev->struct_mutex));
> >>> +
> >>> + list_for_each_entry_safe(tmp, next,
> >>> + &pool->cache_list, batch_pool_list) {
> >>> +
> >>> + if (tmp->active)
> >>> + continue;
> >>> +
> >>> + /* While we're looping, do some clean up */
> >>> + if (tmp->madv == __I915_MADV_PURGED) {
> >>> + list_del(&tmp->batch_pool_list);
> >>> + drm_gem_object_unreference(&tmp->base);
> >>> + continue;
> >>> + }
> >>> +
> >>> + /*
> >>> + * Select a buffer that is at least as big as needed
> >>> + * but not 'too much' bigger. A better way to do this
> >>> + * might be to bucket the pool objects based on size.
> >>> + */
> >>> + if (tmp->base.size >= size &&
> >>> + tmp->base.size <= (2 * size)) {
> >>> + obj = tmp;
> >>> + break;
> >>> + }
> >>> + }
> >>> +
> >>> + if (!obj) {
> >>> + obj = i915_gem_alloc_object(pool->dev, size);
> >>> + if (!obj)
> >>> + return ERR_PTR(-ENOMEM);
> >>> +
> >>> + list_add_tail(&obj->batch_pool_list, &pool->cache_list);
> >>> + }
> >>
> >> Shouldn't we have a else list_move_tail here to keep the list in lru order?
> >> -Daniel
> >>
> >>> +
> >>> + return obj;
> >>> +}
> >>> --
> >>> 1.9.1
> >>>
> >>> _______________________________________________
> >>> Intel-gfx mailing list
> >>> Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
> >>> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
> >>
> >> --
> >> Daniel Vetter
> >> Software Engineer, Intel Corporation
> >> +41 (0) 79 365 57 48 - http://blog.ffwll.ch