> -----Original Message-----
> From: Nguyen, Michael H
> Sent: Wednesday, December 10, 2014 4:34 PM
> To: Bloomfield, Jon
> Cc: Daniel Vetter; intel-gfx@xxxxxxxxxxxxxxxxxxxxx
> Subject: Re: [PATCH v6 1/5] drm/i915: Implement a framework for
> batch buffer pools
>
> Hi Jon,
>
> On 12/10/2014 03:06 AM, Bloomfield, Jon wrote:
> >> -----Original Message-----
> >> From: Intel-gfx [mailto:intel-gfx-bounces@xxxxxxxxxxxxxxxxxxxxx] On
> >> Behalf Of Daniel Vetter
> >> Sent: Tuesday, December 09, 2014 1:18 PM
> >> To: Nguyen, Michael H
> >> Cc: Brad Volkin; intel-gfx@xxxxxxxxxxxxxxxxxxxxx
> >> Subject: Re: [PATCH v6 1/5] drm/i915: Implement a
> >> framework for batch buffer pools
> >>
> >> On Mon, Dec 08, 2014 at 02:33:46PM -0800, michael.h.nguyen@xxxxxxxxx
> >> wrote:
> >>> From: Brad Volkin <bradley.d.volkin@xxxxxxxxx>
> >>>
> >>> This adds a small module for managing a pool of batch buffers.
> >>> The only current use case is for the command parser, as described in
> >>> the kerneldoc in the patch. The code is simple, but separating it
> >>> out makes it easier to change the underlying algorithms and to
> >>> extend to future use cases should they arise.
> >>>
> >>> The interface is simple: init to create an empty pool, fini to clean
> >>> it up, get to obtain a new buffer. Note that all buffers are
> >>> expected to be inactive before cleaning up the pool.
> >>>
> >>> Locking is currently based on the caller holding the struct_mutex.
> >>> We already do that in the places where we will use the batch pool
> >>> for the command parser.
> >>>
> >>> v2:
> >>> - s/BUG_ON/WARN_ON/ for locking assertions
> >>> - Remove the cap on pool size
> >>> - Switch from alloc/free to init/fini
> >>>
> >>> v3:
> >>> - Idiomatic looping structure in _fini
> >>> - Correct handling of purged objects
> >>> - Don't return a buffer that's too much larger than needed
> >>>
> >>> v4:
> >>> - Rebased to latest -nightly
> >>>
> >>> v5:
> >>> - Remove _put() function and clean up comments to match
> >>>
> >>> v6:
> >>> - Move purged check inside the loop (danvet, from v4 1/7 feedback)
> >>>
> >>> v7:
> >>> - Use single list instead of two. (Chris W)
> >>> - s/active_list/cache_list
> >>> - Squashed in debug patches (Chris W)
> >>> drm/i915: Add a batch pool debugfs file
> >>>
> >>> It provides some useful information about the buffers in
> >>> the global command parser batch pool.
> >>>
> >>> v2: rebase on global pool instead of per-ring pools
> >>> v3: rebase
> >>>
> >>> drm/i915: Add batch pool details to i915_gem_objects debugfs
> >>>
> >>> To better account for the potentially large memory consumption
> >>> of the batch pool.
> >>>
> >>> Issue: VIZ-4719
> >>> Signed-off-by: Brad Volkin <bradley.d.volkin@xxxxxxxxx>
> >>> ---
> >>> Documentation/DocBook/drm.tmpl | 5 ++
> >>> drivers/gpu/drm/i915/Makefile | 1 +
> >>> drivers/gpu/drm/i915/i915_debugfs.c | 71 ++++++++++++++--
> >>> drivers/gpu/drm/i915/i915_drv.h | 21 +++++
> >>> drivers/gpu/drm/i915/i915_gem.c | 1 +
> >>> drivers/gpu/drm/i915/i915_gem_batch_pool.c | 132
> >>> +++++++++++++++++++++++++++++
> >>> 6 files changed, 222 insertions(+), 9 deletions(-) create mode
> >>> 100644 drivers/gpu/drm/i915/i915_gem_batch_pool.c
> >>>
> >>> diff --git a/Documentation/DocBook/drm.tmpl
> >>> b/Documentation/DocBook/drm.tmpl index 85287cb..022923a 100644
> >>> --- a/Documentation/DocBook/drm.tmpl
> >>> +++ b/Documentation/DocBook/drm.tmpl
> >>> @@ -4029,6 +4029,11 @@ int num_ioctls;</synopsis>
> >>> !Idrivers/gpu/drm/i915/i915_cmd_parser.c
> >>> </sect2>
> >>> <sect2>
> >>> + <title>Batchbuffer Pools</title>
> >>> +!Pdrivers/gpu/drm/i915/i915_gem_batch_pool.c batch pool
> >>> +!Idrivers/gpu/drm/i915/i915_gem_batch_pool.c
> >>> + </sect2>
> >>> + <sect2>
> >>> <title>Logical Rings, Logical Ring Contexts and
> >>> Execlists</title> !Pdrivers/gpu/drm/i915/intel_lrc.c Logical Rings,
> >>> Logical Ring Contexts and Execlists
> >>> !Idrivers/gpu/drm/i915/intel_lrc.c
> >>> diff --git a/drivers/gpu/drm/i915/Makefile
> >>> b/drivers/gpu/drm/i915/Makefile index e4083e4..c8dbb37d 100644
> >>> --- a/drivers/gpu/drm/i915/Makefile
> >>> +++ b/drivers/gpu/drm/i915/Makefile
> >>> @@ -19,6 +19,7 @@ i915-$(CONFIG_DEBUG_FS) += i915_debugfs.o
> >>>
> >>> # GEM code
> >>> i915-y += i915_cmd_parser.o \
> >>> + i915_gem_batch_pool.o \
> >>> i915_gem_context.o \
> >>> i915_gem_render_state.o \
> >>> i915_gem_debug.o \
> >>> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c
> >>> b/drivers/gpu/drm/i915/i915_debugfs.c
> >>> index d0e445e..3c3bf98 100644
> >>> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> >>> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> >>> @@ -359,6 +359,33 @@ static int per_file_stats(int id, void *ptr,
> >>> void
> >> *data)
> >>> return 0;
> >>> }
> >>>
> >>> +#define print_file_stats(m, name, stats) \
> >>> + seq_printf(m, "%s: %u objects, %zu bytes (%zu active, %zu
> >>> +inactive,
> >> %zu global, %zu shared, %zu unbound)\n", \
> >>> + name, \
> >>> + stats.count, \
> >>> + stats.total, \
> >>> + stats.active, \
> >>> + stats.inactive, \
> >>> + stats.global, \
> >>> + stats.shared, \
> >>> + stats.unbound)
> >
> > "print_file_stats" might be better named "print_obj_stats" or similar.
> > As it stands there is nothing in its name to suggest its purpose is to describe
> an object.
> I guess the fnc name 'print_file_stats' was chosen b/c it takes a 'stats'
> parameter of type 'struct file_stats' and prints the members.
> Seems fair to leave it I think as its generic.

Ok, agreed.

> >
> >
> >>> +
> >>> +static void print_batch_pool_stats(struct seq_file *m,
> >>> + struct drm_i915_private *dev_priv) {
> >>> + struct drm_i915_gem_object *obj;
> >>> + struct file_stats stats;
> >>> +
> >>> + memset(&stats, 0, sizeof(stats));
> >>> +
> >>> + list_for_each_entry(obj,
> >>> + &dev_priv->mm.batch_pool.cache_list,
> >>> + batch_pool_list)
> >>> + per_file_stats(0, obj, &stats);
> >
> > Shouldn't we be holding the batch_pool lock here ?
> Think we're okay here. The callers of print_batch_pool_stats() hold
> dev->struct_mutex.
>
> Thanks,
> -Mike

But then, why do we take the batch_pool lock in i915_gem_batch_pool_info() ?
> >
> >
> >>> +
> >>> + print_file_stats(m, "batch pool", stats); }
> >>> +
> >>> #define count_vmas(list, member) do { \
> >>> list_for_each_entry(vma, list, member) { \
> >>> size += i915_gem_obj_ggtt_size(vma->obj); \ @@ -441,6
> >> +468,9 @@
> >>> static int i915_gem_object_info(struct seq_file *m, void* data)
> >>> dev_priv->gtt.mappable_end - dev_priv->gtt.base.start);
> >>>
> >>> seq_putc(m, '\n');
> >>> + print_batch_pool_stats(m, dev_priv);
> >>> +
> >>> + seq_putc(m, '\n');
> >>> list_for_each_entry_reverse(file, &dev->filelist, lhead) {
> >>> struct file_stats stats;
> >>> struct task_struct *task;
> >>> @@ -458,15 +488,7 @@ static int i915_gem_object_info(struct seq_file
> >>> *m,
> >> void* data)
> >>> */
> >>> rcu_read_lock();
> >>> task = pid_task(file->pid, PIDTYPE_PID);
> >>> - seq_printf(m, "%s: %u objects, %zu bytes (%zu active, %zu
> >> inactive, %zu global, %zu shared, %zu unbound)\n",
> >>> - task ? task->comm : "<unknown>",
> >>> - stats.count,
> >>> - stats.total,
> >>> - stats.active,
> >>> - stats.inactive,
> >>> - stats.global,
> >>> - stats.shared,
> >>> - stats.unbound);
> >>> + print_file_stats(m, task ? task->comm : "<unknown>", stats);
> >>> rcu_read_unlock();
> >>> }
> >>>
> >>> @@ -583,6 +605,36 @@ static int i915_gem_pageflip_info(struct
> >>> seq_file
> >> *m, void *data)
> >>> return 0;
> >>> }
> >>>
> >>> +static int i915_gem_batch_pool_info(struct seq_file *m, void *data) {
> >>> + struct drm_info_node *node = m->private;
> >>> + struct drm_device *dev = node->minor->dev;
> >>> + struct drm_i915_private *dev_priv = dev->dev_private;
> >>> + struct drm_i915_gem_object *obj;
> >>> + int count = 0;
> >>> + int ret;
> >>> +
> >>> + ret = mutex_lock_interruptible(&dev->struct_mutex);
> >>> + if (ret)
> >>> + return ret;
> >>> +
> >>> + seq_puts(m, "cache:\n");
> >>> + list_for_each_entry(obj,
> >>> + &dev_priv->mm.batch_pool.cache_list,
> >>> + batch_pool_list) {
> >>> + seq_puts(m, " ");
> >>> + describe_obj(m, obj);
> >>> + seq_putc(m, '\n');
> >>> + count++;
> >>> + }
> >>> +
> >>> + seq_printf(m, "total: %d\n", count);
> >>> +
> >>> + mutex_unlock(&dev->struct_mutex);
> >>> +
> >>> + return 0;
> >>> +}
> >>> +
> >>> static int i915_gem_request_info(struct seq_file *m, void *data) {
> >>> struct drm_info_node *node = m->private; @@ -4324,6 +4376,7 @@
> >>> static const struct drm_info_list i915_debugfs_list[] = {
> >>> {"i915_gem_hws_blt", i915_hws_info, 0, (void *)BCS},
> >>> {"i915_gem_hws_bsd", i915_hws_info, 0, (void *)VCS},
> >>> {"i915_gem_hws_vebox", i915_hws_info, 0, (void *)VECS},
> >>> + {"i915_gem_batch_pool", i915_gem_batch_pool_info, 0},
> >>> {"i915_frequency_info", i915_frequency_info, 0},
> >>> {"i915_drpc_info", i915_drpc_info, 0},
> >>> {"i915_emon_status", i915_emon_status, 0}, diff --git
> >>> a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> >>> index 11e85cb..f3e27e9 100644
> >>> --- a/drivers/gpu/drm/i915/i915_drv.h
> >>> +++ b/drivers/gpu/drm/i915/i915_drv.h
> >>> @@ -1129,6 +1129,11 @@ struct intel_l3_parity {
> >>> int which_slice;
> >>> };
> >>>
> >>> +struct i915_gem_batch_pool {
> >>> + struct drm_device *dev;
> >>> + struct list_head cache_list;
> >>> +};
> >>> +
> >>> struct i915_gem_mm {
> >>> /** Memory allocator for GTT stolen memory */
> >>> struct drm_mm stolen;
> >>> @@ -1142,6 +1147,13 @@ struct i915_gem_mm {
> >>> */
> >>> struct list_head unbound_list;
> >>> +
> >>> + /*
> >>> + * A pool of objects to use as shadow copies of client batch buffers
> >>> + * when the command parser is enabled. Prevents the client from
> >>> + * modifying the batch contents after software parsing.
> >>> + */
> >>> + struct i915_gem_batch_pool batch_pool;
> >>> +
> >>> /** Usable portion of the GTT for GEM */
> >>> unsigned long stolen_base; /* limited to low memory (32-bit) */
> >>>
> >>> @@ -1872,6 +1884,8 @@ struct drm_i915_gem_object {
> >>> /** Used in execbuf to temporarily hold a ref */
> >>> struct list_head obj_exec_link;
> >>>
> >>> + struct list_head batch_pool_list;
> >>> +
> >>> /**
> >>> * This is set if the object is on the active lists (has pending
> >>> * rendering and so a non-zero seqno), and is not set if it is
> >>> on @@ -2876,6 +2890,13 @@ void i915_destroy_error_state(struct
> >> drm_device
> >>> *dev); void i915_get_extra_instdone(struct drm_device *dev,
> >>> uint32_t *instdone); const char *i915_cache_level_str(struct
> >>> drm_i915_private *i915, int type);
> >>>
> >>> +/* i915_gem_batch_pool.c */
> >>> +void i915_gem_batch_pool_init(struct drm_device *dev,
> >>> + struct i915_gem_batch_pool *pool); void
> >>> +i915_gem_batch_pool_fini(struct i915_gem_batch_pool *pool); struct
> >>> +drm_i915_gem_object* i915_gem_batch_pool_get(struct
> >>> +i915_gem_batch_pool *pool, size_t size);
> >>> +
> >>> /* i915_cmd_parser.c */
> >>> int i915_cmd_parser_get_version(void);
> >>> int i915_cmd_parser_init_ring(struct intel_engine_cs *ring); diff
> >>> --git a/drivers/gpu/drm/i915/i915_gem.c
> >>> b/drivers/gpu/drm/i915/i915_gem.c index de241eb..2f14ae1 100644
> >>> --- a/drivers/gpu/drm/i915/i915_gem.c
> >>> +++ b/drivers/gpu/drm/i915/i915_gem.c
> >>> @@ -4384,6 +4384,7 @@ void i915_gem_object_init(struct
> >> drm_i915_gem_object *obj,
> >>> INIT_LIST_HEAD(&obj->ring_list);
> >>> INIT_LIST_HEAD(&obj->obj_exec_link);
> >>> INIT_LIST_HEAD(&obj->vma_list);
> >>> + INIT_LIST_HEAD(&obj->batch_pool_list);
> >>>
> >>> obj->ops = ops;
> >>>
> >>> diff --git a/drivers/gpu/drm/i915/i915_gem_batch_pool.c
> >>> b/drivers/gpu/drm/i915/i915_gem_batch_pool.c
> >>> new file mode 100644
> >>> index 0000000..e9349e3
> >>> --- /dev/null
> >>> +++ b/drivers/gpu/drm/i915/i915_gem_batch_pool.c
> >>> @@ -0,0 +1,132 @@
> >>> +/*
> >>> + * Copyright © 2014 Intel Corporation
> >>> + *
> >>> + * Permission is hereby granted, free of charge, to any person
> >>> +obtaining a
> >>> + * copy of this software and associated documentation files (the
> >>> +"Software"),
> >>> + * to deal in the Software without restriction, including without
> >>> +limitation
> >>> + * the rights to use, copy, modify, merge, publish, distribute,
> >>> +sublicense,
> >>> + * and/or sell copies of the Software, and to permit persons to
> >>> +whom the
> >>> + * Software is furnished to do so, subject to the following conditions:
> >>> + *
> >>> + * The above copyright notice and this permission notice (including
> >>> +the next
> >>> + * paragraph) shall be included in all copies or substantial
> >>> +portions of the
> >>> + * Software.
> >>> + *
> >>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY
> >> KIND,
> >>> +EXPRESS OR
> >>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
> >>> +MERCHANTABILITY,
> >>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN
> >> NO EVENT
> >>> +SHALL
> >>> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
> >> DAMAGES
> >>> +OR OTHER
> >>> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
> >> OTHERWISE,
> >>> +ARISING
> >>> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE
> USE
> >> OR
> >>> +OTHER DEALINGS
> >>> + * IN THE SOFTWARE.
> >>> + *
> >>> + */
> >>> +
> >>> +#include "i915_drv.h"
> >>> +
> >>> +/**
> >>> + * DOC: batch pool
> >>> + *
> >>> + * In order to submit batch buffers as 'secure', the software
> >>> +command parser
> >>> + * must ensure that a batch buffer cannot be modified after parsing.
> >>> +It does
> >>> + * this by copying the user provided batch buffer contents to a
> >>> +kernel owned
> >>> + * buffer from which the hardware will actually execute, and by
> >>> +carefully
> >>> + * managing the address space bindings for such buffers.
> >>> + *
> >>> + * The batch pool framework provides a mechanism for the driver to
> >>> +manage a
> >>> + * set of scratch buffers to use for this purpose. The framework
> >>> +can be
> >>> + * extended to support other uses cases should they arise.
> >>> + */
> >>> +
> >>> +/**
> >>> + * i915_gem_batch_pool_init() - initialize a batch buffer pool
> >>> + * @dev: the drm device
> >>> + * @pool: the batch buffer pool
> >>> + */
> >>> +void i915_gem_batch_pool_init(struct drm_device *dev,
> >>> + struct i915_gem_batch_pool *pool) {
> >>> + pool->dev = dev;
> >>> + INIT_LIST_HEAD(&pool->cache_list);
> >>> +}
> >>> +
> >>> +/**
> >>> + * i915_gem_batch_pool_fini() - clean up a batch buffer pool
> >>> + * @pool: the pool to clean up
> >>> + *
> >>> + * Note: Callers must hold the struct_mutex.
> >>> + */
> >>> +void i915_gem_batch_pool_fini(struct i915_gem_batch_pool *pool) {
> >>> + WARN_ON(!mutex_is_locked(&pool->dev->struct_mutex));
> >>> +
> >>> + while (!list_empty(&pool->cache_list)) {
> >>> + struct drm_i915_gem_object *obj =
> >>> + list_first_entry(&pool->cache_list,
> >>> + struct drm_i915_gem_object,
> >>> + batch_pool_list);
> >>> +
> >>> + WARN_ON(obj->active);
> >>> +
> >>> + list_del_init(&obj->batch_pool_list);
> >>> + drm_gem_object_unreference(&obj->base);
> >>> + }
> >>> +}
> >>> +
> >>> +/**
> >>> + * i915_gem_batch_pool_get() - select a buffer from the pool
> >>> + * @pool: the batch buffer pool
> >>> + * @size: the minimum desired size of the returned buffer
> >>> + *
> >>> + * Finds or allocates a batch buffer in the pool with at least the
> >>> +requested
> >>> + * size. The caller is responsible for any domain, active/inactive,
> >>> +or
> >>> + * purgeability management for the returned buffer.
> >>> + *
> >>> + * Note: Callers must hold the struct_mutex
> >>> + *
> >>> + * Return: the selected batch buffer object */ struct
> >>> +drm_i915_gem_object * i915_gem_batch_pool_get(struct
> >>> +i915_gem_batch_pool *pool,
> >>> + size_t size)
> >>> +{
> >>> + struct drm_i915_gem_object *obj = NULL;
> >>> + struct drm_i915_gem_object *tmp, *next;
> >>> +
> >>> + WARN_ON(!mutex_is_locked(&pool->dev->struct_mutex));
> >>> +
> >>> + list_for_each_entry_safe(tmp, next,
> >>> + &pool->cache_list, batch_pool_list) {
> >>> +
> >>> + if (tmp->active)
> >>> + continue;
> >>> +
> >>> + /* While we're looping, do some clean up */
> >>> + if (tmp->madv == __I915_MADV_PURGED) {
> >>> + list_del(&tmp->batch_pool_list);
> >>> + drm_gem_object_unreference(&tmp->base);
> >>> + continue;
> >>> + }
> >>> +
> >>> + /*
> >>> + * Select a buffer that is at least as big as needed
> >>> + * but not 'too much' bigger. A better way to do this
> >>> + * might be to bucket the pool objects based on size.
> >>> + */
> >>> + if (tmp->base.size >= size &&
> >>> + tmp->base.size <= (2 * size)) {
> >>> + obj = tmp;
> >>> + break;
> >>> + }
> >>> + }
> >>> +
> >>> + if (!obj) {
> >>> + obj = i915_gem_alloc_object(pool->dev, size);
> >>> + if (!obj)
> >>> + return ERR_PTR(-ENOMEM);
> >>> +
> >>> + list_add_tail(&obj->batch_pool_list, &pool->cache_list);
> >>> + }
> >>
> >> Shouldn't we have a else list_move_tail here to keep the list in lru order?
> >> -Daniel
> >>
> >>> +
> >>> + return obj;
> >>> +}
> >>> --
> >>> 1.9.1
> >>>
> >>> _______________________________________________
> >>> Intel-gfx mailing list
> >>> Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
> >>> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
> >>
> >> --
> >> Daniel Vetter
> >> Software Engineer, Intel Corporation
> >> +41 (0) 79 365 57 48 - http://blog.ffwll.ch