Re: [RFC 4/5] drm/i915: Expose per-engine client busyness

Quoting Tvrtko Ursulin (2018-02-15 09:41:53)
> 
> On 14/02/2018 19:17, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2018-02-14 18:50:34)
> >> From: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
> >>
> >> Expose per-client and per-engine busyness under the previously added sysfs
> >> client root.
> >>
> >> The new files are one per-engine instance and located under the 'busy'
> >> directory.
> >>
> >> Each contains a monotonically increasing, nanosecond-resolution total of
> >> the time the client's jobs were executing on the GPU.
> >>
> >> $ cat /sys/class/drm/card0/clients/5/busy/rcs0
> >> 32516602
> >>
> >> This data can serve as an interface for implementing a top-like utility
> >> for GPU jobs. For instance, I have prototyped a tool in IGT which produces
> >> periodic output like:
> >>
> >> neverball[  6011]:  rcs0:  41.01%  bcs0:   0.00%  vcs0:   0.00%  vecs0:   0.00%
> >>       Xorg[  5664]:  rcs0:  31.16%  bcs0:   0.00%  vcs0:   0.00%  vecs0:   0.00%
> >>      xfwm4[  5727]:  rcs0:   0.00%  bcs0:   0.00%  vcs0:   0.00%  vecs0:   0.00%
> >>
> >> This tool can also be extended to use the i915 PMU to show overall engine
> >> busyness, and engine loads using the queue-depth metric.
> >>
> >> v2: Use intel_context_engine_get_busy_time.
> >> v3: New directory structure.
> >>
> >> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
> >> ---
> >>   drivers/gpu/drm/i915/i915_drv.h |  8 ++++
> >>   drivers/gpu/drm/i915/i915_gem.c | 86 +++++++++++++++++++++++++++++++++++++++--
> >>   2 files changed, 91 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> >> index 372d13cb2472..d6b2883b42fe 100644
> >> --- a/drivers/gpu/drm/i915/i915_drv.h
> >> +++ b/drivers/gpu/drm/i915/i915_drv.h
> >> @@ -315,6 +315,12 @@ struct drm_i915_private;
> >>   struct i915_mm_struct;
> >>   struct i915_mmu_object;
> >>   
> >> +struct i915_engine_busy_attribute {
> >> +       struct device_attribute attr;
> >> +       struct drm_i915_file_private *file_priv;
> >> +       struct intel_engine_cs *engine;
> >> +};
> >> +
> >>   struct drm_i915_file_private {
> >>          struct drm_i915_private *dev_priv;
> >>          struct drm_file *file;
> >> @@ -350,10 +356,12 @@ struct drm_i915_file_private {
> >>          unsigned int client_pid;
> >>          char *client_name;
> >>          struct kobject *client_root;
> >> +       struct kobject *busy_root;
> >>   
> >>          struct {
> >>                  struct device_attribute pid;
> >>                  struct device_attribute name;
> >> +               struct i915_engine_busy_attribute busy[I915_NUM_ENGINES];
> >>          } attr;
> >>   };
> >>   
> >> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> >> index 46ac7b3ca348..01298d924524 100644
> >> --- a/drivers/gpu/drm/i915/i915_gem.c
> >> +++ b/drivers/gpu/drm/i915/i915_gem.c
> >> @@ -5631,6 +5631,45 @@ show_client_pid(struct device *kdev, struct device_attribute *attr, char *buf)
> >>          return snprintf(buf, PAGE_SIZE, "%u", file_priv->client_pid);
> >>   }
> >>   
> >> +struct busy_ctx {
> >> +       struct intel_engine_cs *engine;
> >> +       u64 total;
> >> +};
> >> +
> >> +static int busy_add(int _id, void *p, void *data)
> >> +{
> >> +       struct i915_gem_context *ctx = p;
> >> +       struct busy_ctx *bc = data;
> >> +
> >> +       bc->total +=
> >> +               ktime_to_ns(intel_context_engine_get_busy_time(ctx,
> >> +                                                              bc->engine));
> >> +
> >> +       return 0;
> >> +}
> >> +
> >> +static ssize_t
> >> +show_client_busy(struct device *kdev, struct device_attribute *attr, char *buf)
> >> +{
> >> +       struct i915_engine_busy_attribute *i915_attr =
> >> +               container_of(attr, typeof(*i915_attr), attr);
> >> +       struct drm_i915_file_private *file_priv = i915_attr->file_priv;
> >> +       struct intel_engine_cs *engine = i915_attr->engine;
> >> +       struct drm_i915_private *i915 = engine->i915;
> >> +       struct busy_ctx bc = { .engine = engine };
> >> +       int ret;
> >> +
> >> +       ret = i915_mutex_lock_interruptible(&i915->drm);
> >> +       if (ret)
> >> +               return ret;
> >> +
> > 
> > Doesn't need struct_mutex; rcu_read_lock() will suffice.
> > 
> > Neither the context nor the idr will be freed too soon, and the data is
> > immutable once the context is unreffed (and contexts don't have the
> > nasty zombie/undead status of requests). So the busy-time will be
> > stable.
> 
> Are you sure? What holds a reference to the contexts while userspace
> might be in sysfs reading the stat? It would be super nice if we could
> avoid struct_mutex here... I just don't understand at the moment why it
> would be safe.

RCU keeps the pointer alive, and in this case the context is not reused
before being freed (unlike requests). So given that it's unreffed, it is
dead and the stats are stable, just not yet freed.
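
Something like this untested sketch is what I have in mind (it assumes the
contexts in file_priv->context_idr are only freed after an RCU grace period,
and that intel_context_engine_get_busy_time() is safe without struct_mutex):

static ssize_t
show_client_busy(struct device *kdev, struct device_attribute *attr, char *buf)
{
	struct i915_engine_busy_attribute *i915_attr =
		container_of(attr, typeof(*i915_attr), attr);
	struct drm_i915_file_private *file_priv = i915_attr->file_priv;
	struct busy_ctx bc = { .engine = i915_attr->engine };

	/*
	 * RCU keeps the contexts in the idr alive while we walk it, and
	 * their busy-time is stable once unreffed, so no struct_mutex.
	 */
	rcu_read_lock();
	idr_for_each(&file_priv->context_idr, busy_add, &bc);
	rcu_read_unlock();

	return snprintf(buf, PAGE_SIZE, "%llu\n", bc.total);
}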
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx



