Re: [RFC 4/5] drm/i915: Expose per-engine client busyness

On 15/02/2018 09:44, Chris Wilson wrote:
Quoting Tvrtko Ursulin (2018-02-15 09:41:53)

On 14/02/2018 19:17, Chris Wilson wrote:
Quoting Tvrtko Ursulin (2018-02-14 18:50:34)
From: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>

Expose per-client and per-engine busyness under the previously added sysfs
client root.

There is one new file per engine instance, located under the 'busy'
directory.

Each contains a monotonically increasing, nanosecond-resolution total of
the time the client's jobs have been executing on the GPU.

$ cat /sys/class/drm/card0/clients/5/busy/rcs0
32516602

This data can serve as an interface for implementing a top-like utility
for GPU jobs. For instance, I have prototyped a tool in IGT which produces
periodic output like:

neverball[  6011]:  rcs0:  41.01%  bcs0:   0.00%  vcs0:   0.00%  vecs0:   0.00%
       Xorg[  5664]:  rcs0:  31.16%  bcs0:   0.00%  vcs0:   0.00%  vecs0:   0.00%
      xfwm4[  5727]:  rcs0:   0.00%  bcs0:   0.00%  vcs0:   0.00%  vecs0:   0.00%

This tool can also be extended to use the i915 PMU to show overall engine
busyness, and engine loads using the queue-depth metric.
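
For illustration, the percentages above can be derived by sampling the
monotonic counter twice and dividing the delta by the elapsed wall time.
A minimal userspace sketch follows (not the IGT tool itself; the client
id and engine name are just the ones from the example above, and a real
tool would measure the interval with clock_gettime() rather than assume
sleep(1) is exact):

#include <inttypes.h>
#include <stdio.h>
#include <unistd.h>

/* Read one monotonically increasing busy counter, in nanoseconds. */
static uint64_t read_busy(const char *path)
{
	uint64_t ns = 0;
	FILE *f = fopen(path, "r");

	if (f) {
		if (fscanf(f, "%" SCNu64, &ns) != 1)
			ns = 0;
		fclose(f);
	}

	return ns;
}

int main(void)
{
	const char *path = "/sys/class/drm/card0/clients/5/busy/rcs0";
	uint64_t t0 = read_busy(path);

	sleep(1);

	/* Busy delta over ~1s of wall time, as a percentage. */
	printf("rcs0: %6.2f%%\n", (read_busy(path) - t0) / 1e9 * 100.0);

	return 0;
}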

v2: Use intel_context_engine_get_busy_time.
v3: New directory structure.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
---
   drivers/gpu/drm/i915/i915_drv.h |  8 ++++
   drivers/gpu/drm/i915/i915_gem.c | 86 +++++++++++++++++++++++++++++++++++++++--
   2 files changed, 91 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 372d13cb2472..d6b2883b42fe 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -315,6 +315,12 @@ struct drm_i915_private;
   struct i915_mm_struct;
   struct i915_mmu_object;
+struct i915_engine_busy_attribute {
+       struct device_attribute attr;
+       struct drm_i915_file_private *file_priv;
+       struct intel_engine_cs *engine;
+};
+
   struct drm_i915_file_private {
          struct drm_i915_private *dev_priv;
          struct drm_file *file;
@@ -350,10 +356,12 @@ struct drm_i915_file_private {
          unsigned int client_pid;
          char *client_name;
          struct kobject *client_root;
+       struct kobject *busy_root;
struct {
                  struct device_attribute pid;
                  struct device_attribute name;
+               struct i915_engine_busy_attribute busy[I915_NUM_ENGINES];
          } attr;
   };
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 46ac7b3ca348..01298d924524 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -5631,6 +5631,45 @@ show_client_pid(struct device *kdev, struct device_attribute *attr, char *buf)
          return snprintf(buf, PAGE_SIZE, "%u", file_priv->client_pid);
   }
+struct busy_ctx {
+       struct intel_engine_cs *engine;
+       u64 total;
+};
+
+static int busy_add(int _id, void *p, void *data)
+{
+       struct i915_gem_context *ctx = p;
+       struct busy_ctx *bc = data;
+
+       bc->total +=
+               ktime_to_ns(intel_context_engine_get_busy_time(ctx,
+                                                              bc->engine));
+
+       return 0;
+}
+
+static ssize_t
+show_client_busy(struct device *kdev, struct device_attribute *attr, char *buf)
+{
+       struct i915_engine_busy_attribute *i915_attr =
+               container_of(attr, typeof(*i915_attr), attr);
+       struct drm_i915_file_private *file_priv = i915_attr->file_priv;
+       struct intel_engine_cs *engine = i915_attr->engine;
+       struct drm_i915_private *i915 = engine->i915;
+       struct busy_ctx bc = { .engine = engine };
+       int ret;
+
+       ret = i915_mutex_lock_interruptible(&i915->drm);
+       if (ret)
+               return ret;
+

Doesn't need struct_mutex, just rcu_read_lock() will suffice.

Neither the context nor idr will be freed too soon, and the data is
involatile when the context is unreffed (and contexts don't have the
nasty zombie/undead status of requests). So the busy-time will be
stable.

Are you sure? What holds a reference to the contexts while userspace might
be in sysfs reading the stat? It would be super nice if we could avoid
struct_mutex here... I just don't understand at the moment why it would
be safe.

RCU keeps the pointer alive, and in this case the context is not reused
before being freed (unlike requests). So given that it's unreffed, it is
dead and the stats will be stable, just not yet freed.

Somehow I missed the suggestion to replace it with rcu_read_lock(). Cool,
this is then even more light-weight than I imagined.
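
For reference, a sketch of what the rework might look like under that
suggestion. It assumes the client's contexts are tracked in
file_priv->context_idr, which the posted hunks do not show, so take the
iteration as illustrative only:

static ssize_t
show_client_busy(struct device *kdev, struct device_attribute *attr,
		 char *buf)
{
	struct i915_engine_busy_attribute *i915_attr =
		container_of(attr, typeof(*i915_attr), attr);
	struct drm_i915_file_private *file_priv = i915_attr->file_priv;
	struct busy_ctx bc = { .engine = i915_attr->engine };

	/* Per the review, RCU keeps the idr and the contexts alive while
	 * busy_add() sums the per-engine busy times; no struct_mutex.
	 */
	rcu_read_lock();
	idr_for_each(&file_priv->context_idr, busy_add, &bc);
	rcu_read_unlock();

	return snprintf(buf, PAGE_SIZE, "%llu", bc.total);
}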

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx



