On Wed, Sep 30, 2015 at 9:30 AM, Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> wrote:
On Tue, Sep 29, 2015 at 03:39:03PM +0100, Robert Bragg wrote:
> Updating Mesa and GPU Top to experiment with this was straightforward
> given the similarity to the perf interface. The main difference is that
> it only supports forwarding metrics via read()s instead of an mmaped
> circular buffer. As mentioned above, I think that suits this well, and
> requires no additional copying of data. I think the userspace code has
> ended up being a little simpler too.
Did you try updating the existing perf based overlay?
I don't recall the overlay attempting to read OA counters, but potentially it could be quite nice to add support - sorry I hadn't considered that so far.
I don't believe being perf based or not will affect the effort to do this though. The perf based driver doesn't handle OA counter normalization in the kernel so userspace needs to be able to handle that - which is probably the bigger effort.
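To make "normalization" concrete, here is a minimal sketch of the kind of work userspace would take on, assuming raw 32-bit free-running counters and a clock-tick counter in each report. The names (`counter_delta`, `normalize`, the `clocks` key) and the report layout are hypothetical and only for illustration; they are not the actual OA report format or any existing Mesa or GPU Top code.

```python
# Hypothetical sketch: the report layout and names are illustrative only,
# not the real OA report format.
COUNTER_MASK = 0xFFFFFFFF  # assumption: 32-bit free-running counters


def counter_delta(prev, curr, mask=COUNTER_MASK):
    """Difference between two raw readings, tolerating a single wraparound."""
    return (curr - prev) & mask


def normalize(prev_report, curr_report):
    """Turn two raw snapshots into per-clock ratios.

    Each report is a dict holding a 'clocks' entry (assumed to count GPU
    clock ticks) plus any number of raw counter values.
    """
    clocks = counter_delta(prev_report["clocks"], curr_report["clocks"])
    if clocks == 0:
        return {}  # no time elapsed, so no meaningful ratio
    return {
        name: counter_delta(prev_report[name], curr_report[name]) / clocks
        for name in curr_report
        if name != "clocks"
    }
```

The point of the sketch is only that each raw value is meaningless on its own: it has to be differenced against an earlier report and scaled by a shared reference before it can be displayed.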
Something to note here about your early pmu driver is that it was for counters that were explicitly sampled from the cpu via mmio, using an hrtimer. I think those were a better fit for the existing perf design than the OA unit, primarily because they were explicitly read from the cpu and each counter was independent of the others.
> Overall the driver currently isn't much more code than with perf (~200
> lines).
>
> Personally my gut feeling a.t.m, is that we should aim to move forward
> independent from perf.
>
> I'd really appreciate some feedback from others on this though.
>
> Daniel and Chris; although I think it made sense at the outset to try
> and use perf, in light of the above would you be open to a non-perf
> based driver for the OA unit?
No. I strongly dislike that there will be multiple incompatible perf
interfaces and strongly like the coupling with other profiling that
comes with perf - i.e. we very much want to simultaneously sample CPU
and GPU workloads along with other devices, that information is much
more useful to me for the purposes of scheduling work and maximising
concurrency than optimising shaders.
In this case I don't think using perf inherently brings any more compatibility - no existing userspace will Just Work™ with the perf based OA driver.
I think some of the cases you're referring to may be ok to expose via the existing perf infrastructure, but I'm currently enabling the OA unit which poses some unique difficulties I've tried to explain.
A guiding differentiator for whether the existing perf pmu infrastructure is a good fit may be whether the counter is orthogonal (in terms of configuration and normalization) and explicitly readable from the cpu.
'i915 perf' shows my lack of imagination in naming this, and maybe another name could imply a more limited scope. I.e. on a case by case basis, when looking to expose new counters, we can still evaluate whether it makes sense to expose them via the existing perf infrastructure or via this.
- Robert
-Chris
--
Chris Wilson, Intel Open Source Technology Centre