Re: [PATCH 2/2] drm/i915/tracepoints: Remove DRM_I915_LOW_LEVEL_TRACEPOINTS Kconfig option

Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxxxxxxxx> · Wed, 8 Aug 2018 13:56:01 +0100

+Joonas

On 08/08/2018 13:42, Chris Wilson wrote:
Quoting Tvrtko Ursulin (2018-08-08 13:13:08)

On 26/06/2018 12:48, Chris Wilson wrote:
It's just that this is about the third time this has been raised in the
last couple of weeks with the other two requests being from a generic
tooling pov (Eric Anholt for gnome-shell tweaking, and someone
else looking for a gpuvis-like tool). So it seems like there is
interest, even if I doubt that it'll help answer any questions beyond
what you can just extract from looking at userspace. (Imo, the only
people these tracepoints are useful for are people writing patches for
the driver. For everyone else, you can just observe system behaviour and
optimise your code for your workload. Otoh, can one trust a black
box, argh.)

Some of the things might be obtainable purely from userspace via heavily
instrumented builds, which may be in the realm of the possible during
development, but I don't think it is feasible in general, both because it
is too involved and because it would preclude the existence of tools which
can trace any random client.

To have a second set of nearly equivalent tracepoints, we need to have
strong justification why we couldn't just use or extend the generic set.

I was hoping that the conversation so far established that nearly
equivalent is not close enough for the intended use cases. And that it is
not possible to make the generic ones so.

(I just don't see the point of those use cases. I trace the kernel to
fix the kernel...)

Yes, and with the virtual engine we will have a bigger reason to trace the
kernel with a random client.

Plus I feel a lot more comfortable exporting a set of generic
tracepoints, than those where we may be leaking more knowledge of the HW
than we can reasonably expect to support for the indefinite future.

I think it is accepted we cannot guarantee low level tracepoints will be
supportable in the future world of GuC scheduling. (How and what we will
do there is yet unresolved.) But at least we get much better usability
for platforms up to there, and for very small effort. The idea is not to
mark these as ABI but just improve user experience.

You are, I suppose, worried that if these tracepoints disappeared due to
being unimplementable someone will complain?

They already do...

I just want anyone to be able to run trace.pl and see how the virtual
engine behaves, without having to recompile the kernel. And the VTune
people want the same for their enterprise-level customers. Both tools are
ready to adapt should it be required. It's, I repeat, just usability and
user experience out of the box.

The out-of-the-box user experience should not require the use of such
tools in the first place! If they are trying to work around the kernel
(and that's the only use of this information I see) we have bugs
aplenty.

[snip because I repeated myself]

I think my issues boil down to:

  1 - people will complain no matter what (when it changes, when it is no
      longer available)

  2 - people will use it to work around, not fix; the information about kernel
      behaviour should only be used with a view to fixing that behaviour

As such, I am quite happy to have it limited to driver developers that
want to fix issues at source (OpenCL, I'm looking at you). There's tons
of other user observable information out there for tuning userspace,
why does the latency of runnable->queued matter if you will not do anything
about it? Other things like dependency graphs, if you can't keep control
of your own fences, you've already lost.

This is true, no disagreement. My point was simply that we can provide
this info easily to anyone. There is a bit of an analogy with perf
scheduler tracing/map etc.

I don't see any value in giving the information away, just the cost. If
you can convince Joonas of its merit, and if we can define just exactly
what ABI it constitutes, then I'd be happy to be the one who says "I
told you so" in the future for a change.

I think Joonas was okay in principle that we soft-commit to _trying_ to
keep _some_ tracepoints stable-ish (where it makes sense and after some
discussion for each) if IGT also materializes which auto-pings us (via
CI) when we break one of them. But I may be misremembering, so Joonas,
please comment.

Regards,

Tvrtko