Not mine. If I’m investigating where some latency comes from, the last thing I want is all-encompassing printouts that would themselves be a primary cause of latency.
Yes. But what I’m presently interested in are performance issues, and the messages I need are presently not in spice, nor is there sufficient infrastructure to add them. glib definitely does not do what I want; see below for more details.
Let me take some examples straight from Tao3D, which I dearly miss in spice today (with actual output from Tao3D for the first two):

1. fps: Report frames per second.

   Time;PageNum;FPS;Exec;MaxExec;Draw;MaxDraw;GC;MaxGC;Select;MaxSelect;GCWait;MaxGCWait
   1496761089.058;1;0.000;-1;0;-1;0;-1;0;-1;0;-1;0
   1496761093.755;1;13.800;50;3217;9;482;1;13;0;0;0;9
   1496761094.756;1;55.400;13;3217;3;482;0;13;0;0;0;12
   1496761095.759;1;94.200;8;3217;2;482;0;13;0;0;0;12
   1496761096.759;1;136.400;6;3217;2;482;0;13;0;0;0;12

2. memory: Report memory usage on a regular basis (ideally, component by component, e.g. separating codecs from spice code).

   NAME           TOTAL  ALLOC  FREED
   N2XL4TreeE         0      0      0
   N2XL7IntegerE   6132   1527    224
   N2XL4RealE      4088    215   3478
   N2XL4TextE      6132   1217    336
   N2XL4NameE     40880  15664      0
   N2XL5InfixE    33726  10246    332
   N2XL6PrefixE   12264   3177      0
   N2XL7PostfixE   1022    254      0
   N2XL5BlockE     5110   1017    112
   N2XL7ContextE   1022     15      0
   N2XL7SymbolsE   4088   1432      0
   N2XL7RewriteE   9198   5519      0
   =====          =====  =====  =====
   Kilobytes      9820K  3293K   223K

   [GCThread] Collect time 3 ms
   [GCThread] GC collect started
   [GCThread] Collect time 0 ms
   [GCThread] GC collect started

3. bandwidth: Report network bandwidth usage (incoming and outgoing).

4. latency: Report observed network latency.
My experience tells me the exact opposite. If you have a structure like this in place, you start investing a bit more in your traces, so that they can be reused later.
Valgrind? Running a heavily modified executable at a quarter of the speed on a good day? And clearly, based on the massive memory leak I reported last week, such tools are not used on a regular basis. The purpose of something like -t memory is to give you a broad overview: to be able to say “we use 3G of memory, 1.5G of which is for graphics, 1G of which is for cached images”, so that you know where to focus your effort. Using valgrind to give you that information? Tell me how.
In both cases, it’s runtime configuration intended primarily for developers. To better see the continuity between the two, imagine that we have a trace focusing on JPEG encoding. If we then want to focus on understanding JPEG quality, I can do something like -t jpeg -t jpeg_quality=50. So it’s really part of the same kind of usage pattern.
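To make the continuity concrete, a trace and a tweak can share a single parser: a bare name enables a flag, while name=value stores the value. This is only a hedged sketch of the idea, using hypothetical names (tweaks, tweak_set, jpeg, jpeg_quality), not the actual spice implementation:

```c
#include <stdlib.h>
#include <string.h>

/* Illustrative: one structure holds both traces and tweaks, as plain ints. */
struct {
    int jpeg;           /* -t jpeg            -> 1  */
    int jpeg_quality;   /* -t jpeg_quality=50 -> 50 */
} tweaks;

/* Parse one "-t" argument: "name" means "set to 1", "name=value"
 * stores the given integer value. Returns 1 on success, 0 if unknown. */
int tweak_set(const char *arg)
{
    char name[64];
    const char *eq = strchr(arg, '=');
    long value = 1;                         /* bare "-t name" means "on" */
    size_t len = eq ? (size_t)(eq - arg) : strlen(arg);

    if (len >= sizeof(name))
        return 0;
    memcpy(name, arg, len);
    name[len] = '\0';
    if (eq)
        value = strtol(eq + 1, NULL, 10);

    if (strcmp(name, "jpeg") == 0)         { tweaks.jpeg = (int)value; return 1; }
    if (strcmp(name, "jpeg_quality") == 0) { tweaks.jpeg_quality = (int)value; return 1; }
    return 0;                               /* unknown trace/tweak name */
}
```

With something like this, -t jpeg and -t jpeg_quality=50 go through the exact same code path, which is what makes traces and tweaks feel like one mechanism rather than two.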
The domains, if you want to call them that. But I’d rather reserve “domain” for glib domains, and “traces” / “tweaks” for what I’m proposing here. They are different concepts, much like domains and command-line options are different, even if you may use a domain name as a grep argument.
This is exactly why in this implementation, the macros in spice-traces.def mandate an “info” field for self-documentation.
The SPICE_TRACE environment variable can be used to configure that if that’s what you prefer. I personally prefer a short command-line option.
I stand by my comment that it’s about how and where to log things. How: using named fields instead of pure ASCII text. Where: in a shared log that can be merged between applications, with the benefits you listed, namely that it can be queried. I don't know if your solution could fit. If it does, it would be great to propose it to glib instead.

I could probably make conditional tracing part of glib, but only with a much more expensive implementation. In what I proposed, each flag is a bit in a structure. So when you write IFTRACE(foo), it translates to nothing more than if (spice_traces.foo). This relies on the compiler and linker doing the lookup job at compile time. A library implementation would have to do a run-time lookup of the trace name, making it much more expensive.

Alternatively, a header-only implementation is feasible, but that makes it much more difficult to statically define where flags are stored. For instance, say glib started using it: then glib would need a “glib” set of flags, and spicy a “spicy” set of flags. Let’s say two structures. An implementation of IFTRACE(flag) would have to somehow remember whether “flag” was defined in “glib” or in “spicy”. So you end up with a much more complex implementation. At some point, we did something like that for XL vs Tao3D, but that used some C++ trickery that I don’t know how to replicate in C (namespace lookup IIRC, I’d have to check the code). Or you could forget about being smart, and just write something like IFTRACE(glib.flag) and IFTRACE(spicy.flag), with possibly appropriate macros. Hmm. Thinking aloud here. Why not…
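For concreteness, the compile-time lookup described above can be sketched like this. The names (spice_traces, IFTRACE, spice_trace_enable) are illustrative, and one byte per flag is used here for simplicity where the actual code may use bitfields:

```c
#include <string.h>

/* Hypothetical sketch: each trace flag lives in a global structure, so
 * the flag test compiles down to a direct memory access with no
 * run-time name lookup. */
struct spice_traces_s {
    unsigned char jpeg;     /* trace JPEG encoding */
    unsigned char memory;   /* trace memory usage  */
};

struct spice_traces_s spice_traces;   /* all flags off by default */

/* The field offset is resolved by the compiler, so this expands to
 * nothing more than "if (spice_traces.foo)". */
#define IFTRACE(f) if (spice_traces.f)

/* Enabling by name (e.g. from "-t jpeg") does need a string lookup,
 * but only once, at option-parsing time, not on every trace test. */
int spice_trace_enable(const char *name)
{
    if (strcmp(name, "jpeg") == 0)   { spice_traces.jpeg = 1;   return 1; }
    if (strcmp(name, "memory") == 0) { spice_traces.memory = 1; return 1; }
    return 0;
}
```

The point is the asymmetry: the expensive by-name lookup happens once at startup, while the hot path, IFTRACE(jpeg) { ... }, costs a single load and branch.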
At the moment, I care more about instrumenting Spice to understand its behavior. But for what it’s worth, the current GTK implementation does not use anything from glib to implement its conditional tracing code. What they have is a more naive version of what I did. It’s more naive because it’s not scalable (limited to 64 flags on a 64-bit system), and you have to hand-edit flags, options, etc. So it’s complicated to add a new trace. Here, for example, is what it takes to add a “snapshot” trace: https://github.com/GNOME/gtk/commit/def94f03e240babfd589c8ba8e7cc4961a3d6aab. Contrast with what it would take with my implementation: https://github.com/c3d/spice-common/commit/5b2919615c0f55f05a0e1ec6c1a5f238e97b00f8.

The combination of “not scalable” and “complicated to add a trace” probably explains why there are presently only 20 traces active in the whole code. It's also unnecessarily slow, since gtk_get_display_debug_flags() iterates over displays in a loop to find display-specific flags, a non-requirement for spice (and if it became a requirement, looping over all displays would probably be the last thing on my mind to implement it).
Yes. Using a manual bitflag is what I called a “naive” implementation. My use of the preprocessor allows the option-parsing code to be generated automatically. So you change one file and only one file, and you don’t need to manually write things like 1<<19 when you add a trace.
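The one-file approach can be sketched with a classic X-macro: a single list of SPICE_TRACE(name, info) entries, as in a spice-traces.def-style file, is expanded several times to generate the flags structure, the option parsing, and the self-documenting help text. The trace names and macro spellings below are illustrative, not the actual contents of spice-traces.def:

```c
#include <stdio.h>
#include <string.h>

/* --- what a spice-traces.def-style file would contain, one line per trace --- */
#define SPICE_TRACES                                        \
    SPICE_TRACE(fps,       "Report frames per second")      \
    SPICE_TRACE(memory,    "Report memory usage")           \
    SPICE_TRACE(bandwidth, "Report network bandwidth usage")

/* Expansion 1: the flags structure, one byte per trace (no manual 1<<19). */
struct traces {
#define SPICE_TRACE(name, info) unsigned char name;
    SPICE_TRACES
#undef SPICE_TRACE
} traces;

/* Expansion 2: option parsing generated from the same list, so adding a
 * trace means editing exactly one line in one file. */
int trace_set(const char *arg)
{
#define SPICE_TRACE(name, info) \
    if (strcmp(arg, #name) == 0) { traces.name = 1; return 1; }
    SPICE_TRACES
#undef SPICE_TRACE
    return 0;
}

/* Expansion 3: self-documentation, using the mandated "info" field. */
void trace_help(void)
{
#define SPICE_TRACE(name, info) printf("  %-10s %s\n", #name, info);
    SPICE_TRACES
#undef SPICE_TRACE
}
```

Each expansion redefines SPICE_TRACE for its own purpose, so the structure, the parser, and the help text can never drift out of sync with each other.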
Well, the kernel tracing infrastructure is intended to trace stuff inside a kernel. It makes quite a bit of sense that a system like KVM/QEMU, which uses kernel facilities extensively, also uses the kernel tracing facilities: that’s one case where you want to see events in the proper sequence between kernel and user-space operations. An argument could be made for tracing infrastructure in, say, the QXL driver. But for spicy, it does not make any sense to me.
For the kernel, yes. Do you know of any purely user-space program that uses ftrace for their own internal state?
As I wrote earlier, that kernel infrastructure is much closer to the “recorder” I talked about earlier. The recorder is *much* simpler than the kernel trace API: two headers (ring buffer and recorder), one C file, 1262 lines of code including comments. By comparison, last time I checked, the kernel tracing code was 51308 lines of code, 8 headers, automatic code-generation tools, etc. And unlike the kernel tracing subsystem, the recorder is specifically designed to integrate non-intrusively into a user-space program. There’s something to be said for simplicity. So far, I have integrated “my” recorder into Mesa, Spice or BTRFS in a couple of hours each time. There’s no way to do that with the kernel infrastructure IMO.

But again, this has absolutely nothing to do with conditional tracing code. This is yet another case of “where do I store stuff and how” (e.g. for the recorder, in a ring buffer). This has nothing to do with enabling / disabling specific debug code (although, granted, the ftrace interface does have this capability, but in a way that only applies to the kernel, i.e. writing to /sys files, not command-line options or environment variables).
Actually, the purpose of this kind of tracing is to offer domain-specific, focused tracing. So in the majority of cases, you don’t need to consider what is happening outside of the domain; on the contrary, you really want it out of the way. And if/when you do care about it, then you get that data in your instrumentation, not by activating some tracing in another domain. For example, in Tao3D, the number of polygons is important, but -t polygons does not give you that information by activating some polygon tracing in the OpenGL domain. Instead, it collects statistics relevant to Tao3D from OpenGL and presents them in a usable form.
Agreed, ftrace is definitely not a solution to my problem, let alone the best one. I can consider making a glib version of spice_trace(). But I think it will be unnecessarily complicated to get it accepted. And I don’t see what we’d gain from it.
Well, we may have a long list of tools and libraries, but right now, prying even what I consider the most basic information out of Spice is a real pain IMO. If you have a nice grep for the use cases I listed above (e.g. FPS, network bandwidth, memory usage, etc.), let me know. For now, to me, this is a problem that is not solved, and one that I believe my proposal would solve relatively elegantly. Also, while https://bugs.freedesktop.org/show_bug.cgi?id=91838 mentions a possible solution to the same class of problems my proposal addresses, I believe the solution presented here is much simpler and easier to implement: 91838 has been open since Sep 2015, while my solution took me a couple of days to implement and test. As for integrating the recorder… well, if there is an existing framework that does what I’m doing with it, then fine. I don’t know of any, and I looked.

Christophe
_______________________________________________
Spice-devel mailing list
Spice-devel@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/spice-devel