On Wed, 2004-05-05 at 23:11, Will Cohen wrote:
> I work on performance tools at Red Hat. I have been told there is
> interest in tuning the desktop to improve performance. I have a number
> of questions to help identify the work needed in this area. I would be
> interested in any answers that people have for the following questions.
>
> What is the set of software in the "Desktop" (executable names and/or
> RPM packages)?

For the last few years I've been working on and off on the performance
of Nautilus. At this point Nautilus performs a lot better, but we still
don't fully grasp its performance properties. I'll try to give you some
idea of what I've been doing.

> What specific performance problems have people observed so far in the
> desktop? For example heavy CPU or memory usage by particular
> applications. Another example: long latency between event and
> resulting action.

Nautilus has had various problems. One important one is the time it
takes to open a new window; others are startup time, the time to read
a large directory, and total memory use.

> What metrics were used to gauge the effect of software changes on
> performance?

In general, the slowness has been on a scale that you could time with a
wristwatch (e.g. directory load), and at other times I've put in
printf()s to print time() at specific points in the app. Often you can
see the performance increase just by using the app.

> What performance tools have people used so far to identify performance
> problems with desktop applications?

I use a variety of tools:

* printfs in strategic places to try to figure out what gets called,
  when it gets called, and how long it takes.

* Sprinkling access ("doing <foo>", 0) calls in the code, then running
  the app under strace -tt, which shows you what sort of I/O is done.
  You can look at the access lines in the log to see what is happening
  at the code level, including timestamps.

* The sampling profiler in eazel-tools in GNOME CVS. This is a sampling
  profiler that you LD_PRELOAD into your app. It's not perfect, but it
  gives you at least some data when used with shared libraries (as
  opposed to gprof), and it produces gprof-style output.

* KCachegrind. I've only used this a bit; the performance of Nautilus
  while running under it is pretty poor, so it's hard to use.

* memprof. This is an excellent app for tracking down leaks and large
  users of memory.

> How well or poorly did the performance tools work in identifying the
> performance problem?

While they did help, they are not as useful as I would like. They
require a lot of work to set up, and the presentation/data-mining
features are pretty limited.

In general, debugging desktop apps is quite different from debugging
low-level, non-interactive apps. First of all, they are structurally
much more complex, relying on many shared libraries, several processes
with various sorts of IPC, and lots of file I/O and user input.
Secondly, the typical call traces are very deep (60 or even 80 frames
are not uncommon), and often highly recursive. A typical backtrace
involves several signal emissions, where each emission is on the order
of 5 function calls deep (just for the signal emission code, not the
called function). These emission functions are also typically the same,
so they intermingle the stack traces.
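To make this concrete, here is a minimal, self-contained C sketch of
the shape of the problem. It is toy code, not the real GObject
machinery: emit() and handler_t are made-up stand-ins for
g_signal_emit() and the signal dispatch, and the callers/handlers are
just the hypothetical names used in the trace below. Every emission,
for every signal and every caller, funnels through one shared dispatch
function:

  #include <stdio.h>

  typedef void (*handler_t) (void);

  /* Toy stand-in for g_signal_emit(): all emissions, for all signals
   * and all callers, go through this single function, so a profiler
   * attributes their time to emit(). */
  static void
  emit (const char *signal_name, handler_t handler)
  {
      /* locate callback, call it */
      printf ("emitting %s\n", signal_name);
      handler ();
  }

  static void signal_a_handler (void) { /* foo (); */ }
  static void signal_b_handler (void) { /* bar (); */ }

  static void caller_a (void) { emit ("signal_a", signal_a_handler); }
  static void caller_b (void) { emit ("signal_b", signal_b_handler); }

  int
  main (void)
  {
      caller_a ();
      caller_b ();
      return 0;
  }

In a profile of something shaped like this, the cost of both callers is
funneled through the shared emit() frame, and the per-caller breakdown
gets lost unless the tool records full call chains.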
Take this simplified backtrace for instance:

A:
  signal_a_handler ()  - foo(); return TRUE;
  g_signal_emit ()     - locate callback, call it
  caller_a()           - g_signal_emit (object, "signal_a", data)

B:
  signal_b_handler ()  - bar(); return TRUE;
  g_signal_emit ()     - locate callback, call it
  caller_b()           - g_signal_emit (object, "signal_b", data)

When looking at a profile of code like this, what you see is that
caller_a() uses a lot of time, but when you go into it to see what it
does, you end up looking at g_signal_emit, which also gets called from
lots of other places, like B, so it is very hard to figure out how much
of that time is actually from the A call.

It gets even worse in the (very common) situation of a signal handler
itself emitting a signal. This creates a mutual recursion into the
g_signal_emit() function, similar to the A+B case above:

  signal_b_handler ()
  g_signal_emit ("signal_b")
  signal_a_handler ()
  g_signal_emit ("signal_a")
  caller_a()

When stepping into the g_signal_emit called from signal_a_handler, it
looks as if it calls signal_a_handler again, since that is another
child of g_signal_emit. Profilers just don't handle this very well.

Here are a couple of issues I have with current profiling tools:

* They have no way of profiling I/O and seeks. A lot of our problems
  are due to reading too many files, reading files too often, or paging
  in data/code. Current profilers just don't show this at all.

* Little support for tracking issues with IPC calls between different
  processes, whether those are X inter-client calls (e.g. for DnD) or
  CORBA calls to some object.

* Poor visualization of the data, especially with mutually recursive
  calls as described above.

Generally, all of the fixes I've done have been of the type "don't do
this incredibly stupid thing", whether that was an O(n^2) algorithm
from treating a list as an array in loops, reading the same file over
and over again, or something else. I've never *once* had to count
cycles in some hot function or anything like that. It's always about
adding a cache, doing something in a different way, or just avoiding
the expensive, stupid thing. However, the stupidities are buried in
lots and lots of code, and finding them in all the data a profiler
spews out is the real hard part.

> Were benchmarks used to test performance of desktop applications? If
> so, what type of benchmarks were used (e.g. micro benchmarks or
> measuring the amount of time required to do something in an
> application program)?

Typically not. It was all ad-hoc testing by the developer as part of
trying to track down some specific slowness.

> Were the benchmarks runnable in batch mode without human assistance?

Never.

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
 Alexander Larsson                                            Red Hat, Inc
                   alexl@xxxxxxxxxx    alla@xxxxxxxxxxxxxx
He's a superhumanly strong flyboy senator on the wrong side of the law.
She's a time-travelling motormouth detective with only herself to blame.
They fight crime!