Re: Performance tuning the Fedora Desktop

Will Cohen <wcohen@xxxxxxxxxx> · Wed, 12 May 2004 17:57:27 -0400

 I would like to thank people for the response to the thread on
performance tuning the Fedora Desktop. The responses have been very
helpful.  Below is what I have written up as a result of the comments
I got.

APPLICATIONS

There are a large number of packages that could be considered part of
the desktop. However, the most critical ones appear to be the mail
clients, web browsers, word processors, and window manager. There were
several comments about up2date/rpm related software also.

A short list of important application RPMs:

     evolution
     mozilla
     ooffice
     gnome-terminal
     gnome-* 

*FIXME* would like to narrow this down to a list of executables *FIXME*

PERFORMANCE PROBLEMS

The performance issues with desktop applications differ significantly
from those of traditional single-threaded, batch-oriented benchmarks
such as SPEC CPU2000. Desktop applications often have multiple
threads. Their footprints are larger due to the number of shared
libraries used for GUI and network related operations. Also, users
typically have several desktop applications open concurrently. As a
result, cache misses and paging are more likely to be an issue with
desktop applications. Optimizing for code size may boost performance
more than aggressive compiler optimizations that increase program
size.

Latency is more of an issue than throughput for desktop
applications. How long it takes for the result of a mouse click to be
observed is more important than how many such actions the machine can
do in a given period of time. Interactivity is the issue; ideally,
the delay for an action should be below a person's threshold of
detection. For example, a new window appears to pop up instantly even
though in reality it may take 10 milliseconds to do.

Some latency issues may be outside the application's control. For
example, performance may be limited by the DNS lookups that convert
the host name in a URL into an IP address.  The actual rate at which
the network provides data to the web browser also affects the
perceived performance.  Short of eliminating the dependency on the
outside service, nothing on the local machine is going to improve
performance.
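
One way to tell whether name resolution is the slow part is to time
the lookup by itself, apart from the rest of the request. A minimal
sketch, assuming Python; time_lookup is a hypothetical helper (not
part of any tool mentioned in this thread), and the resolver is a
parameter so a different lookup function can be swapped in:

```python
import socket
import time


def time_lookup(host, resolver=socket.getaddrinfo):
    """Return (elapsed_seconds, result) for one name lookup of `host`.

    Hypothetical helper for illustration only. By default it calls the
    system resolver through socket.getaddrinfo(host, None).
    """
    start = time.perf_counter()
    result = resolver(host, None)
    elapsed = time.perf_counter() - start
    return elapsed, result
```

Calling time_lookup() twice on the same name and comparing the two
times gives a rough idea of how much a local cache is helping.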

METRICS

Unfortunately, many people's metrics for desktop applications were
literally eyeballed: click a menu item and see how long it takes for
the result to occur. This is difficult to automate and script.  We
really want benchmarks where, at the very least, the measurement does
not require a person to measure by hand and then transcribe the
result into a machine readable format.  Better still would be an
automated test like the performance tracking for GCC,
http://people.redhat.com/dnovillo/spec2000/.  This would make it
easier to see when a code change fixed a performance problem (or
introduced one).  Wall clock times obtained from strategic printfs of
gettimeofday values and from strace were used to get timing
information.
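
As an illustration of taking the person out of the measurement loop,
here is a minimal sketch that runs a command several times and
records the wall clock time of each run. It assumes Python and a
command that exits on its own; the names time_command and summarize
are made up for this example.

```python
import subprocess
import sys
import time


def time_command(argv, runs=5):
    """Run `argv` `runs` times; return the wall clock seconds of each run."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        # Discard the command's output so terminal speed is not measured.
        subprocess.run(argv, check=True,
                       stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL)
        timings.append(time.perf_counter() - start)
    return timings


def summarize(timings):
    """Return (minimum, median) so one noisy run does not dominate."""
    ordered = sorted(timings)
    mid = len(ordered) // 2
    if len(ordered) % 2:
        median = ordered[mid]
    else:
        median = (ordered[mid - 1] + ordered[mid]) / 2.0
    return ordered[0], median
```

For example, summarize(time_command([sys.executable, "-c", "pass"]))
gives the interpreter start-up time. An interactive application
would need some way to exit once it is ready for use, which this
sketch does not provide.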

The memory footprint of a desktop application is another metric of
interest. The larger the memory footprint, the more time it takes to
start up.  Also, applications with a large memory footprint are likely
to have more cache misses and page faults, slowing the execution of
the program.
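
A rough footprint number can be collected without any special tools
by reading /proc/<pid>/status on Linux. The sketch below assumes
Python and the VmSize and VmRSS fields of that file; the function
names are made up for this example. Note that VmRSS counts shared
library pages in every process that maps them, so summing it across
applications overstates the total.

```python
def parse_status_kb(status_text, fields=("VmSize", "VmRSS")):
    """Extract the selected fields (in kB) from /proc/<pid>/status text."""
    values = {}
    for line in status_text.splitlines():
        name, _, rest = line.partition(":")
        if name in fields:
            # The value looks like "   12345 kB"; keep only the number.
            values[name] = int(rest.split()[0])
    return values


def footprint_kb(pid):
    """Read VmSize and VmRSS for a running process (Linux only)."""
    with open("/proc/%d/status" % pid) as f:
        return parse_status_kb(f.read())
```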

File accesses can also affect performance. When a desktop application
is initially started, shared library files need to be
opened. Additionally, other files with preference information may be
opened. In the case of Nautilus, it may need to examine all the files
in the directory it is browsing. Restructuring the code to reduce the
number of file accesses could improve the performance of applications.
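
To get a first idea of how many files an application opens, the
output of strace can be tallied. A minimal sketch, assuming Python
and a log in the usual strace line format (for example from
"strace -f -o trace.log <command>"); count_opens is a made-up name
and the regular expression only handles the common cases of open
and openat lines.

```python
import re
from collections import Counter

# Matches e.g.  open("/etc/ld.so.cache", O_RDONLY) = 3
# and           openat(AT_FDCWD, "/etc/passwd", O_RDONLY) = -1 ENOENT (...)
OPEN_RE = re.compile(
    r'\b(?:open|openat)\((?:[^",]+,\s*)?"([^"]*)".*\)\s*=\s*(-?\d+)')


def count_opens(strace_lines):
    """Return (succeeded, failed) Counters keyed by path name."""
    succeeded = Counter()
    failed = Counter()
    for line in strace_lines:
        match = OPEN_RE.search(line)
        if not match:
            continue  # not an open/openat call
        path, ret = match.group(1), int(match.group(2))
        if ret >= 0:
            succeeded[path] += 1
        else:
            failed[path] += 1
    return succeeded, failed
```

Paths that show up many times, or long runs of failed opens while
searching for a library or a preference file, are candidates for
the kind of restructuring described above.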

Round trips in the X protocol affect the latency of operations. For
example, a client that sends the X server a message to do an operation
and then waits for an acknowledgement that the operation is complete
can hurt performance over channels with long latency. This type of
problem affected the graphical installer for a distribution of Linux
for s390. Every screen update required a round trip, which tripled
the install time because a round trip latency was encountered for
every update of the progress bar.
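
The arithmetic behind this kind of slowdown is simple: if every
update blocks on one round trip, the waiting time is the number of
updates times the round trip latency. A small sketch in Python; the
function names and the numbers in the usage note are made up for
illustration and are not measurements from the s390 installer.

```python
def round_trip_overhead(updates, rtt_seconds):
    """Seconds spent waiting if every update blocks on one round trip."""
    return updates * rtt_seconds


def slowdown(base_seconds, updates, rtt_seconds):
    """Ratio of run time with blocking round trips to run time without."""
    total = base_seconds + round_trip_overhead(updates, rtt_seconds)
    return total / base_seconds
```

With hypothetical values of a 600 second install, 12000 progress bar
updates, and a 0.1 second round trip, slowdown(600, 12000, 0.1)
comes out at about 3. Batching the updates or dropping the
acknowledgement removes most of that cost.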

PERFORMANCE TUNING TOOLS

The data collected by the performance tools does not need to be
exact; in many cases the tools are being used to identify areas of
code that will make a difference in performance. Whether 30% or 33%
of the time is spent in a single routine is not that big a
difference.  Currently, developers are looking for the things that
have a large impact on performance.  Tools that significantly slow the
execution of the code change the interactions between the user and the
code. The instrumentation can also change the interactions between
instrumented and uninstrumented code. The slowness of the code and the
perturbations of the system make detailed instrumentation less
attractive to developers.

Some of the optimization work is finding inappropriate algorithms or
data structures used for particular tasks.  Knowing the context in
which a function is called is important, so call graph information is
essential.  Call graphs for desktop applications are complicated by
recursive code, signal handling, and co-routines.
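
To show what call graph information adds over a flat profile, here
is a minimal sketch that takes call stack samples and reports, for
each function, the samples spent in the function itself and the
samples spent in it plus everything it called. It assumes Python and
that each sample is a tuple of function names from outermost to
innermost; this is an illustration, not how sysprof or OProfile
store their data.

```python
from collections import Counter


def self_and_inclusive(samples):
    """Return (self_counts, inclusive_counts) over call stack samples."""
    self_counts = Counter()
    inclusive = Counter()
    for stack in samples:
        if not stack:
            continue
        self_counts[stack[-1]] += 1  # innermost frame was executing
        # set() so a recursive function is counted once per sample.
        for func in set(stack):
            inclusive[func] += 1
    return self_counts, inclusive


def callers_of(samples, func):
    """Count how often each direct caller appears just above `func`."""
    callers = Counter()
    for stack in samples:
        for parent, child in zip(stack, stack[1:]):
            if child == func:
                callers[parent] += 1
    return callers
```

A function with a small self count but a large inclusive count is
cheap by itself and expensive because of what it calls, and
callers_of() shows which contexts are responsible for those calls.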

Some GUI developers have developed profiling tools, for example
sysprof by Soeren Sandmann for GNOME
(http://www.daimi.au.dk/~sandmann/sysprof-0.02.tar.gz) and the Mozilla
performance tools (http://www.mozilla.org/performance/).

COURSE OF ACTION

1) Develop Desktop benchmarks

   Need to have some benchmarks to determine performance.

   Very incomplete list of suggested benchmarks:
   -gdm login to setup desktop
   -menu select desktop program _____ to time desktop program ready to use
   -cat text file to xterm

   Should have clear procedures for each so the results can be
   generated by anyone and compared.  It would be a bonus if the
   benchmark can be run from the command line, so the data collection
   can be automated.

2) Get baseline metrics on benchmarks

   Have a baseline to determine whether code changes are increasing or
   decreasing performance. This allows us to avoid the nebulous "feels
   faster" and "feels slower". Also use this data to find out where
   the most significant problems are, for example an important
   application taking ten minutes to start.

3) Improve performance monitoring/tuning tools and scripts

*FIXME* make the tuning tools information more concrete *FIXME*

   a) Need a trigger mechanism to start and stop profiling data
   collection on certain conditions or events. For example, start
   profiling when a menu item is selected and stop profiling when the
   action is complete. This would avoid interesting samples getting
   lost in the sea of boring long term sampling.

   b) Better tools to map out memory footprint. Reducing memory use
   is likely to help performance by reducing the time to load the
   application and related shared libraries. Related to this, consider
   tools that reorder functions in code to get better locality (like
   grope) and separate hot code from cold code.

   c) Easier means of navigating performance data, for example a
   breakdown of time spent in parent and children. Maybe pull some of
   the sysprof data analysis into the OProfile data analysis. Also
   maybe use the OProfile plug-in for Eclipse to visualize data.

   d) Take advantage of the use of shared libraries in code to insert
   instrumentation between the application and the function in the
   shared library when the library is loaded/linked in.
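
The effect wanted from the trigger mechanism in (a) can be
approximated after the fact by throwing away samples that fall
outside the interesting windows. The sketch below assumes Python,
timestamped samples, and a separate list of "start" and "stop"
events on the same clock; all names are hypothetical. It does not
start or stop a real profiler, it only filters data that has already
been collected.

```python
def samples_in_windows(samples, events):
    """Keep only the samples taken between a "start" and the next "stop".

    samples: iterable of (timestamp, symbol)
    events:  iterable of (timestamp, kind), kind is "start" or "stop"
    """
    windows = []
    start = None
    for ts, kind in sorted(events):
        if kind == "start" and start is None:
            start = ts
        elif kind == "stop" and start is not None:
            windows.append((start, ts))
            start = None
    # A "start" with no matching "stop" is ignored.
    return [(ts, sym) for ts, sym in samples
            if any(lo <= ts <= hi for lo, hi in windows)]
```

A real trigger would avoid collecting the uninteresting samples in
the first place, which also reduces the overhead of profiling.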

-Will