Problem description
===================

Currently, the time to boot the Linux desktop, from the point where the power switch is turned on to the point where the user can start doing work, is roughly two minutes. During that time, there are basically three resources being used: the hard disk, the CPU, and the natural latency of external systems - the time it takes a monitor to respond to a DDC probe, the time it takes for the system to get an IP address via DHCP, and so forth.

Ideally, system boot would involve a 3-4 second sequential read of around 100 megabytes of data from the hard disk, CPU utilization would be parallelized with that, and all queries to external systems would be asynchronous ... startup continues, and once the external system responds, the system state is updated. Plausibly the user could start work in under 10 seconds on this ideal system.

The challenge is to create a single poster showing graphically what is going on during the boot, how resources are being utilized, how the current boot differs from the ideal world of 100% disk and CPU utilization, and thus where the opportunities for optimization are.

Graphical Ideas
===============

Presumably, the main display would be a timeline with wall clock time on the horizontal (or vertical) axis. Then, you'd have a tree with lines representing the processes running at a particular time. The process lines would have attributes indicating state - perhaps red when waiting for disk, green when running, dotted when sleeping or blocking on IO.

Extra lines might be added to the graph to indicate dependencies between processes. If a process calls waitpid() on another process, a dotted line could be added connecting the end of the other process back to the first process. Similar lines could be added when a write from one process causes another process that was waiting in a read() or select() to wake up.

While many thousands of processes are run during system boot, this doesn't mean the graph has to have vertical space for all of them ... the vertical space needed is basically determined by the number of processes that are running at once.

Parallel to the display of processes would be a display of overall CPU and disk utilization. CPU utilization on a single-processor system is pretty straightforward ... either the CPU is running at a point in time or it isn't. Considerations like memory bandwidth, processor stalls, and so forth matter when optimizing particular algorithms, but an initial guess (that the poster would confirm or deny) is that CPU is not a significant bottleneck for system start.

Disk utilization is more complex because of the huge cost of seeks; while modern drives can easily read 30-40 megabytes/second, a seek still takes 5-10ms. Whether or not the drive is active tells little about how well we are using it. In addition, there is a significantly long pipeline of requests to the disk, and seeks aren't even completely predictable because the drive may reorder read requests. But a simple display that might be sufficient is a graph of the instantaneous bandwidth (averaged over a small period of time) being achieved from the disk drive; a rough way of sampling this is sketched at the end of this section. If processes are red (waiting on the drive) and the bandwidth is low, then there is a problem with too much seeking that needs to be addressed.

You'd also want text in the poster; process names are one obvious textual annotation that should be easy to obtain. It might also be interesting for processes to be able to provide extra annotations; for the X server to advertise that it is waiting for a DDC probe, and so forth.
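To make the "instantaneous bandwidth" idea concrete, here is a rough sampling sketch, not a real design: it assumes a 2.6 kernel exposing /proc/diskstats, where the third field after the device name counts 512-byte sectors read since boot; the device name, sampling interval, and plain print output are just placeholders.

import time

DEVICE = "sda"          # assumption: the drive being watched
INTERVAL = 0.25         # seconds between samples (placeholder)
SECTOR_SIZE = 512       # /proc/diskstats counts 512-byte sectors

def sectors_read(device):
    # /proc/diskstats lines: major minor name reads merges sectors ms ...
    with open("/proc/diskstats") as f:
        for line in f:
            fields = line.split()
            if len(fields) > 5 and fields[2] == device:
                return int(fields[5])
    raise KeyError("device %s not found in /proc/diskstats" % device)

def main():
    prev, prev_t = sectors_read(DEVICE), time.time()
    while True:
        time.sleep(INTERVAL)
        cur, cur_t = sectors_read(DEVICE), time.time()
        mb_per_s = (cur - prev) * SECTOR_SIZE / (cur_t - prev_t) / 1e6
        # A real collector would log timestamped samples for later
        # plotting; printing just keeps the sketch self-contained.
        print("%.3f %.1f MB/s read" % (cur_t, mb_per_s))
        prev, prev_t = cur, cur_t

if __name__ == "__main__":
    main()

Plotting the logged samples alongside the process timeline would give exactly the red-process/low-bandwidth comparison described above.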
Implementation thoughts
=======================

It should be possible to start with a limited set of easily collected data and already get a useful picture. Useful data collection could be as simple as taking a snapshot of the data that the "top" program displays a few times a second during boot; a rough sketch of such a collector follows this section. That already gives you a list of the running processes, their states, and some statistics about global system load.

Moving beyond that would probably involve instrumenting the kernel to give notification of process start and termination (possibly providing times(2)-style information on termination), to provide visibility for processes that run for too short a time to be picked up by polling. Better kernel reporting of disk utilization might also be needed.

It might be possible to employ existing tools like oprofile; however, the level of detail oprofile provides is really overkill ... compressing 2 minutes of runtime involving 1000 processes onto a single poster doesn't really allow worrying about what code is being run by a process at a particular point.

Obviously, one challenge of any profiling tool is to avoid affecting the collected data. Since CPU and memory don't seem to be bottlenecks, while disk definitely is a bottleneck, a low-impact implementation might be a profiling daemon, started early in the boot process, that accumulates information to be queried and analyzed after the boot finishes.

While producing a single poster would already be enormously useful, the ability to recreate the poster on any system at any point would be many times more so. So, changes to system components that can be merged into the upstream projects, and that can be activated at runtime rather than needing to be conditionally compiled in, are best.
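As a starting point for the "snapshot what top shows" approach, here is a minimal polling collector sketch. The log path and interval are placeholders, it only records process names and states from /proc/<pid>/stat (D state being the "waiting for disk" red in the graph), and a real collector would also sample /proc/stat and the disk counters on each pass.

import os
import time

INTERVAL = 0.25                  # seconds between snapshots (placeholder)
LOGFILE = "/var/log/bootprof"    # placeholder path

def snapshot():
    # Yield (pid, command, state) for every process currently in /proc.
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        try:
            with open("/proc/%s/stat" % entry) as f:
                stat = f.read()
        except IOError:          # the process exited while we were looking
            continue
        # The command name is parenthesized and may contain spaces, so
        # split around the last ')' rather than naively on whitespace.
        close = stat.rfind(")")
        comm = stat[stat.find("(") + 1:close]
        state = stat[close + 1:].split()[0]   # R, S, D (disk wait), Z, ...
        yield entry, comm, state

def main():
    log = open(LOGFILE, "w")
    while True:
        now = time.time()
        for pid, comm, state in snapshot():
            log.write("%.3f %s %s %s\n" % (now, pid, comm, state))
        log.flush()
        time.sleep(INTERVAL)

if __name__ == "__main__":
    main()

Processes that live for less than one polling interval will still be missed, which is exactly where the kernel notification of process start and termination mentioned above comes in.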
Motivation
==========

I think this project would be a lot of fun to work on; you'd learn a lot about how system boot-up works and about performance measurement. And beyond that, there is a significant design and visualization element in figuring out how to display the collected data. It would also make a good small-scale academic project.

But to provide a little extra motivation beyond that: if people pick this up and come up with interesting results, I'll (personally) pay for up to 3 posters of up to 4' x 6' to be professionally printed and laminated. I'll be flexible about how that works ... if multiple people collaborate on one design, they can get a copy each of that single design.

- Owen Taylor