On 09/04/2009 11:17 AM, Rick Peralta wrote:
> Hi Jeff,
> After looking at the ChunkD code and related systems, it is not clear that multi-threading will be a performance win, and it may have adverse effects.
> The critical issues come down to the storage, network and application. Given the server cluster application, a GbE network interface and rotating media storage, it is clear that the disk media speed is the bottleneck and seek time a significant hazard. The set-up time for a transfer is relatively small compared to the disk time for large transfers. Allowing multiple threads may cause thrashing of the I/O system, making seek time the leading bottleneck.
> We need a better model and testing to see what really makes sense in order to proceed intelligently.
The problem space for static file serving is fairly well studied at this
point, I think. Linux went through the growing pains of tuning static
file web serving a while back, yielding epoll(7), sendfile(2) and other
tools.
In general the idea is to set up transfers, and let the kernel
intelligently schedule resources.
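
A minimal sketch of "set up the transfer and let the kernel do it",
assuming Linux sendfile(2). The helper name send_range() is
hypothetical and is not taken from the chunkd source; on a non-blocking
socket the caller would wait for EPOLLOUT via epoll(7) and call it
again with the updated offset.

```c
#include <errno.h>
#include <sys/sendfile.h>
#include <sys/types.h>

/*
 * Hypothetical helper (not from chunkd): ask the kernel to copy up to
 * `count` bytes from in_fd, starting at *offset, to out_fd.  The data
 * never passes through a userspace buffer.  sendfile(2) advances
 * *offset itself.  Returns the number of bytes sent, or -1 on a hard
 * error.
 */
static ssize_t send_range(int out_fd, int in_fd, off_t *offset, size_t count)
{
    size_t left = count;

    while (left > 0) {
        ssize_t n = sendfile(out_fd, in_fd, offset, left);

        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: retry */
            if (errno == EAGAIN)
                break;          /* output full: caller waits for EPOLLOUT */
            return -1;          /* hard error */
        }
        if (n == 0)
            break;              /* hit end of input file */
        left -= (size_t)n;
    }
    return (ssize_t)(count - left);
}
```

A short return value is not an error here; it only means the caller
should resume from *offset once the descriptor is writable again.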
Multiple threads are necessary to reach full utilization of the storage
system. Even with pagecache and disk writeback caching, the normal
configuration, single-threaded non-blocking I/O cannot fully utilize a
modern, tagged-command-queueing storage device.
Single-threaded non-blocking I/O is also not actually non-blocking for
storage. Under standard POSIX API behavior, read(2) and write(2) on a
regular file block until the I/O completes, and O_NONBLOCK does not
change that. That means you stall all connected clients in order to
perform the I/O. If the I/O takes abnormally long, perhaps due to a
filesystem journal commit, the daemon sits idle when it could be
performing other work for other clients.
In general, multiple threads are the only way to keep all pipelines full.
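
To make that concrete, here is a rough sketch using plain Pthreads
(build with -pthread). It is an illustration only, not chunkd code, and
every name in it (read_job, job_queue, read_jobs_parallel, MAX_READERS)
is made up for the example. Several worker threads each issue a
blocking pread(2), so multiple requests are outstanding at once and the
kernel and the device queue are free to order them.

```c
#include <pthread.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

#define MAX_READERS 16          /* arbitrary cap for this sketch */

struct read_job {
    off_t   offset;             /* where to read from */
    size_t  len;                /* how many bytes */
    void   *buf;                /* caller-supplied destination */
    ssize_t result;             /* pread() return value */
};

struct job_queue {
    int              fd;
    struct read_job *jobs;
    size_t           njobs;
    size_t           next;      /* index of next unclaimed job */
    pthread_mutex_t  lock;      /* protects `next` */
};

/* Worker: claim the next job under the lock, do the blocking read
 * outside the lock, repeat until no jobs remain. */
static void *reader_main(void *arg)
{
    struct job_queue *q = arg;

    for (;;) {
        pthread_mutex_lock(&q->lock);
        size_t i = q->next;
        if (i < q->njobs)
            q->next++;
        pthread_mutex_unlock(&q->lock);

        if (i >= q->njobs)
            return NULL;

        struct read_job *j = &q->jobs[i];
        j->result = pread(q->fd, j->buf, j->len, j->offset);
    }
}

/* Run all jobs on up to `nthreads` workers and wait for them.
 * Returns 0 if at least one worker ran, -1 otherwise. */
static int read_jobs_parallel(int fd, struct read_job *jobs,
                              size_t njobs, unsigned nthreads)
{
    struct job_queue q = { .fd = fd, .jobs = jobs, .njobs = njobs, .next = 0 };
    pthread_t tids[MAX_READERS];
    unsigned started = 0;

    if (nthreads == 0 || nthreads > MAX_READERS)
        return -1;

    pthread_mutex_init(&q.lock, NULL);
    for (; started < nthreads; started++)
        if (pthread_create(&tids[started], NULL, reader_main, &q) != 0)
            break;              /* carry on with the workers we have */
    for (unsigned i = 0; i < started; i++)
        pthread_join(tids[i], NULL);
    pthread_mutex_destroy(&q.lock);

    return started > 0 ? 0 : -1;
}
```

A real daemon would keep the workers alive and feed them from the
event loop rather than joining per batch; the point here is only that
a slow read stalls one worker instead of every connected client.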
With regard to the I/O system thrashing, seek time, etc.: in general,
you always want to start out letting the kernel manage that stuff, and
then only create a hand-built I/O subsystem that uses O_DIRECT if we
start seeing problems [that others in other apps have not seen...]
> Also, there is a wide variety of threading options. Pthreads is probably the most broadly understood and supported.
The existing GLib framework's gthread stuff should be fine for us. It
works on all Unices plus Windows, using kernel threads where possible.
Jeff