J. Bruce Fields wrote:
On Fri, Jun 13, 2008 at 04:58:04PM -0700, Dean Hildebrand wrote:
The reason it is an art is that you don't know the hardware that exists
between the client and server. Talking about things like BDP is fine,
but in reality there are limited buffer sizes, flaky hardware,
fluctuations in traffic, etc etc. Using the BDP as a starting point
though seems like the best solution, but since the Linux server doesn't
know anything about the BDP, it is tough to hard-code any value into the
kernel. As you said, the best we can do is give a reasonable default
value and then make sure people can play with the knobs. Most
people use NFS within a LAN, and to date there has been little if any
discussion on using NFS over the WAN (hence my interest), so I would
argue that the current values might not be all that bad as defaults (at
least we know the behaviour isn't horrible for most people).
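To make the BDP point concrete, here is a back-of-the-envelope
calculation with made-up numbers (1 Gbit/s link, 50 ms round trip):

    BDP = bandwidth * RTT
        = 125 MB/s * 0.05 s
        ~= 6.25 MB

so a roughly 6 MB receive buffer is needed to keep that WAN pipe full,
while the same transfer over a 1 ms LAN RTT only needs about 125 KB.
That 50x spread is exactly why it is hard to pick one value that suits
everybody.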
Networks are messy. Anyone who wants to work in the WAN is going to
have to read about such things; there is no way around it. A simple
Google search for 'tcp wan' or 'tcp wan linux' gives loads of
suggestions on how to configure your network, so it really isn't a
burden on sysadmins to do such a search and then use the given knobs to
adjust the TCP buffer size appropriately. My patch gives sysadmins the
ability to do the Google search and then have some knobs to turn.
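For reference, the kernel-wide knobs those guides all end up pointing
at are the standard TCP sysctls. A typical WAN-ish configuration looks
something like the following (the 16 MB ceilings are only an
illustration, not a recommendation):

    # /etc/sysctl.conf -- raise the ceilings, then let TCP autotuning work
    net.core.rmem_max = 16777216
    net.core.wmem_max = 16777216
    # min / default / max socket buffer sizes, in bytes
    net.ipv4.tcp_rmem = 4096 87380 16777216
    net.ipv4.tcp_wmem = 4096 65536 16777216

Raising the maximums doesn't force big buffers onto every socket; it
only lifts the ceiling that autotuning (or an application's setsockopt)
is allowed to reach.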
Some sample TCP tuning guides that I like:
http://acs.lbl.gov/TCP-tuning/tcp-wan-perf.pdf
http://acs.lbl.gov/TCP-tuning/linux.html
http://gentoo-wiki.com/HOWTO_TCP_Tuning (especially relevant is the part
about the receive buffer)
http://www.linuxclustersinstitute.org/conferences/archive/2008/PDF/Hildebrand_98265.pdf
(our initial paper on pNFS tuning)
Several of those refer to problems that can happen when the receive
buffer size is set unusually high, but none of them give a really
detailed description of the behavior in that case--do you know of any?
In an earlier post, I referred to the saw-tooth pattern that will happen
with the window when the sender transmits faster than the receiver can
receive. I believe BIC and CUBIC try to reduce the impact by not
closing the window all the way, but it is still better not to
intentionally lose packets by setting the receive buffer too high. I'm
not sure if I sent this doc out already, but it also has some info on
tuned buffers vs. parallel TCP streams. It also shows some graphs of the
window closing once too many packets are lost.
http://acs.lbl.gov/TCP-tuning/TCP-Tuning-Tutorial.pdf
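If anyone wants to watch this on their own setup rather than in the
paper's graphs, one rough way (assuming a reasonably recent iproute2;
<server-ip> is just a placeholder) is to sample the congestion window
of the connection while a transfer runs and watch it get cut back after
losses:

    # print the congestion window roughly once a second
    while sleep 1; do ss -tin dst <server-ip> | grep -o 'cwnd:[0-9]*'; done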
Sections 2.1 and 2.2 of the following paper, published at SC2002, give
an interesting intro to tuning TCP buffers and the ups and downs of
using parallel TCP streams. They quote the GridFTP papers and indicate
that the best performance comes from parallel TCP streams combined with
tuned buffers. They describe the danger of setting the buffer size too
big as follows:
"Although memory is comparably cheap, the vast majority of the
connections are so small that allocating large buffers to each flow can
put any system at risk
of running out of memory."
http://www.supercomp.org/sc2002/paperpdfs/pap.pap151.pdf
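Their memory warning is easy to put in numbers (again, made-up
figures):

    1,000 connections * 16 MB fixed per-socket buffer = up to 16 GB

which is why pinning a large buffer to every connection scales so much
worse than letting autotuning grow only the connections that need it
(and why the global net.ipv4.tcp_mem pressure limits exist at all).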
(Note: both of the above docs are from the same person. There are
other docs, but they don't seem to be quite as clear.)
Dean