On Apr 14, 2009, at 8:48 PM, Simon Kirby wrote:
> Hello!
>
> My usual workflow is to download pictures from a flash card (@ 15 MB/s
> or so) and write them over 100 Mbps Ethernet (@ 12 MB/s or so). One
> would expect and hope that the reading and the writing could happen
> simultaneously to optimize throughput, but the current behaviour on
> both NFSv3 and NFSv4 is as follows:
> multiple files loop (copying with "cp"):
>     open source, dest
>     data copy loop:
>         read(source)
>         write(dest)
>     close(source)
>     close(dest)
> The inner loop runs at about the rate of the flash card reader all the
> way up to my picture size (12-25 MB). Then, on close(), rpciod / the
> NFS client flushes all of the data over the network, at the rate the
> network can sustain.
> Overall throughput is therefore about 1/(1/12 + 1/15) = 6.67 MB/s,
> which is not very exciting.
> I find that replacing "cp" with "dd ... bs=131072 oflag=dsync" lets me
> copy at near network speed, at the expense of slowing down copying to
> a local hard drive should I choose to do that instead. It seems more
> of a workaround than a solution (it is very sensitive to block size
> and still slower than network speed).
> Is there any way to convince NFS (or buffer flushing) to start sooner
> in this case -- preferably once there are at least wsize bytes
> available to write? Is there any downside to doing this?
The VM/VFS and the NFS client both delay writes aggressively. A page
cache flush is forced by the close(2) call, but the client will hold
onto dirty data until the last possible moment. It's kind of a
system-wide policy, and yes, we know it's not so good for NFS.
There are some VM sysctls that can tune down the maximum amount of
dirty data allowed to be outstanding. Have a look at
/proc/sys/vm/dirty_ratio and /proc/sys/vm/dirty_background_ratio. The
problem with these is that a) they are system-wide, so the settings
affect all of your file systems, and b) they are ratios, so I don't
think you can tune them to flush files smaller than 1% of your system's
physical RAM. On a system with one gigabyte, that means you are still
caching about 10 MB before starting to flush. I'm guessing your flash
files are smaller than that.
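
To get a rough feel for what the current setting means on a given box,
something like the sketch below (purely illustrative, not an existing
tool) reads dirty_background_ratio and prints an approximate
background-flush threshold. Note the kernel actually computes the
threshold against dirtyable memory rather than total RAM, so treat the
number as an estimate.

#include <stdio.h>
#include <sys/sysinfo.h>

int main(void)
{
	struct sysinfo si;
	int ratio;
	FILE *f = fopen("/proc/sys/vm/dirty_background_ratio", "r");

	if (f == NULL || fscanf(f, "%d", &ratio) != 1) {
		perror("dirty_background_ratio");
		return 1;
	}
	fclose(f);
	sysinfo(&si);

	/* e.g. 1 GB of RAM at a 1% ratio => roughly 10 MB of dirty
	 * data cached before background writeback begins */
	printf("background flush starts after ~%llu MB of dirty data\n",
	       (unsigned long long)si.totalram * si.mem_unit
	       / 100 * ratio / (1024 * 1024));
	return 0;
}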
Another solution is to change your application. Calling
sync_file_range(2) in asynchronous mode every so often in your loop
might be sufficient to kick the VM into flushing the data sooner.
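
A minimal sketch of that idea follows; the 4 MB flush interval, the
buffer size, and the helper name are arbitrary choices for
illustration, not anything prescribed, and error handling is trimmed:

#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

#define BUF_SIZE       (128 * 1024)
#define FLUSH_INTERVAL (4 * 1024 * 1024)  /* arbitrary writeback step */

static int copy_file(const char *src, const char *dst)
{
	static char buf[BUF_SIZE];
	off_t written = 0, flushed = 0;
	ssize_t n;
	int in = open(src, O_RDONLY);
	int out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0644);

	if (in < 0 || out < 0) {
		perror("open");
		return -1;
	}
	while ((n = read(in, buf, sizeof(buf))) > 0) {
		if (write(out, buf, n) != n) {
			perror("write");
			return -1;
		}
		written += n;
		if (written - flushed >= FLUSH_INTERVAL) {
			/* start asynchronous writeback of the dirty
			 * range; SYNC_FILE_RANGE_WRITE queues the I/O
			 * without waiting for it to complete */
			sync_file_range(out, flushed, written - flushed,
					SYNC_FILE_RANGE_WRITE);
			flushed = written;
		}
	}
	close(in);
	close(out);
	return n < 0 ? -1 : 0;
}

int main(int argc, char **argv)
{
	if (argc != 3) {
		fprintf(stderr, "usage: %s <src> <dst>\n", argv[0]);
		return 1;
	}
	return copy_file(argv[1], argv[2]) ? 1 : 0;
}

Flushing whenever at least wsize bytes are dirty, as you suggest above,
would be the natural way to pick the interval.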
> Other than some special-case handling for deleting a temporary file
> before closing it (does that even work?), I don't see how the current
> behaviour helps performance in _any_ case, even when copying from
> fast media.
> I looked around the NFS man pages, /proc, and /sys and didn't see
> anything that might be helpful, but I am interested to find out how
> things came to arrive at this implementation.
> Cheers!
> Simon-
--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html