Trond Myklebust wrote:
On Wed, 2008-08-20 at 14:14 +1000, Shehjar Tikoo wrote:
If I understand it correctly, there are three points at which
the Linux NFS client sends NFS write requests:
1. Inside nfs_flush_incompatible(), where it needs to send the
writes as stable because the pages are required for a new write
request from an application. I think this happens only under high
memory pressure.
2. Inside nfs_file_write(), when nfs_do_fsync() is called if the
file was opened with O_SYNC.
3. When the file is closed, any remaining writes are flushed out
as unstable and then the final commit is sent.
In some of the tests I am running, I see a drastic fall in write
throughput between a record size (i.e. the size of the buffer
handed to the write() syscall) of 32 Kbytes and record sizes of,
say, 50 Mbytes and 100 Mbytes. This fall is seen for NFS wsize
values of 32k, 64k and 1M, and with tcp_slot_table_entries
values of 16, 64, 96 and 128. The test files are opened without
O_SYNC over a sync-mounted NFS. The client is a big machine with
16 logical processors and 16 GB of RAM.
I suspect that the fall happens because the NFS client stack
sends all the NFS writes as unstable until the file gets closed,
at which point it sends the final commit request. Since the
write() record sizes are pretty big, the final commit takes
extraordinarily long for the whole 100 Mbytes to commit at the
server, resulting in lower aggregate throughput.
Is this understanding correct?
Can this behaviour be modified so that the client uses its
knowledge of the write() buffer size to initiate writeback
before the full 100 Mbytes needs to be committed to the server
in one go?
You fail to mention which kernels you are using for your testing,
but in most recent kernels you should be able to adjust the pdflush
background write rates using the tunables in /proc/sys/vm.
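
For reference, the writeback tunables can be inspected with a short
script. This is only a minimal sketch: the four file names below are
the dirty-page knobs one would expect to find under /proc/sys/vm, which
is an assumption and not something stated in this thread, and the
helper name read_vm_tunables is hypothetical. Files that are missing or
do not hold an integer are skipped.

```python
import os

# Assumed tunable names; they are not taken from the thread.
VM_TUNABLES = (
    "dirty_background_ratio",
    "dirty_ratio",
    "dirty_expire_centisecs",
    "dirty_writeback_centisecs",
)


def read_vm_tunables(base="/proc/sys/vm", names=VM_TUNABLES):
    """Return {name: int} for each tunable file that exists and parses."""
    values = {}
    for name in names:
        path = os.path.join(base, name)
        try:
            with open(path) as f:
                values[name] = int(f.read().strip())
        except (OSError, ValueError):
            # Not present on this kernel, unreadable, or not an integer.
            continue
    return values


if __name__ == "__main__":
    for name, value in sorted(read_vm_tunables().items()):
        print(f"{name} = {value}")
```

Note that these settings are system-wide, so changing them affects
writeback for every filesystem on the client, not just the NFS mount.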
The server is running 2.6.26 and the client is running 2.6.27-rc3.

By changing the pdflush settings on the client, I'd be changing the
settings for the whole system. Is there a procfs entry or any other
config parameter that lets me lower the number of write requests
buffered at the client before the commit request is sent?
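
Absent such a knob, one application-side approach would be to split
the large write() into smaller pieces and call fsync() at intervals, so
that data is pushed out and committed incrementally instead of all at
close. The following is a minimal sketch, assuming the test program can
be changed; write_with_periodic_fsync and its parameters are
hypothetical names, and whether this actually improves throughput on
this setup is untested.

```python
import os


def write_with_periodic_fsync(fd, data, chunk_size=32 * 1024,
                              sync_every=4 * 1024 * 1024):
    """Write data to fd in chunks, calling fsync() every sync_every bytes.

    Returns the number of fsync() calls made.
    """
    view = memoryview(data)
    offset = 0      # bytes written so far
    unsynced = 0    # bytes written since the last fsync()
    syncs = 0
    while offset < len(view):
        # os.write() may write fewer bytes than requested, so advance
        # by the count it reports.
        written = os.write(fd, view[offset:offset + chunk_size])
        offset += written
        unsynced += written
        if unsynced >= sync_every:
            os.fsync(fd)
            syncs += 1
            unsynced = 0
    if unsynced:
        # Flush whatever is left after the last full interval.
        os.fsync(fd)
        syncs += 1
    return syncs
```

The sync_every value trades commit latency against the number of
round trips, so it would need tuning against the wsize in use.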
Thanks
Shehjar