On 09/04/2009 03:49 PM, Trond Myklebust wrote:
> On Fri, 2009-09-04 at 15:30 -0700, Ben Greear wrote:
>> I was thinking that the kernel might take the data received in the skbs from
>> the file server and send it to /dev/null, i.e. just immediately
>> discard the received data. If it could do that, it would be a zero-copy
>> read: the only copying would be the NIC DMA'ing the packet into the skb.
> No... The RPC layer will always copy the data from the socket into a
> buffer. If you are using O_DIRECT reads, then that buffer will be the
> same one that you supplied in userland (the kernel just uses page table
> trickery to map those pages into the kernel address space). If you are
> using any other type of read (even if it is being piped using sendfile()
> or splice()) then it will copy that data into the NFS filesystem's page
> cache.
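
For reference, a minimal sketch of what an O_DIRECT read looks like from userland, i.e. the case where the buffer the application supplies is the one the data lands in. The names `round_up` and `direct_read` and the 4096-byte alignment are assumptions for illustration; the real alignment requirement depends on the filesystem and kernel, and the offset passed in must be suitably aligned as well.

```python
import mmap
import os

BLOCK = 4096  # assumed alignment; the actual requirement varies by filesystem


def round_up(n, align=BLOCK):
    """Round n up to the next multiple of align."""
    return (n + align - 1) // align * align


def direct_read(path, nbytes, offset=0):
    """Read up to nbytes from path using O_DIRECT (Linux only).

    O_DIRECT needs an aligned buffer, so an anonymous mmap is used:
    anonymous mappings always start on a page boundary.
    """
    size = round_up(nbytes)
    buf = mmap.mmap(-1, size)
    fd = os.open(path, os.O_RDONLY | os.O_DIRECT)
    try:
        # preadv fills the supplied buffer in place and returns the byte count.
        got = os.preadv(fd, [buf], offset)
        return buf[:min(got, nbytes)]
    finally:
        os.close(fd)
        buf.close()
```

Calling `direct_read("/mnt/nfs/somefile", 1 << 20)` (a hypothetical path) would issue a single 1 MiB direct read. Filesystems that do not support O_DIRECT fail the `os.open` call with `OSError`.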
OK, I think I understand that better now. It seems like one could have
RPC use a list of skbs as its data store instead of copying the data,
but perhaps that would be optimizing for something no one would
ever really want in the real world.
>> Out of curiosity, does anyone have any benchmarks for NFS on 10G hardware?
> I'm not aware of any public figures. I'd be interested to hear how you
> max out.
>> Based on testing against another vendor's NFS server, it seems that the client
>> is losing packets (the server shows TCP retransmits).
> Is the data being lost at the client, the switch, or the server? Assuming
> that you are using a managed switch, then a look at its statistics
> should be able to answer that question.
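
On the hosts themselves, one way to see which side is retransmitting is to snapshot the TCP counters in /proc/net/snmp before and after a test run on each machine. The following is only a sketch: `parse_snmp`, `snapshot`, and `retransmit_ratio` are hypothetical helper names, and the parser assumes the usual layout of a header line followed by a value line for each protocol.

```python
def parse_snmp(text):
    """Parse /proc/net/snmp-style text into {protocol: {counter: int}}."""
    tables = {}
    rows = [line.split() for line in text.splitlines() if line.strip()]
    # Rows come in pairs: counter names first, then the matching values.
    for names, values in zip(rows[0::2], rows[1::2]):
        proto = names[0].rstrip(":")
        tables[proto] = {k: int(v) for k, v in zip(names[1:], values[1:])}
    return tables


def snapshot(path="/proc/net/snmp"):
    """Read and parse the current counters (Linux only)."""
    with open(path) as f:
        return parse_snmp(f.read())


def retransmit_ratio(before, after):
    """Retransmitted segments per segment sent between two snapshots.

    OutSegs does not count pure retransmissions, so this is the number
    of retransmits relative to original segments, not a strict fraction.
    """
    sent = after["Tcp"]["OutSegs"] - before["Tcp"]["OutSegs"]
    retrans = after["Tcp"]["RetransSegs"] - before["Tcp"]["RetransSegs"]
    return retrans / sent if sent else 0.0
```

Taking `snapshot()` on both client and server around the same transfer and comparing the two ratios shows which end is resending. It does not by itself show where on the path the packets were dropped.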
At least for my local Linux-to-Linux tests, I'm using just a fibre optic
cable to connect the two machines, so it is definitely not a switch problem here. Neither NIC
reports any obvious errors, and pktgen tests show that they can easily sustain
9Gbps. I need to look more closely at the netstat
counters and such. I suspect my network buffers may be too small. I last
set up their defaults when a 1GB RAM system was 'high end', and now
I'm using 12GB systems :P
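
A rough way to sanity-check the buffer suspicion is to compare the bandwidth-delay product of the link with the maximum in /proc/sys/net/ipv4/tcp_rmem. This is a sketch under assumptions: the helper names are made up for illustration, the 200 microsecond round-trip time in the usage note is a guess rather than a measurement, and the buffers the NFS client actually ends up with may be sized differently.

```python
def bdp_bytes(bits_per_sec, rtt_usec):
    """Bandwidth-delay product: bytes in flight needed to keep the link full.

    Integer arithmetic avoids floating-point rounding surprises.
    """
    return bits_per_sec * rtt_usec // 8_000_000


def parse_tcp_mem(text):
    """Parse a tcp_rmem/tcp_wmem line ('min default max', in bytes)."""
    lo, default, hi = (int(field) for field in text.split())
    return {"min": lo, "default": default, "max": hi}


def buffer_is_sufficient(tcp_mem_text, bits_per_sec, rtt_usec):
    """True if the configured maximum covers the bandwidth-delay product."""
    limit = parse_tcp_mem(tcp_mem_text)["max"]
    return limit >= bdp_bytes(bits_per_sec, rtt_usec)
```

With an assumed 200 microsecond round trip, `bdp_bytes(10_000_000_000, 200)` gives 250000 bytes. A tcp_rmem line such as `4096 87380 174760` would fall short of that, while one with a 4 MB maximum would not.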
Thanks,
Ben
--
Ben Greear <greearb@xxxxxxxxxxxxxxx>
Candela Technologies Inc http://www.candelatech.com