Re: Best A->B large file copy performance

On Thu, 2009-03-12 at 17:00 -0400, Jim Callahan wrote:
> I'm trying to determine the best way to have a single NFS client 
> copy large numbers (100-1000) of fairly large (1-50M) files from one 
> location on a file server to another location on the same file server.  
> There seem to be several API layers which influence this:
> 
> 1. Number of OS level processes performing the copy in parallel.
> 2. Record size used by the C-library read()/write() calls from these 
> processes.
> 3. NFS client rsize/wsize settings.
> 4. Ethernet MTU size.
> 5. Bandwidth of the ethernet network and switches.
> 
> So far we've played around with larger MTU and rsize/wsize settings 
> without seeing a huge difference.  Since we have been using "cp" to 
> perform (1), we've not tweaked the record size at all at this point.  
> My suspicion is that we should be carefully coordinating the sizes 
> specified for layers 2, 3 and 4.  Perhaps we should be using "dd" 
> instead of "cp" so we can control the record size being used.  Since 
> the number of permutations of these three settings is large, I was 
> hoping that I might get some advice from this list about a range of 
> values we should be investigating, and any unpleasant interactions 
> between these levels of settings we should be aware of, to narrow our 
> search.  Also, if there are other major factors outside those listed I'd 
> appreciate being pointed in the right direction.
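
On the record size point above: at the syscall level, all that dd's "bs="
option (or cp's internal buffer) controls is how much data each
read()/write() pair moves. A minimal sketch of such a copy loop follows; the
1 MiB record size is only an illustrative starting point to tune against
your rsize/wsize, not a tested recommendation.

/* copy_record.c: plain buffered copy with an explicit record size,
 * i.e. roughly what "dd bs=1M if=SRC of=DST" does under the hood. */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define RECORD_SIZE (1024 * 1024)   /* tune this against rsize/wsize */

int main(int argc, char **argv)
{
    if (argc != 3) {
        fprintf(stderr, "usage: %s <src> <dst>\n", argv[0]);
        return 1;
    }

    int in = open(argv[1], O_RDONLY);
    int out = open(argv[2], O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (in < 0 || out < 0) {
        perror("open");
        return 1;
    }

    char *buf = malloc(RECORD_SIZE);
    if (!buf) {
        perror("malloc");
        return 1;
    }

    ssize_t n;
    while ((n = read(in, buf, RECORD_SIZE)) > 0) {
        /* A production copy would also handle short writes and
         * signals; kept simple here. */
        if (write(out, buf, n) != n) {
            perror("write");
            return 1;
        }
    }
    if (n < 0)
        perror("read");

    free(buf);
    close(in);
    close(out);
    return n < 0 ? 1 : 0;
}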

MTU and rsize/wsize settings shouldn't matter much unless you're using
a UDP connection. I'd recommend just using the default r/wsize
negotiated by the client and server, and then whatever MTU is most
convenient for the other applications you may have.

Bandwidth and switch quality do matter (a lot). Particularly so if you
have many clients...

If you're just copying and not interested in using the file or its
contents afterwards, then you might consider using direct i/o instead of
ordinary cached i/o.
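
For reference, here is a rough sketch of what a direct i/o copy might look
like, assuming O_DIRECT is acceptable for your workload. The 4096-byte
alignment and 1 MiB block size are assumptions; the real alignment
requirements depend on the filesystem and kernel version.

/* direct_copy.c: copy using O_DIRECT to bypass the client page cache. */
#define _GNU_SOURCE                 /* for O_DIRECT */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define BLOCK_SIZE (1024 * 1024)    /* must remain a multiple of ALIGNMENT */
#define ALIGNMENT  4096

int main(int argc, char **argv)
{
    if (argc != 3) {
        fprintf(stderr, "usage: %s <src> <dst>\n", argv[0]);
        return 1;
    }

    int in = open(argv[1], O_RDONLY | O_DIRECT);
    int out = open(argv[2], O_WRONLY | O_CREAT | O_TRUNC | O_DIRECT, 0644);
    if (in < 0 || out < 0) {
        perror("open");
        return 1;
    }

    void *buf;
    if (posix_memalign(&buf, ALIGNMENT, BLOCK_SIZE)) {
        fprintf(stderr, "posix_memalign failed\n");
        return 1;
    }

    ssize_t n;
    while ((n = read(in, buf, BLOCK_SIZE)) > 0) {
        /* With O_DIRECT, a final partial block whose size is not a
         * multiple of the alignment may need a buffered fallback;
         * omitted here for brevity. */
        if (write(out, buf, n) != n) {
            perror("write");
            return 1;
        }
    }
    if (n < 0)
        perror("read");

    free(buf);
    close(in);
    close(out);
    return n < 0 ? 1 : 0;
}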

> While I'm on the subject, has there been any discussion about adding an 
> NFS request that would allow copying files from one location to another 
> on the same NFS server without requiring a round trip to the client?  It's 
> not at all uncommon to need to move data around in this manner and it 
> seems a huge waste of bandwidth to have to send all this data from the 
> server to the client just to have the client send the data back 
> unaltered to a different location.  Such a COPY request would be 
> high-level, along the lines of RENAME, and each server vendor could optimize 
> this for their particular hardware architecture.  For our particular 
> application, having such a request would make a huge difference in 
> performance.

I don't think anyone has talked about a server-to-server protocol, but I
believe there will be a proposal for file copy at the coming IETF
meeting. If you want server-to-server, then now is the time to speak up
and make the case. You'd probably want to start a thread on
nfsv4@xxxxxxxxxxx

Cheers
  Trond

--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html
