Re: Parallel transfers with sftp (call for testing / advice)


 



On 06/05/2020 at 06:21, David Newall wrote:
> Did anything happen after https://daniel.haxx.se/blog/2010/12/08/making-sftp-transfers-fast/? I suspect it did, because we do now allow multiple outstanding packets, as well as specifying the buffer size.
>
> Daniel explained the process that SFTP uses quite clearly, such that I'm not sure why re-assembly is an issue. He explained that each transfer already specifies the offset within the file. It seems reasonable that multiple writers would each write to the same file at their various offsets. It relies on the target supporting sparse files, but supercomputers only ever run Linux ;-) which does do the right thing.
You are right, reassembly is not an issue as long as you have sparse file support, which is our case with Linux :)
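
To illustrate the reassembly idea (this is just a sketch, not code from the patch): each worker writes its chunk at the chunk's absolute file offset, and sparse-file support on the target makes out-of-order arrival harmless. The function name and worker count are made up for the example.

```python
import os
from concurrent.futures import ThreadPoolExecutor

def parallel_reassemble(path, chunks, workers=10):
    # chunks: iterable of (offset, data) pairs, possibly out of order.
    # os.pwrite writes at an absolute offset without moving a shared file
    # pointer, so concurrent workers never race on position; a write past
    # the current end of file simply leaves a sparse hole behind it.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        with ThreadPoolExecutor(max_workers=workers) as pool:
            for offset, data in chunks:
                pool.submit(os.pwrite, fd, data, offset)
        # leaving the `with` block waits for all pending writes
    finally:
        os.close(fd)
```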

> The original patch which we are discussing seemed more concerned with being able to connect to multiple IP addresses, rather than multiple connections between the same pair of machines. The issue, as I understand it, is that the supercomputer has slow NICs, so adding multiple NICs allows greater network bandwidth. This, I think, is the problem to be solved; not re-assembly, just sending to what appear to be multiple different hosts (i.e. IP addresses).

No, the primary goal of the patch is to achieve this between two endpoints with a single NIC each, the NIC being 10GbE or faster.

Here is an example with roughly the same results for a single destination/IP:


# With the patched sftp and 1+10 parallel SSH connections

[me@france openssh-portable]$ ./sftp -n 10 germany0
Connected main channel to germany0 (1.2.3.96).
Connected channel 1 to germany0 (1.2.3.96).
Connected channel 2 to germany0 (1.2.3.96).
Connected channel 3 to germany0 (1.2.3.96).
Connected channel 4 to germany0 (1.2.3.96).
Connected channel 5 to germany0 (1.2.3.96).
Connected channel 6 to germany0 (1.2.3.96).
Connected channel 7 to germany0 (1.2.3.96).
Connected channel 8 to germany0 (1.2.3.96).
Connected channel 9 to germany0 (1.2.3.96).
Connected channel 10 to germany0 (1.2.3.96).
sftp>  get 5g 5g.bis
Fetching /files/5g to 5g.bis
/files/5g 100% 5120MB 706.7MB/s   00:07
sftp> put 5g.bis

Uploading 5g.bis to /files/5g.bis
5g.bis 100% 5120MB 664.0MB/s   00:07
sftp>


# With the legacy sftp:

[me@france openssh-portable]$ sftp germany0

sftp> get 5g 5g.bis
Fetching /files/5g to 5g.bis
/p/scratch/chpsadm/files/5g 100% 5120MB  82.8MB/s   01:01
sftp> put 5g.bis
Uploading 5g.bis to /files/5g.bis
5g.bis 100% 5120MB  67.0MB/s   01:16
sftp>


# With scp:

[me@france openssh-portable]$ scp 5g germany0:/files/5g.bis
5g 100% 5120MB  83.1MB/s   01:01


# With rsync:

[me@france openssh-portable]$ rsync -v 5g germany0:/files/5g.bis

5g

sent 5,370,019,908 bytes  received 35 bytes  85,920,319.09 bytes/sec
total size is 5,368,709,120  speedup is 1.00



> I was curious to know why a supercomputer would have issues receiving at high bandwidth via a single NIC, while the sending machine has no such performance issue; but that's an aside.

Supercomputers commonly offer multiple "login nodes" and a generic DNS entry to connect to one of them randomly: the DNS entry is associated with multiple IP addresses and the client (DNS resolver) selects one of them.

Other DNS entries may exist to address a particular login node, in case you want to reach a specific one.

When used with Cyril's patched sftp, this logic means you automatically target multiple hosts if you use the generic DNS entry (as in Cyril's first performance results). If you select a particular host's DNS entry (as in this example), then you will contact that single host only.
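
For illustration only (not part of the patch, and the function names are made up), the client-side selection behind such a generic DNS entry can be sketched like this: resolve every address the name maps to, then pick one per connection.

```python
import random
import socket

def resolve_all(host, port=22):
    # Return every address the (round-robin) DNS entry maps to; a generic
    # login-node name would typically yield several login nodes here.
    infos = socket.getaddrinfo(host, port, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})

def pick_address(addresses, rng=random):
    # Each new SSH connection may land on a different login node.
    return rng.choice(addresses)
```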

On supercomputers, files are commonly stored on distributed file systems such as NFS, Lustre, or GPFS. If your transfers target one of those file systems, you can use multiple hosts as destinations without any issue. You just need to ensure that the blocks sftp sends and writes are properly sized, so that one target does not overwrite data written by another because of the file system client implementations and the asynchrony of the page cache flushes on the involved nodes. That is what is done in the patch: as Cyril explained in a previous message, the block size used for parallel transfers was selected with that potential issue in mind.
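
As a hypothetical sketch of that sizing concern (the names, the 1 MiB block size, and the splitting policy are assumptions, not the patch's actual values): give each writer a contiguous range whose start falls on a file-system block boundary, so no two clients ever dirty the same underlying block from different nodes.

```python
# Assumed file-system block size; real values depend on the FS
# (e.g. the Lustre stripe size or the GPFS block size).
FS_BLOCK = 1 << 20

def aligned_chunks(total_size, n_workers):
    # Split [0, total_size) into contiguous, non-overlapping ranges whose
    # boundaries land on FS_BLOCK multiples (except possibly the last end),
    # so concurrent writers from different nodes never share a block.
    per_worker = -(-total_size // n_workers)            # ceil division
    per_worker = -(-per_worker // FS_BLOCK) * FS_BLOCK  # round up to a boundary
    ranges = []
    start = 0
    while start < total_size:
        end = min(start + per_worker, total_size)
        ranges.append((start, end))
        start = end
    return ranges
```

With this rounding, fewer ranges than `n_workers` may be produced for small files; that is the safe direction, since it never splits a block between writers.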


Regards,

Matthieu


_______________________________________________
openssh-unix-dev mailing list
openssh-unix-dev@xxxxxxxxxxx
https://lists.mindrot.org/mailman/listinfo/openssh-unix-dev

