On Apr 6, 2021, at 10:47 PM, Damien Miller <djm@xxxxxxxxxxx> wrote:
> On Tue, 6 Apr 2021, rapier wrote:
>> On 4/6/21 10:04 PM, Damien Miller wrote:
>>> On Tue, 6 Apr 2021, rapier wrote:
>>>
>>>> Looking at the performance - on my systems sftp seems to be a bit slower
>>>> than scp when dealing with a lot of small files. Not sure why this is
>>>> the case as I haven't looked at the sftp code in years.
>>>
>>> the OpenSSH sftp client doesn't do inter-file pipelining - it only
>>> pipelines read/writes within a transfer, so each new file causes a
>>> stall.
>>>
>>> This is all completely fixable on the client side, and shouldn't apply
>>> to things like sshfs at all.
>>
>> Gotcha. Is this because of how it sequentially loops through the readdirs in
>> two _dir_internal functions?
>
> Only partly - the client will do SSH2_FXP_READDIR to get the full list of
> files and then transfer each file separately. The SSH2_FXP_READDIR requests
> are not pipelined at all, and there is no pipelining between obtaining the
> file list and the file transfers. Finally, each file transfer incurs a
> pipeline stall upon completion.

The good news here is that, from a protocol standpoint, a server can already break up a READDIR response into multiple chunks. So, while there will still be a stall between READDIR calls on a directory with a very large number of files, a client can start to pipeline the transfers of those files (or recursive READDIR calls for subdirectories) without waiting for the entire listing of the parent directory to be returned, once some mechanism is in place to manage that work. To avoid overwhelming the server, you'll probably want to put a cap on the number of simultaneous requests to any given server, but that can all be managed in the client.
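As a rough illustration of that structure (this is just a sketch, not OpenSSH or AsyncSSH code — the chunked directory listing and the transfers are simulated with hypothetical stand-in functions), the idea is to start transfers as READDIR entries arrive, with a semaphore capping the number of in-flight requests:

```python
import asyncio

# Hypothetical stand-ins for the SFTP operations; a real client would be
# issuing SSH2_FXP_READDIR and SSH2_FXP_OPEN/READ/CLOSE requests on the wire.
async def read_dir_chunks(path):
    """Yield directory entries chunk by chunk, the way a server may split
    a large READDIR listing across several replies."""
    chunks = [["a.txt", "b.txt"], ["c.txt", "d.txt"], ["e.txt"]]
    for chunk in chunks:
        await asyncio.sleep(0)          # simulated READDIR round trip
        for name in chunk:
            yield f"{path}/{name}"

async def transfer_file(path):
    await asyncio.sleep(0)              # simulated open/read/close round trips
    return path

async def transfer_dir(path, max_requests=16):
    """Begin each transfer as soon as its entry arrives, rather than
    waiting for the full listing; cap in-flight work with a semaphore."""
    sem = asyncio.Semaphore(max_requests)

    async def worker(entry):
        async with sem:                 # enforce the per-server request cap
            return await transfer_file(entry)

    tasks = []
    async for entry in read_dir_chunks(path):
        tasks.append(asyncio.create_task(worker(entry)))
    return await asyncio.gather(*tasks)

print(asyncio.run(transfer_dir("/remote/dir")))
```

The same pattern covers recursive descent: a worker that encounters a subdirectory just queues further READDIR/transfer tasks under the same semaphore.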
In AsyncSSH, I implemented a scandir() call that returns an async iterator of remote directory entries from READDIR, which begins yielding results even before the full list of file names in a directory has been returned. I then used that to implement an rmtree() call on the client which parallelizes recursive deletion of a remote directory tree, and saw a significant speedup on trees with a large number of files/subdirectories. I haven't yet updated my recursive file transfer client code to leverage this, since there was already a good amount of parallelism on the transfers themselves, but perhaps I'll look into doing this next. With a large number of very small files, I would expect to see some benefit from that.

That said, is the SCP implementation in OpenSSH currently doing any file-level parallelization? I wouldn't expect it to, so I'm not sure that would explain the performance difference. If I had to guess, it's more likely due to the fact that there's a single round trip with SCP for each file transfer, whereas SFTP involves separate requests to do an open(), read(), stat(), etc., each of which has its own round trip. Some of those (such as the read() calls) are parallelized, but you still have to pay for the open() before beginning the reads, and possibly for other things like stat() when preserving attributes.

-- 
Ron Frederick
ronf@xxxxxxxxxxxxx

_______________________________________________
openssh-unix-dev mailing list
openssh-unix-dev@xxxxxxxxxxx
https://lists.mindrot.org/mailman/listinfo/openssh-unix-dev