On 4/7/21 10:41 AM, Ron Frederick wrote:
> That said, is the SCP implementation in OpenSSH currently doing any file-level parallelization? I wouldn’t expect it to, so I’m not sure that would explain the performance difference. If I had to guess, it’s more likely due to the fact that there’s a single round-trip with SCP for each file transfer, whereas SFTP involves separate requests to do an open(), read(), stat(), etc. each of which has its own round-trip. Some of those (such as the read() calls) are parallelized, but you still have to pay for the open() before beginning the reads, and possibly for other things like stat() when preserving attributes.
No parallelization at all. I've thought about it, but I'll have to come back to it when I have time; there are other deliverables for this project I need to focus on first. As for the number of round trips (RTs): there are a couple of message round trips, but nothing significant. The resume feature increases the number of RTs, but it's still faster.
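To put rough numbers on the round-trip argument, here is a minimal back-of-the-envelope sketch. It assumes files are sent strictly one after another, that each file pays a fixed number of serialized round trips before or after its data, and that the data itself moves at full link speed. The function name, parameters, and sample values are hypothetical and purely illustrative; this is not a measurement of OpenSSH.

```python
def transfer_time_s(num_files, file_bytes, rtt_s, link_bps, rtts_per_file):
    """Estimate total seconds for a sequential multi-file transfer.

    Simplifying assumptions (not taken from OpenSSH itself):
    - files are transferred strictly one at a time
    - each file costs `rtts_per_file` serialized round trips of overhead
    - file data moves at the full link rate with no stalls
    """
    payload_s = num_files * file_bytes * 8 / link_bps
    overhead_s = num_files * rtts_per_file * rtt_s
    return payload_s + overhead_s


if __name__ == "__main__":
    # Hypothetical workload: 1000 files of 1 MB, 50 ms RTT, 1 Gbit/s link.
    for rtts in (1, 3):
        t = transfer_time_s(1000, 1_000_000, 0.05, 1e9, rtts)
        print(f"{rtts} RT(s) per file: {t:.1f} s")
```

Under these assumptions the per-file round trips only dominate when there are many small files on a high-latency path; for a few very large files the payload term swamps them, which is consistent with looking elsewhere (such as pipeline stalls) for the difference.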
I absolutely agree with Damien about the pipeline stalling being the major factor. Anyway, I've been looking at learning more about pipelining. :)
In some cases there *might* be an issue with hitting the outstanding message request limit, but that's not what's happening here. I really do want to take a closer look at this, especially if SCP is going to default to the SFTP protocol soon. In the high performance computing community we do have faster transport tools like GridFTP and Aspera, but they have some serious barriers to entry for a lot of users. SCP is still widely used for transferring large data sets (people moving TBs of data via SCP isn't uncommon where I work), so performance in those environments is a concern of mine.
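For reference, the way an outstanding request limit could cap throughput can be sketched as a simple window calculation: if at most N read requests of B bytes may be in flight at once, no more than N * B bytes can arrive per round trip. The sketch below assumes exactly that model; the function name and the sample values (64 requests of 32768 bytes) are illustrative assumptions, not settings confirmed for the transfers discussed here.

```python
def windowed_throughput_bps(outstanding_requests, request_bytes, rtt_s, link_bps):
    """Upper bound on throughput when in-flight requests are limited.

    Assumes a simple window model: at most `outstanding_requests` requests
    of `request_bytes` each can be unanswered at any moment, so at most one
    full window of data is delivered per round trip.
    """
    window_bps = outstanding_requests * request_bytes * 8 / rtt_s
    return min(window_bps, link_bps)


if __name__ == "__main__":
    # Illustrative values only: 64 requests x 32 KiB on a 10 Gbit/s link.
    for rtt_ms in (1, 10, 100):
        bps = windowed_throughput_bps(64, 32768, rtt_ms / 1000, 10e9)
        print(f"RTT {rtt_ms:>3} ms: at most {bps / 1e6:.0f} Mbit/s")
```

In this model the cap only matters on long, fast paths; on a low-latency link the window covers the bandwidth-delay product and the link itself is the limit.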
_______________________________________________
openssh-unix-dev mailing list
openssh-unix-dev@xxxxxxxxxxx
https://lists.mindrot.org/mailman/listinfo/openssh-unix-dev