Here are the new numbers using the latest SSC code for an 8GB copy. The code has a delayed unmount on the destination server, which allows for a single mount when multiple COPY calls are made back to back. There is also a third option, which uses an ioctl with a 64-bit copy length in order to issue a single call for copy lengths >= 4GB.

Setup:
    Client: 16 CPUs, 32GB
    SRC server: 4 CPUs, 8GB
    DST server: 4 CPUs, 8GB

Traditional copy:
    DBG2: 20:31:43.683595 - Traditional COPY returns 8589934590 (96.8432810307 seconds)
SSC (2 copy_file_range calls back to back):
    DBG2: 20:30:00.268203 - Server-side COPY returns 8589934590 (83.0517759323 seconds)
    PASS: SSC should outperform traditional copy, performance improvement for a 8GB file: 16%
SSC (2 copy_file_range calls in parallel):
    DBG2: 20:34:49.686573 - Server-side COPY returns 8589934590 (79.3080010414 seconds)
    PASS: SSC should outperform traditional copy, performance improvement for a 8GB file: 20%
SSC (1 ioctl call):
    DBG2: 20:38:41.323774 - Server-side COPY returns 8589934590 (74.7774350643 seconds)
    PASS: SSC should outperform traditional copy, performance improvement for a 8GB file: 28%

Since I don't have three similar systems to test with, having the best machine (more CPUs and more memory) as the client gives better performance for the traditional copy. The following results were obtained using the best machine as the destination server instead.
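A note on the call counts in the results above: the returned byte count 8589934590 equals 2 x (2^32 - 1), which is consistent with each copy_file_range call being limited to a 32-bit length, while a 64-bit length covers the whole file in one call. The following is only a minimal sketch of that splitting, assuming a per-call cap of 2^32 - 1 bytes (inferred from the byte count, not confirmed by the test code). The names `split_copy` and `MAX_CALL_LEN` are hypothetical and do not come from the SSC code.

```python
# Assumed per-call cap: largest value that fits in an unsigned 32-bit length.
MAX_CALL_LEN = 2**32 - 1


def split_copy(total, max_len=MAX_CALL_LEN):
    """Return the (offset, length) pairs needed to copy `total` bytes
    when a single call can copy at most `max_len` bytes."""
    chunks = []
    offset = 0
    while offset < total:
        length = min(max_len, total - offset)
        chunks.append((offset, length))
        offset += length
    return chunks


# With a 32-bit length the copy needs two calls.
two_calls = split_copy(8589934590)

# With a 64-bit length the same copy fits in a single call.
one_call = split_copy(8589934590, max_len=2**64 - 1)
```

Under these assumptions `two_calls` holds two ranges of 4294967295 bytes each and `one_call` holds a single range covering the full 8589934590 bytes.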
Setup (using the best machine as the destination server instead):
    Client: 4 CPUs, 8GB
    SRC server: 4 CPUs, 8GB
    DST server: 16 CPUs, 32GB

Traditional copy:
    DBG2: 21:52:15.039625 - Traditional COPY returns 8589934590 (178.686635971 seconds)
SSC (2 copy_file_range calls back to back):
    DBG2: 21:49:08.961384 - Server-side COPY returns 8589934590 (173.071172953 seconds)
    PASS: SSC should outperform traditional copy, performance improvement for a 8GB file: 3%
SSC (2 copy_file_range calls in parallel):
    DBG2: 21:35:59.822467 - Server-side COPY returns 8589934590 (159.743849993 seconds)
    PASS: SSC should outperform traditional copy, performance improvement for a 8GB file: 18%
SSC (1 ioctl call):
    DBG2: 21:28:33.461528 - Server-side COPY returns 8589934590 (83.9983980656 seconds)
    PASS: SSC should outperform traditional copy, performance improvement for a 8GB file: 119%

As you can see, a single 8GB copy (ioctl with a 64-bit copy length) performs roughly the same as before (about 80 seconds), but in this case the traditional copy takes a lot longer.

--Jorge

On 4/18/17, 12:33 PM, "linux-nfs-owner@xxxxxxxxxxxxxxx on behalf of J. Bruce Fields" <linux-nfs-owner@xxxxxxxxxxxxxxx on behalf of bfields@xxxxxxxxxxxx> wrote:

    On Tue, Apr 18, 2017 at 01:28:39PM -0400, Olga Kornievskaia wrote:
    > Given how the code is written now it looks like it's not possible to
    > save up commits....
    >
    > Here's what I can see happening:
    >
    > nfs42_proc_clone() as well as nfs42_proc_copy() will call
    > nfs_sync_inode(dst) "to make sure server(s) have the latest data"
    > prior to initiating the clone/copy. So even if we just queue up (not
    > send) the commit after the executing nfs42_proc_copy, then next call
    > into vfs_copy_file_range() will send out that queued up commit.
    >
    > Is it ok to relax that requirement, I'm not sure...

    Well, if the typical case of copy_file_range is just opening a file,
    doing a single big copy_file_range(), then closing the file, then this
    doesn't matter.

    The linux server is currently limiting COPY to 4MB at a time, which
    will make the commits more annoying.

    Even there the typical case will probably still be an open, followed by
    a series of non-overlapping copies, then close, and that shouldn't
    require the commits.

    --b.