On Feb 8, 2019, at 4:56 PM, Steve French <smfrench@xxxxxxxxx> wrote:
>
> On Fri, Feb 8, 2019 at 5:03 PM Steve French <smfrench@xxxxxxxxx> wrote:
>>
>> On Fri, Feb 8, 2019 at 4:37 PM Andreas Dilger <adilger@xxxxxxxxx> wrote:
>>>
>>> On Feb 8, 2019, at 8:19 AM, Steve French <smfrench@xxxxxxxxx> wrote:
>>>>
>>>> Current Linux copy tools have various problems compared to other
>>>> platforms - small I/O sizes (and not even configurable for most),
>>>
>>> Hmm, this comment puzzles me, since "cp" already uses st_blksize
>>> returned for the file as the IO size?  Not sure if tar/rsync do
>>> the same, but if they don't already use st_blksize they should.
>
> I did some experiments changing the block size returned from 1K to 64K to 1MB
> and see no difference in the copy size used by cp (it was always 128K in all
> the cases when caching is disabled)

Strange.  I just re-tested this on Lustre, in case something had changed
in GNU coreutils that I didn't notice, and it worked fine for me with
both cp 8.4 on RHEL and cp 8.26 on Ubuntu:

    $ dd if=/dev/urandom of=/tmp/foo bs=1M count=12
    $ strace -v cp /tmp/foo /testfs/tmp
    :
    open("/tmp/foo", O_RDONLY) = 3
    fstat(3, {... st_blksize=4096, st_blocks=24576, st_size=12582912, ...}) = 0
    open("/testfs/tmp/foo", O_WRONLY|O_CREAT|O_EXCL, 0664) = 4
    fstat(4, { ... st_blksize=4194304, st_blocks=0, st_size=0, ...}) = 0
    read(3, "h\230#`\2\223\273\3423W\24\222:\2113w\327"..., 4194304) = 4194304
    write(4, "h\230#`\2\223\273\3423W\24\222:\2113w\327"..., 4194304) = 4194304
    :

Note that the "st_blksize=4194304" returned by Lustre for the target file
matches the read and write buffer size used by "cp".  The same is true if
the Lustre file is the source and not the target, so cp probably picks the
maximum of both:

    open("/testfs/tmp/foo", O_RDONLY) = 3
    fstat(3, {... st_blksize=4194304, st_blocks=24576, st_size=12582912 ...}) = 0
    open("/tmp/bar", O_WRONLY|O_TRUNC) = 4
    fstat(4, {... st_blksize=4096, st_blocks=0, st_size=0 ...}) = 0
    read(3, "h\230#`\2\223\273\3423W\24\222:\2113w\327"..., 4194304) = 4194304
    write(4, "h\230#`\2\223\273\3423W\24\222:\2113w\327"..., 4194304) = 4194304
    :

Running the same command with /tmp as both source and target uses a smaller
32768-byte buffer, and correspondingly more read/write calls:

    $ strace -v cp /tmp/foo /tmp/baz
    :
    open("/tmp/baz", O_WRONLY|O_CREAT|O_EXCL, 0664) = 4
    fstat(4, {... st_blksize=4096, st_blocks=0, st_size=0, ...}) = 0
    read(3, "h\230#`\2\223\273\3423W\24\222:\2113w\327"..., 32768) = 32768
    write(4, "h\230#`\2\223\273\3423W\24\222:\2113w\327"..., 32768) = 32768
    :

Here st_blksize is only 4096 on both sides, so cp probably has some minimum
buffer size that it uses to avoid the poor performance of 4KB reads and
writes.

Cheers, Andreas
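PS: for anyone who wants to see what a filesystem reports without going
through strace, below is a rough Python sketch of the behaviour guessed at
from the traces above: take the larger st_blksize of source and target, but
never go below some floor.  This is not taken from the coreutils source.  The
names guess_copy_bufsize(), copy_with_blksize() and ASSUMED_MIN_BUF are made
up for illustration, and the 32768 floor is only the value seen in the
/tmp to /tmp trace.

```python
import os

# Assumed floor, taken from the 32768-byte reads in the /tmp -> /tmp trace.
# It is a guess at cp's behaviour, not a constant read from coreutils.
ASSUMED_MIN_BUF = 32768


def guess_copy_bufsize(src_blksize, dst_blksize, floor=ASSUMED_MIN_BUF):
    """Return the larger of the two st_blksize values, never below floor."""
    return max(src_blksize, dst_blksize, floor)


def copy_with_blksize(src, dst):
    """Copy src to dst with a buffer sized from fstat() on both files.

    Returns the buffer size that was chosen, so it can be compared with
    the sizes that strace shows for the real cp.
    """
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        bufsize = guess_copy_bufsize(
            os.fstat(fin.fileno()).st_blksize,
            os.fstat(fout.fileno()).st_blksize,
        )
        while True:
            chunk = fin.read(bufsize)
            if not chunk:  # empty bytes object means end of file
                break
            fout.write(chunk)
    return bufsize
```

Calling copy_with_blksize("/tmp/foo", "/testfs/tmp/foo") and printing the
return value shows which size the target filesystem would steer a copy
towards under this assumption.  Python's own buffering sits between these
calls and the kernel, so the syscall sizes seen under strace may differ from
the returned value.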