On Wed, Jan 29, 2020 at 7:54 PM Darrick J. Wong <darrick.wong@xxxxxxxxxx> wrote:
>
> On Wed, Jan 22, 2020 at 05:13:53PM -0600, Steve French wrote:
> > As discussed last year:
> >
> > Current Linux copy tools have various problems compared to other
> > platforms - small I/O sizes (and most don't allow it to be
> > configured), lack of parallel I/O for multi-file copies, inability to
> > reduce metadata updates by setting file size first, lack of cross
>
> ...and yet weirdly we tell everyone on xfs not to do that or to use
> fallocate, so that delayed speculative allocation can do its thing.
> We also tell them not to create deep directory trees because xfs isn't
> ext4.

Delayed speculative allocation may help xfs but changing file size
thousands of times for network and cluster fs for a single file copy
can be a disaster for other file systems (due to the excessive cost
it adds to metadata sync time) - so there are file systems where
setting the file size first can help

> > And copy tools rely less on
> > the kernel file system (vs. code in the user space tool) in Linux than
> > would be expected, in order to determine which optimizations to use.
>
> What kernel interfaces would we expect userspace to use to figure out
> the confusing mess of optimizations? :)

copy_file_range and clone_file_range are a good start ... few tools
use them ...

> There's a whole bunch of xfs ioctls like dioinfo and the like that we
> ought to push to statx too. Is that an example of what you mean?

That is a good example. And then getting tools to use these,
even if there are some file system dependent cases.

>
> > But some progress has been made since last year's summit, with new
> > copy tools being released and improvements to some of the kernel file
> > systems, and also some additional feedback on lwn and on the mailing
> > lists. In addition these discussions have prompted additional
> > feedback on how to improve file backup/restore scenarios (e.g. to
> > mounts to the cloud from local Linux systems) which require preserving
> > more timestamps, ACLs and metadata, and preserving them efficiently.
>
> I suppose it would be useful to think a little more about cross-device
> fs copies considering that the "devices" can be VM block devs backed by
> files on a filesystem that supports reflink. I have no idea how you
> manage that sanely though.

I trust XFS and BTRFS and SMB3 and cluster fs etc. to solve this better
than the block level (better locking, leases/delegation, state
management, etc.) though.

--
Thanks,

Steve
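
To make two of the points above concrete, here is a minimal sketch of a copy loop that sets the destination file size once up front and then lets the kernel move the data with copy_file_range, using large I/O sizes. It assumes Linux and Python 3.8 or later. The function name copy_file, the 8 MiB default chunk size, and the list of errno values that trigger a fallback are illustrative assumptions, not taken from any existing copy tool. Whether presetting the size helps depends on the file system, as discussed above, so it is left as an option.

```python
import errno
import os

# Illustrative default I/O size (an assumption, not a recommendation).
CHUNK = 8 * 1024 * 1024

# Errors after which this sketch stops trying copy_file_range and falls
# back to plain pread/pwrite. The exact set is an assumption.
_FALLBACK_ERRNOS = {errno.EXDEV, errno.ENOSYS, errno.EINVAL,
                    errno.EOPNOTSUPP, errno.EPERM}


def copy_file(src_path, dst_path, preset_size=True, chunk=CHUNK):
    """Copy src_path to dst_path and return the number of bytes copied."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        sfd, dfd = src.fileno(), dst.fileno()
        size = os.fstat(sfd).st_size
        if preset_size and size:
            # One size update up front instead of one per extending write.
            os.ftruncate(dfd, size)

        use_cfr = hasattr(os, "copy_file_range")
        copied = 0
        while copied < size:
            want = min(chunk, size - copied)
            if use_cfr:
                try:
                    # Explicit offsets, so file positions are left untouched.
                    n = os.copy_file_range(sfd, dfd, want, copied, copied)
                except OSError as exc:
                    if exc.errno not in _FALLBACK_ERRNOS:
                        raise
                    use_cfr = False  # retry this range with pread/pwrite
                    continue
            else:
                buf = os.pread(sfd, want, copied)
                n = os.pwrite(dfd, buf, copied) if buf else 0
            if n == 0:
                break  # source shrank while we were copying
            copied += n

        if copied < size:
            # Drop the unused tail if the preset size turned out too large.
            os.ftruncate(dfd, copied)
        return copied
```

On file systems that support reflink, the kernel may satisfy copy_file_range by sharing extents rather than copying data, which is one reason to prefer it over a read/write loop in user space.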