Re: [RFC v0 0/4] sys_copy_range() rough draft

On Tue, May 14, 2013 at 02:15:22PM -0700, Zach Brown wrote:
> We've been talking about implementing some form of bulk data copy
> offloading for a while now.  BTRFS and OCFS2 implement forms of copy
> offloading with ioctls, NFS 4.2 will include a byte-granular COPY
> operation, and the SCSI XCOPY command is being implemented now that
> Windows can issue it.
> 
> In the past we've discussed promoting the ocfs2 reflink ioctl into a
> system call that would create a new file and implicitly copy the
> source data into the new file:
> https://lkml.org/lkml/2009/9/14/481
> 
> These draft patches take the simpler approach of only copying data
> between existing files.  The patches 1) make a system call out of the
> btrfs CLONE_RANGE ioctl, 2) implement the btrfs .copy_range method with
> the ioctl's guts, 3) implement the nfs .copy_range by sending a COPY
> op, and 4) serve the COPY op in nfsd by calling the .copy_range method
> again.
> 
> The nfs patch is an untested hack.  I'm happy to beat it into shape
> but I'll need some guidance.
> 
> I'd like strong review feedback on the interfaces.  Here are some
> possible topics:
> 
> a) Hopefully being able to specify a portion of the data to copy will
> avoid *huge* syscall latencies and the motivation for new async
> semantics.
> 
> b) The BTRFS ioctl and nfs COPY let you specify a count of 0 to copy
> from the start offset to the end of the file.  Does anyone have a
> strong feeling about this?  I'm leaning towards not bothering with it
> in the syscall interface.
> 
> c) I chose to return partial progress in the ssize_t return code.  This
> limits the length of the range and the size_t count argument can be too
> large and return errors, much like other io syscalls.  This seemed
> less awful than some extra argument with a pointer to a status value.
> 
> d) I'm dreading mentioning a vector of ranges to copy in one syscall
> because I don't want to think about overlapping ranges and file systems
> that use range locks -- xfs for now, but more if Jan gets his way.

XFS doesn't use range locks (yet).

> I'd rather that we get some experience with this simpler syscall before
> taking on that headache.
> 
> I'm sure I'm forgetting some other details.
> 
> I'm going to keep hacking away at this.  My next step is to get ext4
> supporting .copy_range, probably with a quick hack to copy the
> contents of bios.  Hopefully that'll give enough time to also integrate
> review feedback.

Wouldn't the easiest "support all filesystems" hack just be to add
a destination offset parameter to do_splice_direct() and call that
when the filesystem doesn't supply a ->copy_range method? i.e. use
the mechanisms we already have for copying from one file to another
via the page cache as efficiently as possible?
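
To make the shape of that fallback concrete, here is a rough userspace
sketch. It is only an illustration under stated assumptions: the name
copy_range_fallback() is made up, and it bounces data through a buffer
with pread()/pwrite() rather than using the in-kernel splice path. What
it does show is the interface being discussed: separate source and
destination offsets, and partial progress reported through the ssize_t
return value.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700
#endif
#include <errno.h>
#include <limits.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical userspace analogue of a generic fallback: copy up to
 * count bytes from fd_in at off_in to fd_out at off_out.
 *
 * Returns the number of bytes copied, which may be short (for example
 * when the source hits EOF), or -1 with errno set if nothing was copied.
 */
static ssize_t copy_range_fallback(int fd_in, off_t off_in,
                                   int fd_out, off_t off_out, size_t count)
{
    char buf[65536];
    size_t done = 0;

    /* The return type is ssize_t, so one call cannot report more. */
    if (count > SSIZE_MAX)
        count = SSIZE_MAX;

    while (done < count) {
        size_t chunk = count - done;
        if (chunk > sizeof(buf))
            chunk = sizeof(buf);

        ssize_t nr = pread(fd_in, buf, chunk, off_in + (off_t)done);
        if (nr < 0) {
            if (errno == EINTR)
                continue;
            /* Report progress made so far, if any. */
            return done ? (ssize_t)done : -1;
        }
        if (nr == 0)    /* EOF on the source */
            break;

        ssize_t written = 0;
        while (written < nr) {
            ssize_t nw = pwrite(fd_out, buf + written,
                                (size_t)(nr - written),
                                off_out + (off_t)done + written);
            if (nw < 0) {
                if (errno == EINTR)
                    continue;
                size_t total = done + (size_t)written;
                return total ? (ssize_t)total : -1;
            }
            written += nw;
        }
        done += (size_t)nr;
    }
    return (ssize_t)done;
}
```

A caller that gets a short count back can simply advance both offsets
by the return value and call again, much like any other io syscall.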

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx