On Mon, 2013-02-25 at 15:35 -0800, Andy Lutomirski wrote:
> On Mon, Feb 25, 2013 at 3:28 PM, Myklebust, Trond
> <Trond.Myklebust@xxxxxxxxxx> wrote:
> > On Mon, 2013-02-25 at 14:16 -0800, Andy Lutomirski wrote:
> >> On Mon, Feb 25, 2013 at 1:59 PM, Myklebust, Trond
> >> <Trond.Myklebust@xxxxxxxxxx> wrote:
> >> > On Mon, 2013-02-25 at 16:49 -0500, Ric Wheeler wrote:
> >> >> On 02/25/2013 04:14 PM, Andy Lutomirski wrote:
> >> >> > On 02/21/2013 02:24 PM, Zach Brown wrote:
> >> >> >> On Thu, Feb 21, 2013 at 08:50:27PM +0000, Myklebust, Trond wrote:
> >> >> >>> On Thu, 2013-02-21 at 21:00 +0100, Paolo Bonzini wrote:
> >> >> >>>> On 21/02/2013 15:57, Ric Wheeler wrote:
> >> >> >>>>>> sendfile64() pretty much already has the right arguments for a
> >> >> >>>>>> "copyfile", however it would be nice to add a 'flags' parameter: the
> >> >> >>>>>> NFSv4.2 version would use that to specify whether or not to copy file
> >> >> >>>>>> metadata.
> >> >> >>>>> That would seem to be enough to me and has the advantage that it is an
> >> >> >>>>> relatively obvious extension to something that is at least not totally
> >> >> >>>>> unknown to developers.
> >> >> >>>>>
> >> >> >>>>> Do we need more than that for non-NFS paths I wonder? What does reflink
> >> >> >>>>> need or the SCSI mechanism?
> >> >> >>>> For virt we would like to be able to specify arbitrary block ranges.
> >> >> >>>> Copying an entire file helps some copy operations like storage
> >> >> >>>> migration. However, it is not enough to convert the guest's offloaded
> >> >> >>>> copies to host-side offloaded copies.
> >> >> >>> So how would a system call based on sendfile64() plus my flag parameter
> >> >> >>> prevent an underlying implementation from meeting your criterion?
> >> >> >> If I'm guessing correctly, sendfile64()+flags would be annoying because
> >> >> >> it's missing an out_fd_offset.
> >> >> >> The host will want to offload the
> >> >> >> guest's copies by calling sendfile on block ranges of a guest disk image
> >> >> >> file that correspond to the mappings of the in and out files in the
> >> >> >> guest.
> >> >> >>
> >> >> >> You could make it work with some locking and out_fd seeking to set the
> >> >> >> write offset before calling sendfile64()+flags, but ugh.
> >> >> >>
> >> >> >> ssize_t sendfile(int out_fd, int in_fd, off_t in_offset, off_t
> >> >> >> out_offset, size_t count, int flags);
> >> >> >>
> >> >> >> That seems closer.
> >> >> >>
> >> >> >> We might also want to pre-emptively offer iovs instead of offsets,
> >> >> >> because that's the very first thing that's going to be requested after
> >> >> >> people prototype having to iterate calling sendfile() for each
> >> >> >> contiguous copy region.
> >> >> > I thought the first thing people would ask for is to atomically create a
> >> >> > new file and copy the old file into it (at least on local file systems).
> >> >> > The idea is that nothing should see an empty destination file, either
> >> >> > by race or by crash. (This feature would perhaps be described as a
> >> >> > pony, but it should be implementable.)
> >> >> >
> >> >> > This would be like a better link(2).
> >> >> >
> >> >> > --Andy
> >> >>
> >> >> Why would this need to be atomic? That would seem to be a very difficult
> >> >> property to provide across all target types with multi-GB sized files...
> >> >
> >> > Right. It may sound cool, but what's the real-life use case?
> >> >
> >>
> >> Download file from some source and then verify it. Now copyfile it
> >> into my repository of known-good files.
> >>
> >> Admittedly I could link + unlink or rename it there, but I consider
> >> hard links to be rather evil, especially when cow links are available.
> >
> > Rename is the right way to do that as it can't corrupt the data after
> > you have verified it. copyfile can...
>
> ...copyfile doesn't exist.

Wrong!
The underlying NFS and SCSI copy offload protocols are fully defined
at this time, and will constrain any implementation that you may dream
up.

> I think it would be neat if it couldn't
> corrupt data.

It would also be neat if the moon were made of cheese... The underlying
NFS and SCSI protocols do not guarantee perfect copies; the copy may,
for instance, be interrupted due to external circumstances.

> In any case, this may be a bad idea -- presumably you'd have to fsync
> the file you're copying *from* first to avoid a massive performance
> hit.

You have to do that anyway.

-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@xxxxxxxxxx
www.netapp.com
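
For reference, the argument list Zach proposes in the quoted text
(separate in and out offsets plus a count) can be approximated in user
space today. What follows is only a minimal sketch under stated
assumptions: copy_range_fallback() is a hypothetical name, not an
existing kernel or libc interface; it drops the proposed 'flags'
argument; and it bounces the data through a buffer with pread() and
pwrite() instead of offloading anything to NFS or SCSI.

```c
#define _XOPEN_SOURCE 700   /* expose pread()/pwrite() under strict -std= modes */
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (an assumption for illustration, not a real API).
 * Copies up to 'count' bytes from in_fd at in_offset to out_fd at
 * out_offset. Returns the number of bytes copied, which may be short
 * (end of input, or an error after partial progress), or -1 with errno
 * set if nothing was copied because of an error.
 */
ssize_t copy_range_fallback(int out_fd, int in_fd, off_t in_offset,
                            off_t out_offset, size_t count)
{
    enum { CHUNK = 64 * 1024 };          /* bounce buffer size */
    char *buf = malloc(CHUNK);
    size_t done = 0;                     /* bytes copied so far */
    int failed = 0;
    int saved_errno;

    if (buf == NULL)
        return -1;                       /* malloc sets errno on POSIX systems */

    while (!failed && done < count) {
        size_t left = count - done;
        size_t want = left < (size_t)CHUNK ? left : (size_t)CHUNK;
        ssize_t got = pread(in_fd, buf, want, in_offset + (off_t)done);
        ssize_t off = 0;                 /* bytes of this chunk written */

        if (got < 0) {
            if (errno != EINTR)
                failed = 1;
            continue;                    /* retry on EINTR, else loop exits */
        }
        if (got == 0)
            break;                       /* end of input: short copy */

        while (off < got) {
            ssize_t n = pwrite(out_fd, buf + off, (size_t)(got - off),
                               out_offset + (off_t)done);
            if (n < 0 && errno == EINTR)
                continue;
            if (n <= 0) {
                failed = 1;
                break;
            }
            off += n;
            done += (size_t)n;
        }
    }

    saved_errno = errno;                 /* free() must not clobber errno */
    free(buf);
    errno = saved_errno;

    assert(done <= count);
    return (failed && done == 0) ? -1 : (ssize_t)done;
}
```

Because both offsets are passed explicitly, neither descriptor's file
position is touched, so the "locking and out_fd seeking" workaround is
not needed. The sketch is neither atomic nor crash-safe, and like the
offload protocols discussed above it can return a short count, so a
caller would still have to check the result and fsync as appropriate.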