On Sat, Sep 28, 2013 at 11:20 PM, Ric Wheeler <rwheeler@xxxxxxxxxx> wrote:
>>> I don't see the safety argument very compelling either.  There are real
>>> semantic differences, however: ENOSPC on a write to a
>>> (apparently) already allocated block.  That could be a bit unexpected.
>>> Do we need a fallocate extension to deal with shared blocks?
>>
>> The above has been the case for all enterprise storage arrays ever since
>> the invention of snapshots. The NFSv4.2 spec does allow you to set a
>> per-file attribute that causes the storage server to always preallocate
>> enough buffers to guarantee that you can rewrite the entire file, however
>> the fact that we've lived without it for said 20 years leads me to believe
>> that demand for it is going to be limited. I haven't put it top of the list
>> of features we care to implement...
>>
>> Cheers,
>> Trond
>
>
> I agree - this has been common behaviour for a very long time in the array
> space. Even without an array, this is the same as overwriting a block in
> btrfs or any file system with a read-write LVM snapshot.

Okay, I'm convinced.

So I suggest

 - mount(..., MNT_REFLINK): *allow* splice to reflink.  If this is not
   set, fall back to page cache copy.

 - splice(... SPLICE_REFLINK): fail non-reflink copy.  With this, an app
   can force reflink.

Both are trivial to implement and make sure that no backward
incompatibility surprises happen.

My other worry is about interruptibility/restartability.  Ideas?

What happens on splice(from, to, 4G) when it's a non-reflink copy?
Can the page cache copy be made restartable?  Or should splice() be
allowed to return a short count?  What happens on (non-reflink) remote
copies and huge request sizes?

Thanks,
Miklos
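
On the short-count question, here is a rough userspace sketch of what a
restartable copy could look like from the caller's side if each call is
allowed to stop early.  This is not kernel code and not how splice() is
implemented.  copy_some(), copy_all() and COPY_CHUNK_MAX are made-up names,
and the 64 KiB cap is arbitrary.  Neither MNT_REFLINK nor SPLICE_REFLINK
appears here, since both are only proposals.  The sketch assumes seekable
regular files, because it uses pread()/pwrite() with explicit offsets.

```c
/* Ask for pread()/pwrite() declarations even under strict -std= modes. */
#define _XOPEN_SOURCE 700

#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical per-call cap, standing in for whatever limit a copy
 * primitive might apply before returning a short count. */
#define COPY_CHUNK_MAX 65536

/*
 * Copy at most COPY_CHUNK_MAX bytes from in_fd@in_off to out_fd@out_off.
 * Returns the number of bytes copied (possibly fewer than len), 0 at end
 * of input, or -1 with errno set.  File positions are passed explicitly,
 * so a failed or short call never loses track of progress.
 */
ssize_t copy_some(int in_fd, off_t in_off, int out_fd, off_t out_off,
                  size_t len)
{
    char buf[COPY_CHUNK_MAX];
    size_t want = len < sizeof(buf) ? len : sizeof(buf);
    ssize_t n = pread(in_fd, buf, want, in_off);

    if (n <= 0)
        return n;               /* error or end of input */

    /* pwrite() may itself be short; pass that on to the caller. */
    return pwrite(out_fd, buf, (size_t)n, out_off);
}

/*
 * Caller-side loop: keep restarting at the current offset until len bytes
 * are copied, the input ends, or a real error occurs.  Returns the number
 * of bytes copied, or -1 if nothing was copied before the error.
 */
ssize_t copy_all(int in_fd, int out_fd, size_t len)
{
    size_t done = 0;

    while (done < len) {
        ssize_t n = copy_some(in_fd, (off_t)done, out_fd, (off_t)done,
                              len - done);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted: retry the same range */
            return done > 0 ? (ssize_t)done : -1;
        }
        if (n == 0)
            break;              /* no more input */
        done += (size_t)n;
    }
    return (ssize_t)done;
}
```

Because the offsets are explicit, the caller always knows how far the copy
got.  After a signal or an error it can resume from that offset rather than
starting the whole request over.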