On Tue, Mar 19, 2013 at 11:29:41AM +0100, Miklos Szeredi wrote:

> Copy up is a once-in-a-lifetime event for an object.  Optimizing it is
> way down in the list of things to do.  I'd drop splice in a jiffy if
> it's in the way.

What makes you think that write is any better?  Same deadlock there - check
generic_file_aio_write(), it calls the same sb_start_write()...  IOW,
switching from splice to write won't help at all.

> Much more interesting question: what happens if we crash during a
> rename?  Whiteout implemented in the filesystem won't save us.  And
> the results are interesting: old versions of files become visible and
> similar fun.  Far from likely to happen, but ...
>
> Add a rename-with-whiteout primitive on filesystems?  That one is not
> going to be as simple as plain whiteout.  Or?

Umm...  If/when we start caring about that kind of atomicity (and I agree
that we ought to) overlayfs approach to whiteouts will actually have much
harder time - it doesn't take much to teach a journalling fs how to do that
kind of ->rename() in a single transaction; the only question is how to tell
it that we want to leave a whiteout behind us.

Hell knows; one variant is to add a flag, of course.  Another might be more
interesting - we want some kind of "directory is opaque" flag, so if we
start reshuffling the methods, we might try to merge unlink/rmdir/whiteout.
Rules:
	* victim is negative => create a whiteout
	* victim is a directory, parent opaque => rmdir
	* victim is a non-directory, parent opaque => unlink
	* victim is positive, parent _not_ opaque => replace with whiteout
	* old_dir in case of ->rename() is opaque => normal rename
	* old_dir in case of ->rename() is not opaque => leave whiteout behind
Non-unioned => opaque, of course (nothing showing through it).

Getting good behaviour on rename interrupted by crash is going to be _very_
tricky with any strategy other than whiteouts-in-fs, AFAICS.
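
To make the rules above concrete, here is a rough userspace sketch of the decision table in C.  Every name in it (victim_kind, merged_remove_action(), rename_action(), dir_is_opaque()) is made up for illustration; none of this is existing kernel API, and a real implementation would live in the filesystem's directory methods and do the work inside one transaction.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical types; nothing here is existing kernel API. */
enum victim_kind { VICTIM_NEGATIVE, VICTIM_DIR, VICTIM_NONDIR };

enum remove_action {
	ACT_CREATE_WHITEOUT,		/* nothing there: just add a whiteout */
	ACT_RMDIR,			/* plain rmdir */
	ACT_UNLINK,			/* plain unlink */
	ACT_REPLACE_WITH_WHITEOUT,	/* remove and leave a whiteout */
};

enum rename_action { ACT_PLAIN_RENAME, ACT_RENAME_LEAVE_WHITEOUT };

/* "Non-unioned => opaque": only a unioned directory can be non-opaque. */
static bool dir_is_opaque(bool unioned, bool opaque_flag)
{
	return !unioned || opaque_flag;
}

/*
 * Merged unlink/rmdir/whiteout decision.  The rules as posted give no
 * parent condition for a negative victim, so this sketch follows them
 * literally and ignores parent opacity in that case.
 */
static enum remove_action merged_remove_action(enum victim_kind victim,
					       bool parent_opaque)
{
	if (victim == VICTIM_NEGATIVE)
		return ACT_CREATE_WHITEOUT;
	if (!parent_opaque)
		return ACT_REPLACE_WITH_WHITEOUT;
	return victim == VICTIM_DIR ? ACT_RMDIR : ACT_UNLINK;
}

/* ->rename(): whether to leave a whiteout depends only on old_dir. */
static enum rename_action rename_action(bool old_dir_opaque)
{
	return old_dir_opaque ? ACT_PLAIN_RENAME : ACT_RENAME_LEAVE_WHITEOUT;
}

/* Walks the rules once; aborts via assert() if the table is wrong. */
static void check_rules(void)
{
	bool plain = dir_is_opaque(false, false);	/* non-unioned dir */
	bool upper = dir_is_opaque(true, false);	/* unioned, not opaque */

	assert(merged_remove_action(VICTIM_DIR, plain) == ACT_RMDIR);
	assert(merged_remove_action(VICTIM_NONDIR, plain) == ACT_UNLINK);
	assert(merged_remove_action(VICTIM_NONDIR, upper) ==
	       ACT_REPLACE_WITH_WHITEOUT);
	assert(rename_action(plain) == ACT_PLAIN_RENAME);
	assert(rename_action(upper) == ACT_RENAME_LEAVE_WHITEOUT);
}
```

On a non-unioned directory this collapses to the usual unlink/rmdir/rename behaviour, which is the point of treating non-unioned as opaque.
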

Again, I have no problem whatsoever with changing the set of directory
methods, as long as the replacement is sane.  We'd done that kind of thing
before and it's not a problem.