Hi all,

This is the second revision of an RFC for adding kernel support to XFS for mapping multiple file logical blocks to the same physical block, more commonly known as reflinking. The implementation uses a single [block range, refcount] tree to track the reference counts of extents of physical blocks. There's also support code to provide the desired copy-on-write behavior and the userland interfaces to reflink, query the status of, and un-reflink files.

The patch set is based on the current (4.2-rc4) upstream kernel plus Dave's reverse-map RFC patches. There are plenty of bugs in this code; in particular, the copy-on-write code is still terrible and prone to all sorts of amusing crashes. To expand on that: the copy-on-write code is horribly broken, but I'm posting this patchset in the hopes of getting some review of the other pieces while I try to solve CoW.

Since the "RFC(RAP)" post last month, I've broken up the patches into smaller pieces, added tracepoints, and provided longer descriptions plus ASCII art of what the big algorithms are trying to do.

What I'd like to do for CoW is to (ab|re)use the delayed allocation code. In xfs_get_blocks we'd reserve whatever blocks we need (or return ENOSPC to users), as in regular delalloc; in xfs_vm_writepage we'd use xfs_map_blocks to allocate the forked blocks, remove the old mapping, and add in the new mapping, which is almost what delalloc does now.

One problem I've not yet worked around is that __block_write_begin won't call get_blocks if the bh is already mapped, which means that we fail to make the necessary reservations in certain cases (write file, reflink, rewrite original file). The current CoW patch sort of forces this to work by doing its own reservation outside of get_blocks and delalloc, but doesn't necessarily get it right.

At the moment, the reverse-map and reflink features are /not/ compatible. This will be resolved soon.
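
For anyone who wants a picture of the [block range, refcount] bookkeeping before reading the patches, here is a rough userspace sketch. It is not the code in this series: the struct, the function names, and the sorted-array lookup are all made up for illustration (the real thing is an on-disk btree), and treating a block with no record as having a single owner is an assumption of the sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record: one extent of physical blocks and its owner count. */
struct toy_refcount_rec {
	uint64_t start;		/* first physical block of the extent */
	uint64_t len;		/* number of blocks in the extent */
	uint32_t refcount;	/* number of file mappings pointing at it */
};

/*
 * Return the refcount of physical block bno.  recs must be sorted by
 * start and must not overlap.  A block with no record is treated as
 * having exactly one owner (assumption of this sketch).
 */
static uint32_t toy_refcount_lookup(const struct toy_refcount_rec *recs,
				    size_t nrecs, uint64_t bno)
{
	size_t lo = 0, hi = nrecs;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (bno < recs[mid].start)
			hi = mid;		/* block lies before this extent */
		else if (bno - recs[mid].start >= recs[mid].len)
			lo = mid + 1;		/* block lies after this extent */
		else
			return recs[mid].refcount;
	}
	return 1;
}

/* A write only has to be redirected to new blocks if the block is shared. */
static int toy_needs_cow(const struct toy_refcount_rec *recs, size_t nrecs,
			 uint64_t bno)
{
	return toy_refcount_lookup(recs, nrecs, bno) > 1;
}
```

The point of the sketch is only the decision it encodes: a write to a block whose refcount is greater than one has to go through the copy-on-write path, and anything else can be overwritten in place.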

The ioctl interface to XFS reflink looks surprisingly like the btrfs ioctl interface <cough>: you can reflink a file, reflink subranges of a file, or dedupe subranges of files. (Dedupe also checks the file blocks, though I have a feeling it's racy.) To un-reflink a file, simply chattr +C it to mark it no-cow. That said, xfs_fsr is a better candidate for de-reflinking a file, since it also defragments the file.

If you're going to start using this mess, you're going to want to pull my xfsprogs dev tree[1], which itself is also based on xfsprogs for-next and the userland rmap support bits. I've not had time to get reflink and rmap to work together. I've also prepared a bunch of xfstests[2] to exercise the userland interfaces; btrfs' reflink implementation more or less passes.

This is an extraordinary way to eat your data. Enjoy! Comments and questions are, as always, welcome.

--D

[1] https://github.com/djwong/xfsprogs/commits/for-next
[2] https://github.com/djwong/xfstests/commits/master