On Fri, Aug 05, 2016 at 11:36:14AM -0700, Mark Fasheh wrote:
> On Fri, Aug 05, 2016 at 09:50:15AM +1000, Dave Chinner wrote:
> > I'd much prefer that fiemap gives exact information about shared
> > extents. FIEMAP is a diagnostic tool and as such we need it to
> > accurately reflect the exact extent map of the inode being queried
> > so we aren't mislead about the layout of the file during trouble
> > shooting.
>
> I disagree about fiemap being a diagnostic tool. I mean it's perfectly
> suitable for that task, but it has many uses outside of that.
>
> In duperemove at least it's used to do things like look for holes and detect
> already deduped extents (via physical offset). We also use EXTENT_SHARED to
> get a rough estimate of space savings though that can be done in better
> ways. It's a performance sensitive area too - there's currently bugs in
> btrfs regarding fiemap taking too long (and one is actually blocking a
> duperemove feature). Figuring EXTENT_SHARED for btrfs in particular is a
> very cpu intensive process.
>
> None of this is using fiemap to get physical access to an extent btw, which
> is what I think you're most concerned with?

No, I'm talking about the fact that FIEMAP does not reflect the
current state of data in the file. e.g. there can be dirty data in
the page cache over a range, but FIEMAP will report that range as
unwritten. If you optimise the copy to preallocate unwritten regions
rather than copy them, then you will not copy the active data and
the destination will be corrupt.

IOWs, it is safe to use as a query tool as long as the operations
that follow have their own data integrity guarantees, such as a
duperemove operation. That operation stands alone without the need
for FIEMAP - FIEMAP is just used to optimise the search for
candidate blocks, and if FIEMAP is wrong then it doesn't affect the
data in the file at all - duperemove just does nothing.

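To make the distinction above concrete, here is a toy model, written
in Python only for brevity. It does not issue a real FIEMAP ioctl;
the file and its extent map are plain in-memory values, and
naive_sparse_copy, verified_dedupe and stale_map are hypothetical
names, not code from cp or duperemove. Only the
FIEMAP_EXTENT_UNWRITTEN value is taken from linux/fiemap.h.

```python
# Toy in-memory model; no real FIEMAP ioctl is issued here.
FIEMAP_EXTENT_UNWRITTEN = 0x0800  # flag value from linux/fiemap.h
BLOCK = 4096


def naive_sparse_copy(src, extents):
    """Hypothetical copier that trusts the extent map: ranges flagged
    unwritten are 'preallocated' (left as zeros) instead of being read."""
    dst = bytearray(len(src))
    for logical, length, flags in extents:
        if flags & FIEMAP_EXTENT_UNWRITTEN:
            continue  # skip the read; this is the unsafe optimisation
        dst[logical:logical + length] = src[logical:logical + length]
    return bytes(dst)


def verified_dedupe(a, b, candidates):
    """Hypothetical dedupe pass: the map only nominates candidate ranges,
    and each one is byte-compared before it would be shared."""
    return [(off, length) for off, length in candidates
            if a[off:off + length] == b[off:off + length]]


# Current file contents: block 1 holds data that is still only in the
# page cache, so the on-disk extent has not been converted yet.
src = b"A" * BLOCK + b"B" * BLOCK
# Stale map: block 1 is still reported as an unwritten extent.
stale_map = [(0, BLOCK, 0), (BLOCK, BLOCK, FIEMAP_EXTENT_UNWRITTEN)]

copy = naive_sparse_copy(src, stale_map)
print(copy == src)  # False: block 1 was silently dropped from the copy

# A wrong candidate costs nothing: the mismatching range is not shared.
other = b"A" * BLOCK + b"C" * BLOCK
shared = verified_dedupe(src, other, [(0, BLOCK), (BLOCK, BLOCK)])
print(shared)  # [(0, 4096)]
```

In this model the copier has no check of its own, so a stale map
turns directly into lost data, whereas the dedupe pass simply
declines to act on a candidate that fails its byte comparison.
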
The problem comes when the output of FIEMAP is used to determine
ranges for data access and retrieval (e.g. sparse copies) - in these
cases the output of FIEMAP is incorrect and live data is going to be
missed. cp doesn't verify the data it copied to guarantee the
destination is identical to the source, so it's going to silently
generate corrupt copies when these coherency problems occur.

This is what makes FIEMAP a diagnostic tool - you cannot rely on the
output to be valid and correct for followup operations based on that
information.

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx