On Mon, Dec 16, 2019 at 6:28 PM <fdmanana@xxxxxxxxxx> wrote: > > From: Filipe Manana <fdmanana@xxxxxxxx> > > We always round down, to a multiple of the filesystem's block size, the > length to deduplicate at generic_remap_check_len(). However this is only > needed if an attempt to deduplicate the last block into the middle of the > destination file is requested, since that leads into a corruption if the > length of the source file is not block size aligned. When an attempt to > deduplicate the last block into the end of the destination file is > requested, we should allow it because it is safe to do it - there's no > stale data exposure and we are prepared to compare the data ranges for > a length not aligned to the block (or page) size - in fact we even do > the data compare before adjusting the deduplication length. > > After btrfs was updated to use the generic helpers from VFS (by commit > 34a28e3d77535e ("Btrfs: use generic_remap_file_range_prep() for cloning > and deduplication")) we started to have user reports of deduplication > not reflinking the last block anymore, and whence users getting lower > deduplication scores. The main use case is deduplication of entire > files that have a size not aligned to the block size of the filesystem. > > We already allow cloning the last block to the end (and beyond) of the > destination file, so allow for deduplication as well. > > Link: https://lore.kernel.org/linux-btrfs/2019-1576167349.500456@xxxxxxxxxxxxxx/ > Signed-off-by: Filipe Manana <fdmanana@xxxxxxxx> Darrick, Al, any feedback? Thanks. > --- > fs/read_write.c | 10 ++++------ > 1 file changed, 4 insertions(+), 6 deletions(-) > > diff --git a/fs/read_write.c b/fs/read_write.c > index 5bbf587f5bc1..7458fccc59e1 100644 > --- a/fs/read_write.c > +++ b/fs/read_write.c > @@ -1777,10 +1777,9 @@ static int remap_verify_area(struct file *file, loff_t pos, loff_t len, > * else. Assume that the offsets have already been checked for block > * alignment. > * > - * For deduplication we always scale down to the previous block because we > - * can't meaningfully compare post-EOF contents. > - * > - * For clone we only link a partial EOF block above the destination file's EOF. > + * For clone we only link a partial EOF block above or at the destination file's > + * EOF. For deduplication we accept a partial EOF block only if it ends at the > + * destination file's EOF (can not link it into the middle of a file). > * > * Shorten the request if possible. > */ > @@ -1796,8 +1795,7 @@ static int generic_remap_check_len(struct inode *inode_in, > if ((*len & blkmask) == 0) > return 0; > > - if ((remap_flags & REMAP_FILE_DEDUP) || > - pos_out + *len < i_size_read(inode_out)) > + if (pos_out + *len < i_size_read(inode_out)) > new_len &= ~blkmask; > > if (new_len == *len) > -- > 2.11.0 >