On Wed, Jan 4, 2023 at 6:25 PM Darrick J. Wong <djwong@xxxxxxxxxx> wrote: > > On Thu, Dec 22, 2022 at 03:41:30PM +0100, Zbigniew Halas wrote: > > On Thu, Dec 22, 2022 at 9:25 AM Amir Goldstein <amir73il@xxxxxxxxx> wrote: > > > > > > Thanks for the analysis. > > > Would you be interested in trying to fix the bug and writing a test? > > > I can help if you would like. > > > > I can give it a try unless it turns out that some deep VFS changes are > > required, but let's try to narrow down the reasonable API semantics > > first. > > > > > It's hard to follow all the changes since > > > 54dbc1517237 ("vfs: hoist the btrfs deduplication ioctl to the vfs") > > > in v4.5, but it *looks* like this behavior might have been in btrfs, > > > before the ioctl was promoted to vfs.. not sure. > > > > > > We have fstests coverage for the "good" case of same size src/dst > > > (generic/136), but I didn't find a test for the non-same size src/dst. > > > > > > In any case, vfs_dedupe_file_range_one() and ->remap_file_range() > > > do not even have an interface to return the actual bytes_deduped, > > The number of bytes remapped is passed back via the loff_t return value > of ->remap_file_range. If CAN_SHORTEN is set, the VFS and the fs > implementation are allowed to reduce the @len parameter as much as they > want. TBH I'm mystified why the original btrfs dedupe ioctl wouldn't > allow deduplication of common prefixes, which means that len only gets > shortened to avoid weird problems when dealing with eof being in the > middle of a block. > > (Or not, since there's clearly a bug.) > > > > so I do not see how any of the REMAP_FILE_CAN_SHORTEN cases > > > are valid, regardless of EOF. > > > > Not sure about this, it looks to me that they are actually returning > > the number of bytes deduped, but the value is not used, but maybe I'm > > missing something. > > Anyway I think there are valid cases when REMAP_FILE_CAN_SHORTEN makes sense. > > For example if a source file content is a prefix of a destination file > > content and we want to dedup the whole range of the source file > > without REMAP_FILE_CAN_SHORTEN, > > then the ioctl will only succeed when the end of the source file is at > > the block boundary, otherwise it will just fail. This will render the > > API very inconsistent. > > <nod> I'll try to figure out where the len adjusting went wrong here and > report back. Hey Darrick, Len adjustment happens in generic_remap_checks, specifically here: if (pos_in + count == size_in && (!(remap_flags & REMAP_FILE_DEDUP) || pos_out + count == size_out)) { bcount = ALIGN(size_in, bs) - pos_in; } else { if (!IS_ALIGNED(count, bs)) count = ALIGN_DOWN(count, bs); bcount = count; } the problem is that in this particular case length is set to zero, and it makes generic_remap_file_range_prep to succeed. I think that 2 things are needed to make this API behave reasonably: * always use the original length for the file content comparison, not the truncated one (there currently is also a short circuit for zero length that skips comparison, this would need to be revisited as well) - otherwise we may report successful deduplication, despite the ranges being clearly different and only the truncated ranges being the same. * report real, possibly truncated deduplication length to the userland Cheers, Zbigniew