On Wed, Jul 13, 2022 at 07:16:14PM +0200, Ansgar Lößer wrote: > > > And the inode_permission() check is wrong, but at least it does have > > > the important check there (ie that FMODE_WRITE one). So doing the > > > inode_permissions() check at worst just makes it fail too often, but > > > since it's all a "optimistically dedupe" anyway, that kind of "fail in > > > odd situations" doesn't really matter. > > > > > > So for allow_file_dedupe(), I'd suggest: > > > > > > (a) remove the inode_permission() check in allow_file_dedupe() > > > > > > (b) remove the uid_eq() check for the same reason (if you didn't open > > > the destination for write, you have no business deduping anything, > > > even if you're the owner) > > So we're going to break userspace? > > https://github.com/markfasheh/duperemove/issues/129 > > > > I am actually not sure why writability is needed for 'dest' at all. Of > course, the deduplication might cause a write on the block device or > underlying storage, but from the abstract fs perspective, neither the data > nor any metadata can change, regardless of the success of the deduplication. Metadata is most definitely changed by a dedupe operation. Run filefrag or FIEMAP before/after and you'll see that the block mapping for at least the destination file (and maybe the source, too!) as a result of the dedupe operation. > The only attribute that might change is the position of the block on the > blockdevice. Does changing this information require write permissions? Yes. The block map is needed to access the user data, hence POSIX requires modifications to the block map be covered by fdatasync() persistence guarantees. i.e. modifying the block map of a file/inode is most definitely considered a write operation that needs a post-completion fdatasync/fsync operation to provide the modification with persistence guarantees. Indeed, we require write permissions for fallocate() operations that directly modify the block list but don't change data, too. e.g. preallocation over a hole, sparsifying a file by punching out data ranges containing only zeros, etc. These are not changing the user visible data; they only modify the underlying block map of the inode. This is exactly the same as dedupe - these operations are not modifying user visible data, they just modify the index needed to find the physical location of the user data in the filesystem. In fact, there's an fallocate() operation called FALLOC_FL_UNSHARE_RANGE, which is the exact opposite of dedupe - it takes a shared extent and explicitly breaks the sharing, copying data and changing the underlying block map of the file to do that. It doesn't change user visible data, but it most definitely changes the metadata and the data layout of the file. So, yeah, operations that directly manipulate the extent layout of the file (reflink/clone, dedupe, fallocate, etc) are most definitely modification operations. Always have been, always will be, and should always require write permissions to have been granted to the caller... Cheers, Dave. -- Dave Chinner david@xxxxxxxxxxxxx