[ Adding random people who get blamed for lines in this remap_range thing to the participants ] On Tue, Jul 12, 2022 at 5:11 AM Ansgar Lößer <ansgar.loesser@xxxxxxxxxxxxxxx> wrote: > > using the deduplication API we found out, that the FIDEDUPERANGE ioctl > syscall can be used to read a writeonly file. So I think your patch is slightly wrong, but I think this is worth fixing - just likely differently. First off - an odd technicality: you can already read write-only files by simply mmap'ing them, because on many architectures PROT_WRITE ends up implying PROT_READ too. So you should basically expect that "I have permissions to write to the file" automatically means that you can read it too. People simply do that "open for writing, mmap to change it" and expect it to work - not realizing that that means you have to be able to read it too. Anybody who thought otherwise was sadly wrong, and if you depend on "this is write-only" as some kind of security measure for secrets, you need to fix your setup. Now, is that a "feature or a bug"? You be the judge.It is what it is, and it's always been that way. Writability trumps readability, even if you have to do special things to get there. That said, this file remap case was clearly not intentional, and despite the mmap() issue I think this is just plain wrong and we should fix it as a QoI issue. A dedupe may only write to the destination file, but at the same time it does obviously have that implication of "I need to be able to read it to see that it's duplicate". However, your patch is wrong: > --- a/fs/remap_range.c > +++ b/fs/remap_range.c > @@ -414,11 +414,11 @@ static bool allow_file_dedupe(struct file *file) > > if (capable(CAP_SYS_ADMIN)) > return true; > - if (file->f_mode & FMODE_WRITE) > + if ((file->f_mode & (FMODE_READ | FMODE_WRITE)) == (FMODE_READ | > FMODE_WRITE)) > return true; This part looks like a good idea - although it is possible that people will argue that this is the same kind of issue as 'mmap()' has (but unlike mmap, we don't have "this is how the hardware works" issues, or "long history of uses"). But > - if (!inode_permission(mnt_userns, inode, MAY_WRITE)) > + if (!inode_permission(mnt_userns, inode, MAY_READ | MAY_WRITE)) looks wrong. Note that readability is about the file *descriptor*, not the inode. Because the file descriptor may have been opened by somebody who had permission to read the file even for a write-only file. That happens for capability reasons, but it also happens for things like "open(O_RDWR | O_CREAT, 0444)" which creates a new file that is write-only in the filesystem, but despite that the file descriptor is actually readable by the opener. I wonder if the inode_permission() check should just be removed entirely (ie the MAY_WRITE check smells bogus too, for the same reason I don't like the added MAY_READ one) The file permission check - that was done at open time - is the correct one, and is the one that read/write already uses. Any permission checks done at IO time are basically always buggy: things may have changed since the 'open()', and those changes explicitly should *not* matter for the IO. That's really fundamentally how UNIX file permissions work. Linus