Re: [RFC][PATCH v2 4/4] vfs: wrap write f_ops with file_{start,end}_write()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Jan 27, 2017 at 9:30 PM, Michael Kerrisk (man-pages)
<mtk.manpages@xxxxxxxxx> wrote:
> On 01/28/2017 05:20 AM, Amir Goldstein wrote:
>> On Fri, Jan 27, 2017 at 1:50 PM, Amir Goldstein <amir73il@xxxxxxxxx> wrote:
>>> On Fri, Jan 27, 2017 at 1:09 PM, Miklos Szeredi <miklos@xxxxxxxxxx> wrote:
>>>> On Mon, Jan 23, 2017 at 8:43 PM, Amir Goldstein <amir73il@xxxxxxxxx> wrote:
>>>>> Before calling write f_ops, call file_start_write() instead
>>>>> of sb_start_write().
>>>>>
>>>>> This ensures freeze protection for both overlay and upper fs
>>>>> when file is open from an overlayfs mount.
>>>>>
>>>>> Replace {sb,file}_start_write() for {copy,clone}_file_range() and
>>>>> for fallocate().
>>>>>
>>>>> For dedup_file_range() there is no need for mnt_want_write_file().
>>>>> File is already open for write, so we already have mnt_want_write()
>>>>> and we only need file_start_write().
>>>>
>>>> Being opened for write is not verified if capable(CAP_SYS_ADMIN).
>>>> Ugly special case, don't ask me why it's done...
>>>>
>>>
>>> Christoph, Darrick, is that by design?
>>
>> Anyway, whether is makes sense or not, that's a legacy from
>> BTRFS_IOC_FILE_EXTENT_SAME, we probably have to live with.
>>
>> Michael, I recon man page needs updating.
>
> Sorry--can you elaborate on what changes are required?
>

1. IOCTL-FIDEDUPERANGE(2)

"During the call, src_fd must be open for reading and dest_fd must be
open for writing"

Apparently, with CAP_SYS_ADMIN dest_fd is allowed to be open for read-only
<eyebrows raised>. It makes some sense, because the data of dest_fd is not being
modified. Its not really clear to me why admin would need this capability in the
first place, but that's the way it is.

2. copy_file_range(2)

It doesn't say anywhere that fd_in and fd_out are supposed to be regular files
and that EINVAL/EISDIR can be returns if they are not, but that is
probably the behavior
of all file systems.
Per Christoph's suggestion, I am going to enforce IS_REG in the vfs helper,
which is consistent with behavior of clone_file_range and dedupe_file_range

3. fallocate(2)

The entire man page says nothing about what to expect from fallocate
of directory,
but the error codes document:
"ENODEV fd does not refer to a regular file or a directory"

In reality, file systems return EBADF for directory fd and the vfs
helper code says:
        /*
         * Let individual file system decide if it supports preallocation
         * for directories or not.
         */
        if (!S_ISREG(inode->i_mode) && !S_ISDIR(inode->i_mode) &&
            !S_ISBLK(inode->i_mode))
                return -ENODEV;

IS_BLK was added quite recently so man page is out of date is that regard.

And I really wonder if there is any file system that "decides to support
preallocation for directories"?
Because it would make life easier for me to fix correctness for freeze
protection of fallocate if vfs helper would deny falloacte a directory.

fallocate() seems to be the only operation, theoretically allowed on
a directory file descriptor, but only if open for write.
Is there any other beast of this sort?
If not, and no fs ever implemented this bizar functionality, then
perhaps it is best to get rid of that corner case.
--
To unsubscribe from this list: send the line "unsubscribe linux-unionfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Filesystems Devel]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux