Re: [PATCH] fuse: update size attr before doing IO

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On 3/12/24 19:18, Sweet Tea Dorminy wrote:
> 
> 
> On 3/11/24 06:01, Miklos Szeredi wrote:
>> On Thu, 7 Mar 2024 at 16:10, Sweet Tea Dorminy
>> <sweettea-kernel@xxxxxxxxxx> wrote:
>>>
>>> All calls into generic vfs functions need to make sure that the inode
>>> attributes used by those functions are up to date, by calling
>>> fuse_update_attributes() as appropriate.
>>>
>>> generic_write_checks() accesses inode size in order to get the
>>> appropriate file offset for files opened with O_APPEND. Currently, in
>>> some cases, fuse_update_attributes() is not called before
>>> generic_write_checks(), potentially resulting in corruption/overwrite of
>>> previously appended data if i_size is out of date in the cached inode.
>>
>> While this all sounds good, I don't think it makes sense.
>>
>> Why?  Because doing cached O_APPEND writes without any sort of
>> exclusion with remote writes is just not going to work.
>>
>> Either the server ignores the current size and writes at the offset
>> that the kernel supplied (which will be the cached size of the file)
>> and executes the write at that position, or it appends the write to
>> the current EOF.  In the former case the cache will be consistent, but
>> append semantics are not observed, while in the latter case the append
>> semantics are observed, but the cache will be inconsistent.
>>
>> Solution: either exclude remote writes or don't use the cache.
>>
>> Updating the file size before the write does not prevent the race,
>> only makes the window smaller.
> 
> Definitely agree with you.
> 
> The usecase at hand is a sort of NFS-like network filesystem, where
> there's exclusion of remote writes while the file is open, but no
> problem with remote writes while the file is closed.
> 
> The alternative we considered was to add a fuse_update_attributes() call
> to open.
> 
> We thought about doing so during d_revalidate/lookup_fast(). But as far
> as I understand, lookup_fast() is not just called during open, and will
> use the cached inode if the dentry timeout hasn't expired. We tried
> setting dentry timeout to 0, but that lost too much performance. So that
> didn't seem to work.
> 
> But updating attributes after giving the filesystem a chance to
> invalidate them during open() would work, I think?

You mean something like this?

diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
index d19cbf34c634..2723270323d9 100644
--- a/fs/fuse/dir.c
+++ b/fs/fuse/dir.c
@@ -204,7 +204,7 @@ static int fuse_dentry_revalidate(struct dentry *entry, unsigned int flags)
        if (inode && fuse_is_bad(inode))
                goto invalid;
        else if (time_before64(fuse_dentry_time(entry), get_jiffies_64()) ||
-                (flags & (LOOKUP_EXCL | LOOKUP_REVAL | LOOKUP_RENAME_TARGET))) {
+                (flags & (LOOKUP_EXCL | LOOKUP_REVAL | LOOKUP_RENAME_TARGET | LOOKUP_OPEN))) {
                struct fuse_entry_out outarg;
                FUSE_ARGS(args);
                struct fuse_forget_link *forget;


I think this would make sense and could be caught by the atomic-open/revalidate
(once I get back to it).



> 
> That would also conveniently fix the issue where copy_file_range()
> currently checks the size before calling into fuse at all, which I'd
> been building a more elaborate changeset for.
> 
> How does that sound?
> 
> Thanks!
> 
> Sweet Tea
> 




[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux