Re: fuse: incorrect attribute caching with writeback cache disabled

On Fri, Aug 12, 2022 at 6:58 PM Frank Dinoff <fdinoff@xxxxxxxxxx> wrote:
>
> On Fri, Aug 12, 2022 at 5:33 AM Miklos Szeredi <miklos@xxxxxxxxxx> wrote:
> >
> > On Thu, 11 Aug 2022 at 23:05, Frank Dinoff <fdinoff@xxxxxxxxxx> wrote:
> > >
> > > I have a binary running on a fuse filesystem which is generating a zip file. I
> > > don't know what syscalls are involved since the binary segfaults when run with
> > > strace.
> >
> > You could strace the fuse filesystem.
>
> I'll try doing this later. I was unsuccessful in finding anything useful by
> printing large amounts of debug logs.

I got strace working on the program. It looks like it is doing something like this:

open(O_RDWR) = 9
// multiple write(...) calls, such that the lseek below lands before end of file
lseek(9, 2514944, SEEK_SET)             = 2514944
read(9, "", 8192)                       = 0 // should have read 5770 bytes
lseek(9, 5770, SEEK_CUR)                = 2520714 // should be end of file
write(...)
close(9)
open(O_RDWR) = 9
lseek(9, 2514944, SEEK_SET)             = 2514944
read(9, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 6042) = 6042
...

The first read returns no data even though the offset is before end of file,
and I'm not sure why. It looks as if the kernel page cache has gotten out of
sync and thinks the whole file should be zeros.
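
For reference, the sequence in the trace can be approximated with a small script. This is only a sketch of the pattern, not the failing binary: the function name, the fill byte, and the 64 KiB write size are assumptions made for illustration, and only the offsets (2514944 and 5770) come from the strace output above. On a filesystem behaving correctly, the read after the seek should return 5770 bytes rather than 0.

```python
import os
import tempfile


def write_then_read_back(path, head=2514944, tail=5770, chunk=65536):
    """Write head+tail bytes in several write() calls, seek back to `head`,
    and read up to 8192 bytes, mirroring the first half of the trace."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        payload = memoryview(b"\x01" * (head + tail))  # assumed non-zero fill
        written = 0
        while written < len(payload):
            # os.write may write fewer bytes than requested, so loop.
            written += os.write(fd, payload[written:written + chunk])
        pos = os.lseek(fd, head, os.SEEK_SET)
        data = os.read(fd, 8192)
        return pos, data
    finally:
        os.close(fd)


if __name__ == "__main__":
    # Point the directory at the FUSE mount under test instead of a temp dir.
    with tempfile.TemporaryDirectory() as d:
        pos, data = write_then_read_back(os.path.join(d, "probe.bin"))
        print(pos, len(data), data[:4])
```

A short read of 0 bytes, or data that is all zeros instead of the fill byte, would match the two symptoms seen in the trace.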

>
> >
> > > After doing a binary search,
> > > https://github.com/torvalds/linux/commit/fa5eee57e33e79b71b40e6950c29cc46f5cc5cb7
> > > is the commit that seems to have introduced the error. It still seems to
> > > be failing with a much newer kernel.
> >
> > How is it failing?
>
> Oops sorry I thought I included that.  You can't unzip the file.
> unzip -t has "error:  invalid compressed data to inflate"
>
> > > Reverting the fuse_invalidate_attr_mask in fuse_perform_write to
> > > fuse_invalidate_attr makes every other run of the binary produce the correct
> > > output.
> >
> > What do you mean?  Is it succeeding half the time?
>
> Running the binary multiple times in a row, about 50% of runs produce the
> correct file and 50% produce a corrupt file.
>
> Running the test multiple times before fa5eee57, I'm seeing about 10% of runs
> producing a corrupt file. (I did not realize this had a chance of failure on
> the old kernel.) After fa5eee57, 100% of runs produce a corrupt file.
>
> >
> > >
> > > I found that enabling the writeback cache makes the binary always produce the
> > > right output. Running the fuse daemon in single threaded mode also works.
> > >
> > > Is there anything that sticks out to you that is wrong with the above commit?
> >
> > Could you try adding STATX_MODE to the invalidated mask?   Can't
> > imagine any other attribute being relevant.
>
> Adding STATX_MODE to FUSE_STATX_MODIFY does make the binary produce the
> correct file about 75% of the time. The last bit of flakiness may be some
> concurrency issue in the binary?
>
> >
> > Thanks,
> > Miklos
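
To put numbers on the failure rates quoted above, a sketch along these lines could be used. It swaps `unzip -t` for Python's `zipfile.testzip()`, which performs a similar decompress-and-CRC check, and both helper names (`zip_is_intact`, `corrupt_fraction`) as well as the `produce_zip` callback are hypothetical. The callback is assumed to run the binary once and return the path of the archive it wrote.

```python
import zipfile
import zlib


def zip_is_intact(path):
    """Return True if the archive opens and every member passes its CRC check."""
    try:
        with zipfile.ZipFile(path) as zf:
            # testzip() returns the first bad member name, or None if all are fine.
            return zf.testzip() is None
    except (zipfile.BadZipFile, zlib.error, EOFError, OSError):
        # Broken central directory, bad deflate stream, or truncated/unreadable file.
        return False


def corrupt_fraction(produce_zip, runs):
    """Call produce_zip(i) once per run and return the share of corrupt archives."""
    bad = sum(1 for i in range(runs) if not zip_is_intact(produce_zip(i)))
    return bad / runs
```

Running this with the same number of iterations on each kernel would make the 10%, 50%, 75%, and 100% figures directly comparable.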


