Re: PROBLEM: 2.6.35.7 to 3.0 Inotify events missing

Sylvain Rochet wrote:
> Hi,
> 
> On Tue, Oct 19, 2010 at 12:35:40AM +0200, Sylvain Rochet wrote:
> > 
> > ... upgraded to 2.6.33.5, then 2.6.33.7, finally to 2.6.35.7, and I 
> > always end up with the same ending, it seems inotify can miss some VFS 
> > events from time to time.
> 
> I finally found out why.
> 
> The NFS server does not always know the name of the modified file. If 
> the modified inode was cleared from the VFS cache, fsnotify does not 
> know the filename either, so inotify child events on directories are 
> silently tossed.
> 
> Easy way to reproduce:
> 
> Add a few debug printk calls (here it only works if /data is the NFS export):
> 
> --- begin//fs/nfsd/vfs.c        2011-07-22 04:17:23.000000000 +0200
> +++ linux-3.0/fs/nfsd/vfs.c     2011-07-30 03:18:17.837560809 +0200
> @@ -975,6 +975,8 @@
>         inode = dentry->d_inode;
>         exp   = fhp->fh_export;
>  
> +       printk("nfsd write inode=%ld name=%s\n", inode->i_ino, dentry->d_name.name);
> +
>         /*
>          * Request sync writes if
>          *  -   the sync export option has been set, or
> 
> diff -Nru begin//include/linux/fsnotify.h linux-3.0/include/linux/fsnotify.h
> --- begin//include/linux/fsnotify.h     2011-07-22 04:17:23.000000000 +0200
> +++ linux-3.0/include/linux/fsnotify.h  2011-07-30 03:07:00.330239062 +0200
> @@ -216,8 +232,15 @@
>                 mask |= FS_ISDIR;
>  
>         if (!(file->f_mode & FMODE_NONOTIFY)) {
> +               if( !strcmp(path->mnt->mnt_mountpoint->d_name.name, "data") )
> +                       printk("fsnotify modify inode=%ld name=%s\n", inode->i_ino, file->f_dentry->d_name.name);
>                 fsnotify_parent(path, NULL, mask);
>                 fsnotify(inode, mask, path, FSNOTIFY_EVENT_PATH, NULL, 0);
> +       } else {
> +               if( !strcmp(path->mnt->mnt_mountpoint->d_name.name, "data") )
> +                       printk("fsnotify modify-nonotify inode=%ld name=%s\n", inode->i_ino, file->f_dentry->d_name.name);
>         }
>  }
> 
> 
> On the NFS client, open a fd and send some data:
> 
> # exec 1> test
> # ls -la
> # 
> 
> On the NFS server, check the kern log:
> 
> Aug 20 00:57:44 inotifydebug kernel: nfsd write inode=13 name=test
> Aug 20 00:57:44 inotifydebug kernel: fsnotify modify inode=13 name=test
> 
> Everything goes well.
> 
> Now, clear the VFS cache on the NFS server:
> 
> # echo 3 > /proc/sys/vm/drop_caches 
> 
> On the NFS client, send some data to the fd:
> 
> # ls -la
> # 
> 
> On the NFS server, check the kern log:
> 
> Aug 20 00:58:56 inotifydebug kernel: nfsd write inode=13 name=
> Aug 20 00:58:56 inotifydebug kernel: fsnotify modify inode=13 name=
> 
> The filename is lost, fsnotify does not know the filename anymore, 
> therefore inotify cannot send event about a modified file in a watched 
> directory.
> 
> End of the story.
> 
> I guess it is almost impossible to fix this fsnotify bug. It is due 
> to the fact that NFS uses inodes as file identifiers, so in some cases 
> it is impossible to know the modified file's path, and therefore 
> impossible to match the file event to the directory watch.

Oh dear, that's a security hole if something uses inotify/dnotify to
watch files and assumes that file contents (on the same machine,
i.e. the server in this case) do not change when no event is received.

It also breaks cache applications which make the same assumption.  Is
the solution to place inotify watches on every file individually?  If
so, that seems quite severe.
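
For illustration, a minimal sketch of what "a watch on every file" would
look like from userspace.  It assumes that a watch placed directly on the
inode still receives the event when the parent directory does not, which
is what the fsnotify(inode, ...) call in the quoted patch suggests but
which is not confirmed here.  watch_each_file() is a made-up helper name,
not an existing API.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <dirent.h>
#include <limits.h>
#include <stdio.h>
#include <sys/inotify.h>
#include <sys/stat.h>

/*
 * Add an IN_MODIFY watch on every regular file directly inside dirpath,
 * in addition to whatever watch the caller holds on the directory itself.
 * Returns the number of successful inotify_add_watch() calls, or -1 if
 * the directory cannot be opened.
 * watch_each_file() is an illustrative name, not an existing API.
 */
static int watch_each_file(int ifd, const char *dirpath)
{
	DIR *dir;
	struct dirent *de;
	int added = 0;

	assert(dirpath != NULL);
	dir = opendir(dirpath);
	if (!dir)
		return -1;

	while ((de = readdir(dir)) != NULL) {
		char path[PATH_MAX];
		struct stat st;
		int len = snprintf(path, sizeof(path), "%s/%s",
				   dirpath, de->d_name);

		if (len < 0 || (size_t)len >= sizeof(path))
			continue;	/* path too long, skip it */
		if (lstat(path, &st) != 0 || !S_ISREG(st.st_mode))
			continue;	/* skips ".", "..", subdirs, symlinks */
		/* The watch attaches to the inode, not to the name. */
		if (inotify_add_watch(ifd, path, IN_MODIFY) >= 0)
			added++;
	}
	closedir(dir);
	return added;
}
```

The cost is one watch per file, counted against
/proc/sys/fs/inotify/max_user_watches, plus a rescan whenever files are
created, which is why this seems severe for large trees.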

I do quite like the idea of using it to break past fanotify security
restrictions though ;-)

Can it also be bypassed with sys_open_by_handle_at?


Possible solution:

One way to look at this is as NFS having a secret hard link to the
file, one which does not show up in st_nlink.

Hard links are already a bit tricky with fsnotify and directory
watches.  You can monitor a directory, but a file in it can change
contents through another path.

However, you can track changes of hard-linked files accurately by
putting a watch directly on every file whose st_nlink >= 2, and/or by
making sure you have watches on enough distinct directories that,
between them, they contain all st_nlink entries for the file, because
at least one of those directories will get an event.  This is quite
practical: you watch such a file directly until such time as you have
found all its links (if you ever do), then you can remove the direct
file watch.
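
The bookkeeping for "have I found all the links yet" can be sketched
roughly as follows.  This is only an illustration under assumptions: the
struct and function names (link_table, scan_dir, need_direct_watch) and
the fixed table size are invented here, and only files directly inside
each scanned directory are considered.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <dirent.h>
#include <limits.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

#define MAX_TRACKED 1024	/* arbitrary cap for this sketch */

struct link_count {
	dev_t dev;
	ino_t ino;
	nlink_t nlink;	/* st_nlink as reported by lstat() */
	nlink_t seen;	/* names found so far in scanned directories */
};

struct link_table {
	struct link_count e[MAX_TRACKED];
	size_t n;	/* must start at 0 */
};

/* Record every regular file with st_nlink >= 2 found directly in dirpath. */
static int scan_dir(struct link_table *t, const char *dirpath)
{
	DIR *dir;
	struct dirent *de;

	assert(t != NULL && dirpath != NULL);
	dir = opendir(dirpath);
	if (!dir)
		return -1;
	while ((de = readdir(dir)) != NULL) {
		char path[PATH_MAX];
		struct stat st;
		size_t i;
		int len = snprintf(path, sizeof(path), "%s/%s",
				   dirpath, de->d_name);

		if (len < 0 || (size_t)len >= sizeof(path))
			continue;
		if (lstat(path, &st) != 0 || !S_ISREG(st.st_mode) ||
		    st.st_nlink < 2)
			continue;
		/* Look the inode up by (device, inode number). */
		for (i = 0; i < t->n; i++)
			if (t->e[i].dev == st.st_dev && t->e[i].ino == st.st_ino)
				break;
		if (i == t->n) {
			if (t->n == MAX_TRACKED)
				continue;	/* table full, entry dropped */
			t->e[i] = (struct link_count){ st.st_dev, st.st_ino,
						       st.st_nlink, 0 };
			t->n++;
		}
		t->e[i].nlink = st.st_nlink;
		t->e[i].seen++;
	}
	closedir(dir);
	return 0;
}

/* Count inodes that still have links outside the scanned directories. */
static size_t need_direct_watch(const struct link_table *t)
{
	size_t i, count = 0;

	for (i = 0; i < t->n; i++)
		if (t->e[i].seen < t->e[i].nlink)
			count++;
	return count;
}
```

Each watched directory is scanned once into the same table; any inode
with seen < nlink keeps its direct file watch, and the others can rely on
the directory watches alone.  Note that this accounting cannot cover the
NFS case above, since the "secret" link never appears in st_nlink.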

That gives me an idea to help with the NFS no-name watching:

It looks like when a file is referenced by inode without a path, the
problem is there's no path, so no directory inode to receive the
event?

Then this can be solved, in principle (if there's no better way), by
watching a "virtual directory" that gets all events for accesses that
don't have a parent directory.  There needs to be some way to watch
it, and some way to get the appropriate file from the event (as there
is no real directory).  Or maybe there could be a virtual filesystem
(like /proc, /sys etc.) containing a magic directory that receives
these inode-only events, such that lookups in that directory yield the
affected file.  Exactly as if the directory contained a hard link to
every file, perhaps named by a text encoding of the handles passed
through sys_open_by_handle_at.
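
To make the "text encoding of the handles" idea concrete, here is a rough
sketch that fetches a handle with name_to_handle_at(2) and renders it as
a string.  The "<type>:<hex>" format and the helper names encode_handle()
and handle_for_path() are purely hypothetical; no such magic directory or
naming scheme exists.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Render a file handle as "<type>:<hex bytes>", a hypothetical text form
 * that could serve as an entry name in the proposed virtual directory.
 * Returns the string length, or -1 if out is too small.
 */
static int encode_handle(const struct file_handle *fh, char *out, size_t outlen)
{
	unsigned int i;
	int n;

	assert(fh != NULL && out != NULL);
	n = snprintf(out, outlen, "%d:", fh->handle_type);
	if (n < 0 || (size_t)n + 2 * (size_t)fh->handle_bytes + 1 > outlen)
		return -1;
	/* Space was checked above, so sprintf() cannot overrun out. */
	for (i = 0; i < fh->handle_bytes; i++)
		n += sprintf(out + n, "%02x", fh->f_handle[i]);
	return n;
}

/*
 * Obtain the handle for path with name_to_handle_at(2).  The caller frees
 * the result.  Returns NULL on failure, for instance when the filesystem
 * does not support file handles.
 */
static struct file_handle *handle_for_path(const char *path)
{
	struct file_handle *fh = malloc(sizeof(*fh) + MAX_HANDLE_SZ);
	int mount_id;

	if (!fh)
		return NULL;
	fh->handle_bytes = MAX_HANDLE_SZ;
	if (name_to_handle_at(AT_FDCWD, path, fh, &mount_id, 0) != 0) {
		free(fh);
		return NULL;
	}
	return fh;
}
```

Going the other way, a decoded handle would be handed to
open_by_handle_at(2), which requires CAP_DAC_READ_SEARCH; that
restriction is part of why the bypass question above matters.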

-- Jamie