Re: fanotify - overall design before I start sending patches

On Jul 24, 2009  16:13 -0400, Eric Paris wrote:
> fanotify kernel/userspace interaction is over a new socket protocol.  A
> listener opens a new socket in the new PF_FANOTIFY family.  The socket
> is then bound to an address.  Using the following struct:

Would it make sense to use existing netlink?

> struct fanotify_addr {
>         sa_family_t family;
>         __u32 priority;
>         __u32 group_num;
>         __u32 mask;
>         __u32 f_flags;
>         __u32 unused[16];
> }  __attribute__((packed));
> 
> The mask indicates the events this group is interested in.  It is the
> set of events of interest if FAN_GLOBAL_LISTENER is set at bind time.
> If FAN_GLOBAL_LISTENER is not set, this field is meaningless, as the
> registration of events on individual inodes will dictate the
> reception of events.
> 
> * FAN_ACCESS: every file access.
> * FAN_MODIFY: file modifications.
> * FAN_CLOSE: files are closed.
> * FAN_OPEN: open() calls.
> * FAN_ACCESS_PERM: like FAN_ACCESS, except that the process trying to
> access the file is put on hold while the fanotify client decides whether
> to allow the operation.
> * FAN_OPEN_PERM: like FAN_OPEN, but with the permission check.
> * FAN_EVENT_ON_CHILD: receive notification of events on inodes inside
> this subdirectory. (this is not a full recursive notification of all
> descendants, only direct children)
> * FAN_GLOBAL_LISTENER: notify for events on all files in the system.
> * FAN_SURVIVE_MODIFY: special flag indicating that ignores should
> survive inode modification.  Discussed below.

A 32-bit mask might not be enough, and it wouldn't be hard at this
stage to make it a 64-bit mask.  Lustre has a similar mechanism
(changelog) that tracks many different kinds of filesystem events
(create/unlink/symlink/link/rename/mkdir/setxattr/etc) instead of just
open/close, and it is also used by HSM, enhanced rsync, etc.
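
A minimal sketch of why the mask width matters, assuming made-up bit
assignments: the proposal gives no numeric values, so every EX_* name
and bit position below is hypothetical.  Events placed above bit 31 are
silently lost when the mask is stored in a 32-bit field.

```c
#include <stdint.h>

/* Hypothetical bit positions; the proposal does not define any values. */
#define EX_FAN_ACCESS  (UINT64_C(1) << 0)
#define EX_FAN_MODIFY  (UINT64_C(1) << 1)
#define EX_FAN_CLOSE   (UINT64_C(1) << 2)
#define EX_FAN_OPEN    (UINT64_C(1) << 3)

/* Hypothetical namespace events of the kind a changelog records.
 * They sit above bit 31 and so cannot be expressed in a __u32 mask. */
#define EX_FAN_CREATE  (UINT64_C(1) << 32)
#define EX_FAN_UNLINK  (UINT64_C(1) << 33)
#define EX_FAN_RENAME  (UINT64_C(1) << 34)

/* Return nonzero if the group's mask selects the given event bit. */
static int ex_event_wanted(uint64_t group_mask, uint64_t event)
{
        return (group_mask & event) != 0;
}
```

With a 64-bit mask, ex_event_wanted(EX_FAN_OPEN | EX_FAN_RENAME,
EX_FAN_RENAME) is true; truncating the same mask to 32 bits first makes
it false.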

> struct fanotify_event_metadata {
>         __u32 event_len;
>         __s32 fd;
>         __u32 mask;
>         __u32 f_flags;
>         __s32 pid;
>         __s32 tgid;
>         __u64 cookie;
> }  __attribute__((packed));

Getting the attributes that have changed into this message is also
useful, as it avoids a continual stream of "stat" calls on the inodes.
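
As a rough illustration only: the record below copies the proposed
fanotify_event_metadata fields and appends a hypothetical attr_changed
word (the EX_ATTR_* bits are invented here, not part of the proposal).
A listener walking a buffer of such records by event_len could learn
which attributes changed without calling stat on each inode.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fields through "cookie" mirror the proposed struct; attr_changed is
 * a hypothetical extension used only for this sketch. */
struct ex_event {
        uint32_t event_len;     /* total length of this record in bytes */
        int32_t  fd;
        uint32_t mask;
        uint32_t f_flags;
        int32_t  pid;
        int32_t  tgid;
        uint64_t cookie;
        uint32_t attr_changed;  /* hypothetical: EX_ATTR_* bits */
} __attribute__((packed));

#define EX_ATTR_SIZE   (1u << 0)
#define EX_ATTR_MTIME  (1u << 1)
#define EX_ATTR_MODE   (1u << 2)

/* Walk a buffer of records, OR together their attr_changed bits into
 * *attrs_out, and return how many well-formed records were seen.
 * The walk stops at the first truncated or malformed record. */
static size_t ex_scan_events(const unsigned char *buf, size_t len,
                             uint32_t *attrs_out)
{
        size_t off = 0, count = 0;
        uint32_t attrs = 0;

        while (len - off >= sizeof(struct ex_event)) {
                struct ex_event ev;

                /* Copy out to avoid unaligned access into the buffer. */
                memcpy(&ev, buf + off, sizeof(ev));
                if (ev.event_len < sizeof(ev) || ev.event_len > len - off)
                        break;
                attrs |= ev.attr_changed;
                off += ev.event_len;    /* skips any trailing payload */
                count++;
        }
        *attrs_out = attrs;
        return count;
}
```

Advancing by event_len rather than sizeof lets older listeners skip
over fields added later.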

The other thing that is important for HSM is that this log is atomic
and persistent, otherwise there may be files that are missed if the
node crashes.  This involves creating atomic update records as part
of the filesystem operation, and then userspace consumes them and
tells the kernel that it is finished with records up to X.  Otherwise
you risk inconsistencies between rsync/HSM/updatedb for files that
are updated just before a crash.
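
The "finished with records up to X" handshake can be sketched as
follows.  This is an in-memory model only, with hypothetical names
(ex_log, ex_log_append, ex_log_ack); it does not show persistence or
how Lustre's changelog is actually implemented.  The point is that a
record is dropped only after the consumer acknowledges it, so anything
unacknowledged is still there to replay after a restart.

```c
#include <stdint.h>

#define EX_LOG_CAP 8

struct ex_log {
        uint64_t first;             /* oldest record not yet acknowledged */
        uint64_t next;              /* index the next record will receive */
        uint32_t rec[EX_LOG_CAP];   /* payload; here just an event mask */
};

/* Append a record as part of the filesystem operation.  Returns the
 * record index, or -1 if the log is full and the caller must wait. */
static int64_t ex_log_append(struct ex_log *l, uint32_t mask)
{
        if (l->next - l->first >= EX_LOG_CAP)
                return -1;
        l->rec[l->next % EX_LOG_CAP] = mask;
        return (int64_t)l->next++;
}

/* Consumer reports it is finished with all records up to and
 * including index upto; only those records become reclaimable. */
static void ex_log_ack(struct ex_log *l, uint64_t upto)
{
        if (l->next == 0)
                return;                 /* nothing was ever logged */
        if (upto >= l->next)
                upto = l->next - 1;     /* cannot ack the future */
        if (upto + 1 > l->first)
                l->first = upto + 1;
}

/* Number of records still awaiting acknowledgement. */
static uint64_t ex_log_pending(const struct ex_log *l)
{
        return l->next - l->first;
}
```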

> If a FAN_ACCESS_PERM or FAN_OPEN_PERM event is received the listener
> must send a response before the 5 second timeout.  If no response is
> sent before the 5 second timeout the original operation is allowed.  If
> this happens too many times (10 in a row) the fanotify group is evicted
> from the kernel and will not get any new events.

This should be a tunable.  If the intent is to enforce PERM checks,
users could DoS the machine to delay the userspace listener past the
timeout and then access files they shouldn't be able to.
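
A sketch of what making this tunable might look like, assuming a
hypothetical ex_perm_tunables structure (none of these names come from
the proposal).  The defaults of 5000 ms and 10 misses are the values
quoted above; the allow-on-timeout behaviour is kept as proposed.

```c
/* Hypothetical tunables; the proposal hard-codes 5 seconds and 10. */
struct ex_perm_tunables {
        unsigned long timeout_ms;   /* how long to wait for a response */
        unsigned int  max_missed;   /* consecutive misses before eviction */
};

struct ex_group {
        unsigned int missed_in_row;
        int evicted;
};

enum ex_verdict { EX_ALLOW, EX_DENY };

/* Decide the outcome of one permission event.  response_ms is how long
 * the listener took to answer, or -1 if it never answered.  A timely
 * answer is honoured and resets the miss counter; a late or missing
 * answer allows the operation and counts towards eviction. */
static enum ex_verdict ex_perm_result(struct ex_group *g,
                                      const struct ex_perm_tunables *t,
                                      long response_ms,
                                      enum ex_verdict answer)
{
        if (response_ms >= 0 &&
            (unsigned long)response_ms <= t->timeout_ms) {
                g->missed_in_row = 0;
                return answer;
        }
        if (++g->missed_in_row >= t->max_missed)
                g->evicted = 1;
        return EX_ALLOW;
}
```

An administrator worried about the DoS case could then raise
timeout_ms, at the cost of blocked processes waiting longer.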

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
