Re: File monitor problem

[cc: linux-api]

On Wed, Dec 11, 2019 at 3:58 PM Amir Goldstein <amir73il@xxxxxxxxx> wrote:
>
> On Wed, Dec 11, 2019 at 12:06 PM Jan Kara <jack@xxxxxxx> wrote:
> >
> > On Wed 04-12-19 22:27:31, Amir Goldstein wrote:
> [...]
> > > The way to frame this correctly IMO is that fsnotify events let an
> > > application know that "something has changed", without any ordering
> > > guarantee beyond "sometime before the event was read".
> > >
> > > So far, that "something" can be a file (by fd), an inode (by fid),
> > > more specifically a directory inode (by fid) wherein an entry has
> > > changed.
> > >
> > > Adding filename info extends that concept to "something has changed
> > > in the namespace at" (by parent fid+name).
> > > All it means is that application should pay attention to that part of
> > > the namespace and perform a lookup to find out what has changed.
> > >
> > > Maybe the way to mitigate wrong assumptions about ordering and
> > > existence of the filename in the namespace is to omit the event type
> > > for "filename events", for example: { FAN_CHANGE, pfid, name }.
> >
> > So this event would effectively mean: In directory pfid, some filename
> > event has happened with name "name" - i.e. "name" was created (could mean
> > also mkdir), deleted, moved. Am I right?
>
> Exactly.
>
> > And the application would then
> > open_by_handle(2) + open_at(2) + fstat(2) the object pointed to by
>
> open_by_handle(2) + fstatat(2) to be exact.
>
> > (pfid, name) pair and copy whatever it finds to the other end (or delete on
> > the other end in case of ENOENT)?
>
> Basically, yes.
> Although a modern sync tool may also keep some local map of
> remote name -> local fid, to detect a local rename and try to perform a
> remote rename.
>
> >
> > After some thought, yes, I think this is difficult to misuse (or infer some
> > false guarantees out of it). As far as I was thinking it also seems good
> > enough to implement more efficient syncing of directories.
>
> Great, so I will work on the patches.
>

Hi Jan,

I have something working.

Patches:
https://github.com/amir73il/linux/commits/fanotify_name

Simple test:
https://github.com/amir73il/ltp/commits/fanotify_name

I will post the patches after I have a working demo, but in the meantime here
is the gist of the API from the commit log, in case you or anyone else has
comments on the API.

Note that in the new event flavor, the event mask is given as input
(e.g. FAN_CREATE) to filter the type of reported events, but
the event types are hidden when an event is reported.

Besides the dirent event types, events "on child" (e.g. FAN_MODIFY) can also
be reported with name to a directory watcher.

For now, "on child" events cannot be requested for a filesystem/mount
watch, but I think we should consider this possibility, so I added
a check to return EINVAL if this combination is attempted.

Let me know what you think.

Thanks,
Amir.

commit 91e0af27ac329f279167e74761fb5303ebbc1c08
Author: Amir Goldstein <amir73il@xxxxxxxxx>
Date:   Mon Dec 16 08:39:21 2019 +0200

    fanotify: report name info with FAN_REPORT_FID_NAME

    With init flags FAN_REPORT_FID_NAME, report events with name in variable
    length fanotify_event_info record similar to how fid's are reported.
    When events are reported with name, the reported fid identifies the
    directory and the name follows the fid. The info record type for this
    event info is FAN_EVENT_INFO_TYPE_FID_NAME.

    There are several ways that an application can use this information:

    1. When watching a single directory, the name is always relative to
    the watched directory, so the application needs to fstatat(2) the name
    relative to the watched directory.

    2. When watching a set of directories, the application could keep a map
    of dirfd for all watched directories and hash the map by fid obtained
    with name_to_handle_at(2).  When getting a name event, the fid in the
    event info could be used to lookup the base dirfd in the map and then
    call fstatat(2) with that dirfd.

    3. When watching a filesystem (FAN_MARK_FILESYSTEM) or a large set of
    directories, the application could use open_by_handle_at(2) with the fid
    in event info to obtain dirfd for the directory where event happened and
    call fstatat(2) with this dirfd.

    The last option scales better for a large number of watched directories.
    The first two options may also become available to non-privileged
    fanotify watchers in the future, because, unlike the last option, they
    do not depend on open_by_handle_at(2), which requires the
    CAP_DAC_READ_SEARCH capability.

    Legacy inotify events are reported with name and event mask (e.g. "foo",
    FAN_CREATE | FAN_ONDIR).  That can lead users to the conclusion that
    there is *currently* an entry "foo" that is a sub-directory, when in fact
    "foo" may be negative or non-dir by the time user gets the event.

    To make it clear that the current state of the named entry is unknown,
    the new fanotify event intentionally hides this information and reports
    only the flag FAN_WITH_NAME in event mask.  This should make it harder
    for users to make wrong assumptions and write buggy applications.

    We reserve the combination of FAN_EVENT_ON_CHILD on a filesystem/mount
    mark and FAN_REPORT_NAME group for future use, so for now this
    combination is invalid.

    Signed-off-by: Amir Goldstein <amir73il@xxxxxxxxx>
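
To illustrate the usage described in the commit log above, here is a minimal
C sketch of what a watcher might do with one name event: find the name in the
info record, then fstatat(2) it relative to a dirfd for the reported
directory. The record layout (struct fanotify_event_info_fid, then a struct
file_handle, then a NUL-terminated name) is an assumption based on "the name
follows the fid"; the commit log does not spell it out. The helper names
info_record_name() and query_entry() and the enum are hypothetical. How the
dirfd is obtained (the watched directory itself, a fid-to-dirfd map, or
open_by_handle_at(2)) is left to the caller, as in the three options above.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>        /* struct file_handle, AT_SYMLINK_NOFOLLOW */
#include <stddef.h>
#include <string.h>
#include <sys/fanotify.h> /* struct fanotify_event_info_fid */
#include <sys/stat.h>

enum entry_state { ENTRY_GONE, ENTRY_PRESENT, ENTRY_ERROR };

/*
 * ASSUMED record layout (not confirmed by the commit log):
 *   fanotify_event_info_fid | file_handle + handle_bytes | name '\0'
 * Returns a pointer to the name inside the record, or NULL if the record
 * is too short or the name is not NUL-terminated within hdr.len bytes.
 */
static const char *info_record_name(const struct fanotify_event_info_fid *fid)
{
    const struct file_handle *fh = (const struct file_handle *)fid->handle;
    size_t fixed = offsetof(struct fanotify_event_info_fid, handle) +
                   offsetof(struct file_handle, f_handle);
    size_t len = fid->hdr.len;
    size_t off;

    if (len < fixed)
        return NULL;
    off = fixed + fh->handle_bytes;   /* name starts after the opaque handle */
    if (off >= len)
        return NULL;
    if (memchr((const char *)fid + off, '\0', len - off) == NULL)
        return NULL;
    return (const char *)fid + off;
}

/*
 * Look up the current state of "name" under dirfd. The event says nothing
 * about whether the entry still exists, so ENOENT is an expected outcome.
 */
static enum entry_state query_entry(int dirfd, const char *name,
                                    struct stat *st)
{
    assert(name != NULL && st != NULL);
    if (fstatat(dirfd, name, st, AT_SYMLINK_NOFOLLOW) == 0)
        return ENTRY_PRESENT;
    return errno == ENOENT ? ENTRY_GONE : ENTRY_ERROR;
}
```

A sync tool would typically copy the entry on ENTRY_PRESENT and delete the
remote counterpart on ENTRY_GONE; the state returned is only valid at the
time of the lookup, not at the time of the event.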

commit 76a509dbc06fd58ec6636484f87896044cd99022
Author: Amir Goldstein <amir73il@xxxxxxxxx>
Date:   Fri Dec 13 11:58:02 2019 +0200

    fanotify: implement basic FAN_REPORT_FID_NAME logic

    Dirent events will be reported in one of two flavors depending on
    fanotify init flags:

    1. Dir fid info + mask that includes the specific event types and
       optional FAN_ONDIR flag.
    2. Dir fid info + name + mask that includes only FAN_WITH_NAME flag.

    To request the second event flavor, user will need to set the
    FAN_REPORT_FID_NAME flags in fanotify_init().

    The first flavor is already supported since kernel v5.1 and is
    intended to be used for watching directories in "batch mode" - user
    is notified when directory is changed and re-scans the directory
    content in response.  This event flavor is stored more compactly in
    event queue, so it is optimal for workloads with frequent directory
    changes (e.g. many files created/deleted).

    The second event flavor is intended to be used for watching large
    directories, where the cost of re-scan of the directory on every change
    is considered too high.  The watcher getting the event with the directory
    fid and entry name is expected to call fstatat(2) to query the content of
    the entry after the change.

    Events "on child" will behave similarly to dirent events, with a small
    difference - the first event flavor without name reports the child fid.
    The second flavor with name info reports the parent fid, because the
    name is relative to the parent directory.

    At the moment, event name info reporting is not implemented, so the
    FAN_REPORT_NAME flag is not yet valid as input to fanotify_init().

    Signed-off-by: Amir Goldstein <amir73il@xxxxxxxxx>
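
For contrast with the per-name lookup, the "batch mode" handling of the first
event flavor can be sketched as a plain re-scan of the changed directory.
This is only an illustration under stated assumptions: rescan_dir() is a
hypothetical helper, the dirfd is assumed to be already open for the
directory identified by the event, and a real tool would compare each entry
with its cached state instead of just counting.

```c
#define _GNU_SOURCE
#include <dirent.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Re-scan the directory behind dirfd after a "directory changed" event.
 * Returns the number of entries (excluding "." and ".."), or -1 if the
 * directory cannot be opened. Errors from readdir(3) are not distinguished
 * from end-of-directory in this sketch.
 */
static long rescan_dir(int dirfd)
{
    /* Open a private fd so the caller's dirfd offset is left untouched. */
    int fd = openat(dirfd, ".", O_RDONLY | O_DIRECTORY | O_CLOEXEC);
    struct dirent *de;
    long count = 0;
    DIR *dir;

    if (fd < 0)
        return -1;
    dir = fdopendir(fd);
    if (dir == NULL) {
        close(fd);
        return -1;
    }
    while ((de = readdir(dir)) != NULL) {
        if (strcmp(de->d_name, ".") == 0 || strcmp(de->d_name, "..") == 0)
            continue;
        count++;    /* placeholder for comparing against cached state */
    }
    closedir(dir);  /* also closes fd */
    return count;
}
```

The cost of this loop grows with the directory size on every event, which is
the motivation given above for the second flavor on large directories.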


