Re: Things I wish I'd known about Inotify

[CC += Al Viro & Linus, since they also discussed the point about
remote filesystems and /proc and /sys here:
http://thread.gmane.org/gmane.linux.file-systems/83641/focus=83713 .]

On 04/03/2014 05:38 PM, Eric W. Biederman wrote:
> "Michael Kerrisk (man-pages)" <mtk.manpages@xxxxxxxxx> writes:
> 
>> (To: == [the set of people I believe know a lot about inotify])
>>
>> Hello all,
>>
>> Lately, I've been studying the inotify API fairly thoroughly and
>> realized that there's a very big gap between knowing what the system
>> calls do versus using them to reliably and efficiently monitor the
>> state of a set of filesystem objects.
>>
>> With that in mind, I've drafted some substantial additions to the
>> inotify(7) man page. I would be very happy if folk on the "To:" list
>> could comment on the text below, since I believe you all have a lot of
>> practical experience with Inotify. (Of course, I also welcome comments
>> from anyone else.) In particular, I would like comments on the
>> accuracy of the various technical points (especially those relating to
>> matching up related IN_MOVED_FROM and IN_MOVED_TO events), as well as
>> pointers on any other pitfalls that the programmers should be wary of
>> that should be added to the page.
> 
> 
> Other pitfalls.
> 
> Inotify only report events that a user space program triggers through
> the filesystem API.  Which means inotify is limited for remote
> filesystems, and filesystems like proc and sys have no monitorable

Good point. I recently got CCed on that very point, but hadn't 
added it to the page. I've added it now. 

Revised text below, after incorporating changes from your comments and those
of Jan Kara.

Cheers,

Michael


   Limitations and caveats
       The inotify API provides no information about the user or process
       that triggered the inotify event.  In  particular,  there  is  no
       easy  way  for a process that is monitoring events via inotify to
       distinguish events that it triggers itself from  those  that  are
       triggered by other processes.

       Inotify  reports  only  events that a user-space program triggers
       through the filesystem API.  As  a  result,  it  does  not  catch
       remote  events  that occur on network filesystems.  (Applications
       must fall back to polling the filesystem to catch  such  events.)
       Furthermore, various virtual filesystems such as /proc, /sys, and
       /dev/pts are not monitorable with inotify.

       The inotify API identifies affected files by filename.   However,
       by  the time an application processes an inotify event, the file‐
       name may already have been deleted or renamed.

       The inotify API identifies events via watch descriptors.   It  is
       the  application's  responsibility  to cache a mapping (if one is
       needed) between watch descriptors and pathnames.  Be  aware  that
       directory renamings may affect multiple cached pathnames.
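
       As an illustration of the point about directory renamings, the
       following is a minimal sketch of a watch-descriptor-to-pathname
       cache.  The names wd_cache, cache_add(), cache_lookup(), and
       cache_rename_dir(), as well as the fixed capacities, are
       hypothetical and not part of the inotify API.  The sketch assumes
       that the application has already worked out the old and new
       pathnames of the renamed directory.

```c
#include <stdio.h>
#include <string.h>

#define CACHE_MAX 64      /* hypothetical fixed capacity */
#define PATH_BUF  4096    /* assumed upper bound on pathname length */

struct wd_entry {
    int  wd;              /* value returned by inotify_add_watch() */
    char path[PATH_BUF];  /* pathname that was used to add the watch */
};

struct wd_cache {
    struct wd_entry e[CACHE_MAX];
    int n;                /* number of entries in use */
};

/* Record a wd -> pathname pair. Returns 0, or -1 if it does not fit. */
int cache_add(struct wd_cache *c, int wd, const char *path)
{
    if (c->n >= CACHE_MAX || strlen(path) >= PATH_BUF)
        return -1;
    c->e[c->n].wd = wd;
    strcpy(c->e[c->n].path, path);
    c->n++;
    return 0;
}

/* Return the cached pathname for wd, or NULL if wd is unknown. */
const char *cache_lookup(const struct wd_cache *c, int wd)
{
    for (int i = 0; i < c->n; i++)
        if (c->e[i].wd == wd)
            return c->e[i].path;
    return NULL;
}

/* After old_dir has been renamed to new_dir, rewrite every cached
   pathname that is old_dir itself or lies beneath it.
   Returns the number of entries that were changed. */
int cache_rename_dir(struct wd_cache *c, const char *old_dir,
                     const char *new_dir)
{
    size_t olen = strlen(old_dir);
    int changed = 0;

    for (int i = 0; i < c->n; i++) {
        char *p = c->e[i].path;
        char tmp[PATH_BUF];
        int len;

        if (strncmp(p, old_dir, olen) != 0)
            continue;
        if (p[olen] != '\0' && p[olen] != '/')
            continue;     /* "/a/bc" is not beneath "/a/b" */

        len = snprintf(tmp, sizeof(tmp), "%s%s", new_dir, p + olen);
        if (len < 0 || (size_t) len >= sizeof(tmp))
            continue;     /* too long; a real program should rebuild */
        strcpy(p, tmp);
        changed++;
    }
    return changed;
}
```

       A linear array keeps the sketch short; an application with many
       watches would probably prefer a hash table keyed on the watch
       descriptor.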

       Inotify  monitoring  of  directories is not recursive: to monitor
       subdirectories under a directory, additional watches must be cre‐
       ated.   This  can  take a significant amount of time for large
       directory trees.

       If monitoring an entire directory subtree, and a new subdirectory
       is  created in that tree or an existing directory is renamed into
       that tree, be aware that by the time you create a watch  for  the
       new  subdirectory,  new  files  (and  subdirectories) may already
       exist inside the subdirectory.  Therefore, you may want  to  scan
       the  contents  of  the  subdirectory immediately after adding the
       watch (and, if desired, recursively add watches for any subdirec‐
       tories that it contains).
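
       The "add the watch first, then scan" ordering described above can
       be sketched roughly as follows.  The function name watch_tree(),
       the choice of event mask, and the 4096-byte pathname buffer are
       assumptions made for illustration.  A real program would also
       record each returned watch descriptor in its cache.

```c
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <sys/inotify.h>
#include <sys/stat.h>

/* Watch dir and, recursively, every subdirectory beneath it.
   Returns the number of watches added, or -1 if dir itself
   could not be watched. */
int watch_tree(int ifd, const char *dir)
{
    DIR *d;
    struct dirent *ent;
    int count;

    /* Add the watch BEFORE scanning: anything created after this
       point is reported as an event, and anything created earlier
       is found by the scan below. */
    if (inotify_add_watch(ifd, dir, IN_CREATE | IN_DELETE |
                          IN_MOVED_FROM | IN_MOVED_TO | IN_ONLYDIR) == -1)
        return -1;
    count = 1;

    d = opendir(dir);
    if (d == NULL)
        return count;     /* watched, but could not be scanned */

    while ((ent = readdir(d)) != NULL) {
        char child[4096];
        struct stat sb;
        int len, sub;

        if (strcmp(ent->d_name, ".") == 0 ||
            strcmp(ent->d_name, "..") == 0)
            continue;

        len = snprintf(child, sizeof(child), "%s/%s", dir, ent->d_name);
        if (len < 0 || (size_t) len >= sizeof(child))
            continue;     /* pathname too long for this sketch */

        /* lstat() so that symbolic links are not followed */
        if (lstat(child, &sb) == -1 || !S_ISDIR(sb.st_mode))
            continue;

        sub = watch_tree(ifd, child);
        if (sub > 0)      /* a subdirectory may vanish mid-scan */
            count += sub;
    }
    closedir(d);
    return count;
}
```

       Because the scan and the event stream overlap, the same object
       may be seen twice (once by the scan and once as an IN_CREATE
       event), so the caller should treat duplicates as harmless.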

       Note that the event queue can overflow.  In this case, events are
       lost.  Robust applications should handle the possibility of  lost
       events  gracefully.   For example, it may be necessary to rebuild
       part or all of the application cache.  (One simple, but  possibly
       expensive,  approach  is  to  close  the inotify file descriptor,
       empty the cache, create a new inotify file descriptor,  and  then
       re-create  watches  and cache entries for the objects to be moni‐
       tored.)
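
       A rough sketch of the simple recovery approach follows.  Queue
       overflow is reported as an event with the IN_Q_OVERFLOW bit set
       in its mask.  The function names buffer_has_overflow() and
       restart_inotify(), and the idea that the application keeps a
       plain array of pathnames to be monitored, are assumptions made
       for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/inotify.h>
#include <unistd.h>

/* Scan a buffer filled by read(2) on an inotify file descriptor.
   Returns 1 if it contains an IN_Q_OVERFLOW event, otherwise 0. */
int buffer_has_overflow(const char *buf, size_t len)
{
    size_t off = 0;

    while (off + sizeof(struct inotify_event) <= len) {
        struct inotify_event ev;

        /* Copy the fixed-size header to avoid unaligned access */
        memcpy(&ev, buf + off, sizeof(ev));
        if (ev.mask & IN_Q_OVERFLOW)
            return 1;
        off += sizeof(struct inotify_event) + ev.len;
    }
    return 0;
}

/* Discard the old inotify instance and start again from scratch.
   Returns the new file descriptor, or -1 on error. */
int restart_inotify(int old_fd, const char *const paths[], int npaths,
                    uint32_t mask)
{
    int fd;

    close(old_fd);        /* removes all watches and queued events */

    fd = inotify_init1(IN_CLOEXEC);
    if (fd == -1)
        return -1;

    for (int i = 0; i < npaths; i++) {
        /* A real program would store the returned watch descriptor
           in its (freshly emptied) cache; a path that has vanished
           in the meantime is simply skipped here. */
        if (inotify_add_watch(fd, paths[i], mask) == -1)
            continue;
    }
    return fd;
}
```

       After restarting, the application still needs to rescan the
       monitored objects, since it cannot know which events were lost.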

   Dealing with rename() events
       As noted above, the IN_MOVED_FROM and IN_MOVED_TO event pair that
       is  generated  by  rename(2)  can  be matched up via their shared
       cookie value.  However, the task of matching has some challenges.

       These two events are usually  consecutive  in  the  event  stream
       available  when  reading  from the inotify file descriptor.  How‐
       ever, this is not guaranteed.  If multiple processes are trigger‐
       ing  events  for  monitored  objects, then (on rare occasions) an
       arbitrary  number  of  other  events  may  appear   between   the
       IN_MOVED_FROM and IN_MOVED_TO events.

       Matching  up  the IN_MOVED_FROM and IN_MOVED_TO event pair gener‐
       ated by rename(2) is thus inherently racy.  (Don't forget that if
       an  object is renamed outside of a monitored directory, there may
       not even be an IN_MOVED_TO event.)  Heuristic  approaches  (e.g.,
       assume the events are always consecutive) can be used to ensure a
       match in most cases, but will inevitably miss some cases, causing
       the  application  to  perceive  the IN_MOVED_FROM and IN_MOVED_TO
       events as being unrelated.  If watch  descriptors  are  destroyed
       and  re-created as a result, then those watch descriptors will be
       inconsistent with the watch descriptors in  any  pending  events.
       (Re-creating the inotify file descriptor and rebuilding the cache
       may be useful to deal with this scenario.)

       Applications should also  allow  for  the  possibility  that  the
       IN_MOVED_FROM event was the last event that could fit in the buf‐
       fer returned by the current call to read(2), and the accompanying
       IN_MOVED_TO event might be fetched only on the next read(2).
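
       One way to allow for both intervening events and a pair that is
       split across two read(2) calls is to keep the unmatched
       IN_MOVED_FROM in state that outlives the read loop.  The
       following is a minimal sketch under several assumptions: the
       names pending_move, move_result, and track_move() are
       hypothetical; only one outstanding IN_MOVED_FROM is remembered (a
       second one overwrites the first); and the timeout that a real
       program needs in order to decide that no IN_MOVED_TO is coming is
       left out.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/inotify.h>

struct pending_move {
    int      active;      /* nonzero while awaiting IN_MOVED_TO */
    uint32_t cookie;      /* cookie of the remembered IN_MOVED_FROM */
    int      wd;          /* watch descriptor of the source directory */
    char     name[256];   /* old filename (assumes NAME_MAX of 255) */
};

enum move_result {
    MOVE_NONE,            /* event is not part of a rename */
    MOVE_PENDING,         /* IN_MOVED_FROM remembered */
    MOVE_MATCHED,         /* IN_MOVED_TO matched the remembered cookie */
    MOVE_UNMATCHED_TO     /* IN_MOVED_TO with no known partner */
};

/* Feed one event at a time. Because the state lives in *pm rather
   than in the read loop, matching still works when the two events
   arrive in different read(2) calls or have other events between
   them. */
enum move_result track_move(struct pending_move *pm,
                            const struct inotify_event *ev)
{
    if (ev->mask & IN_MOVED_FROM) {
        pm->active = 1;
        pm->cookie = ev->cookie;
        pm->wd = ev->wd;
        snprintf(pm->name, sizeof(pm->name), "%s",
                 ev->len > 0 ? ev->name : "");
        return MOVE_PENDING;
    }

    if (ev->mask & IN_MOVED_TO) {
        if (pm->active && pm->cookie == ev->cookie) {
            pm->active = 0;   /* pm->wd and pm->name hold the source */
            return MOVE_MATCHED;
        }
        return MOVE_UNMATCHED_TO;   /* e.g., moved in from elsewhere */
    }

    return MOVE_NONE;
}
```

       On MOVE_MATCHED, the caller can treat pm->wd and pm->name as the
       old location and the current event as the new one.  A
       pending_move that stays active for too long most likely means
       that the object was renamed outside the monitored directories.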


-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



