Fwd: [LSF/MM TOPIC] Filesystem Change Journal API


Change Journal [1] (a.k.a. USN) is a popular feature of NTFS v3, used by
backup and indexing applications to monitor changes to a file system
in a reliable, durable and scalable manner.

Linux is lagging behind Windows w.r.t. those capabilities by two decades,
and it is not for lack of demand for the feature. I dare to make a
wild guess that there are many more file servers nowadays running on
Linux than there are file servers running on Windows, and the scale of
changes to track has only increased over the years.

At LSF/MM 2017, I presented "fanotify super block watch" [2], which
addresses the scalability issues of inotify when tracking changes over
millions of directories. This work is running in production now, but is
not yet ready for upstream submission.

This year, I would like to discuss solutions to address the reliability
and durability aspects of Linux filesystem change tracking.

Some Linux filesystems are already journaling everything (e.g. ubifs),
so providing the Change Journal feature to applications is probably just
a matter of providing an API to retrieve the latest USN and enumerate
changes within a USN range.
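
As a rough illustration of the two operations mentioned above (retrieve
the latest USN, enumerate changes within a USN range), here is a minimal
userspace sketch in C. All names (struct cj_record, struct cj_journal,
cj_latest_usn, cj_enumerate) are hypothetical and do not correspond to any
existing kernel or filesystem interface; the journal is simply modelled as
an in-memory array sorted by USN.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical change record; not an existing kernel structure. */
struct cj_record {
	uint64_t usn;   /* monotonically increasing sequence number */
	uint64_t ino;   /* inode that changed */
	uint32_t mask;  /* what changed (illustrative bit mask) */
};

/* Hypothetical journal: records are stored in ascending USN order. */
struct cj_journal {
	const struct cj_record *recs;
	size_t nr;
};

/* Return the latest USN in the journal, or 0 if it is empty. */
static uint64_t cj_latest_usn(const struct cj_journal *j)
{
	assert(j != NULL);
	return j->nr ? j->recs[j->nr - 1].usn : 0;
}

/*
 * Copy up to max records with first < usn <= last into out.
 * Returns the number of records copied.
 */
static size_t cj_enumerate(const struct cj_journal *j, uint64_t first,
			   uint64_t last, struct cj_record *out, size_t max)
{
	size_t n = 0;

	assert(j != NULL && out != NULL);
	for (size_t i = 0; i < j->nr && n < max; i++) {
		uint64_t usn = j->recs[i].usn;

		if (usn > last)
			break;          /* sorted: nothing more in range */
		if (usn > first)
			out[n++] = j->recs[i];
	}
	return n;
}
```

A backup or indexing application would remember the last USN it has
processed and, after a restart, ask for the half-open range
(saved USN, latest USN] to catch up on what it missed.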

Some Linux filesystems store USN-like information in metadata, but it is
not exposed to userspace in a standard way that could be used by change
tracking applications. For example, XFS stores an LSN (transaction id) in
inodes, so it should be possible to enumerate inodes that were changed
since the last queried LSN value.
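
To make the "changed since LSN" query concrete, here is a small sketch in
C. It is only an assumption of how such a query could behave: struct
inode_info and changed_since are hypothetical names, the table is a plain
in-memory array, and nothing here reflects the actual XFS on-disk format
or any existing ioctl.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory view of an inode; not the XFS on-disk format. */
struct inode_info {
	uint64_t ino;
	uint64_t lsn;   /* sequence number of the last change to this inode */
};

/*
 * Collect the inode numbers whose lsn is newer than since_lsn.
 * Returns the number of inode numbers written to out (at most max).
 */
static size_t changed_since(const struct inode_info *tbl, size_t nr,
			    uint64_t since_lsn, uint64_t *out, size_t max)
{
	size_t n = 0;

	assert(tbl != NULL && out != NULL);
	for (size_t i = 0; i < nr && n < max; i++) {
		if (tbl[i].lsn > since_lsn)
			out[n++] = tbl[i].ino;
	}
	return n;
}
```

Note that in this sketch the cost of a query is proportional to the number
of inodes scanned, not to the number of changes, which is one way it
differs from enumerating a journal.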

A more generic approach, for filesystems with no USN-like information,
would be to provide an external change journal facility, much like what
JBD2 does, but not at the block level. This facility could hook in as an
fsnotify backend, consuming filesystem notifications, and provide record
and enumerate capabilities for filesystem operations.

With the external change journal approach, care would have to be taken to
account for the fact that filesystem changes become persistent later than
the time they are reported to fsnotify, so at least a transaction commit
event (with USN) would need to be reported to fsnotify.
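
The ordering problem described above can be sketched as follows. This is
only an illustration under assumed semantics: the event kinds CJ_CHANGE and
CJ_COMMIT, struct cj_event, struct cj_state and cj_feed are all
hypothetical, and no such fsnotify events exist today. The idea is that
changes are held as "pending" until a commit event reports a USN that
covers them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CJ_MAX_PENDING 64

/* Hypothetical event kinds; no such fsnotify events exist today. */
enum cj_kind { CJ_CHANGE, CJ_COMMIT };

struct cj_event {
	enum cj_kind kind;
	uint64_t usn;   /* for CJ_COMMIT: highest USN that is now persistent */
};

struct cj_state {
	uint64_t pending[CJ_MAX_PENDING]; /* reported but not yet committed */
	size_t nr_pending;
	uint64_t durable_usn;             /* everything <= this is persistent */
};

/* Feed one event into the tracker. Returns false if the pending list is full. */
static bool cj_feed(struct cj_state *s, const struct cj_event *ev)
{
	assert(s != NULL && ev != NULL);

	if (ev->kind == CJ_CHANGE) {
		if (s->nr_pending == CJ_MAX_PENDING)
			return false;
		s->pending[s->nr_pending++] = ev->usn;
		return true;
	}

	/* CJ_COMMIT: drop pending changes covered by the commit. */
	size_t keep = 0;
	for (size_t i = 0; i < s->nr_pending; i++) {
		if (s->pending[i] > ev->usn)
			s->pending[keep++] = s->pending[i];
	}
	s->nr_pending = keep;
	if (ev->usn > s->durable_usn)
		s->durable_usn = ev->usn;
	return true;
}
```

After a crash, a consumer built this way would only trust changes up to
durable_usn and would have to treat anything still pending as possibly
lost.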

The user API to retrieve change journal information should be standard,
whether the change journal is a built-in filesystem feature or provided by
the external change journal. The fanotify API is a good candidate for a
change journal API, because it already defines a standard way of reporting
filesystem changes. Naturally, the API would have to be extended to cater
to the needs of a change journal API and would require the user to
explicitly opt in to the new API (e.g. FAN_CLASS_CHANGE_JOURNAL).

It is possible (?) that networking filesystems could also make use of a
kernel change journal API to refresh client caches after a server reboot
in a more efficient and scalable manner.

[1] https://en.wikipedia.org/wiki/USN_Journal
[2] https://lwn.net/Articles/718802/


