Re: [RFC 1/4] fs: Add generic file system event notifications

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 04/17/2015 01:31 PM, Jan Kara wrote:
> On Wed 15-04-15 09:15:44, Beata Michalska wrote:
>> Introduce configurable generic interface for file
>> system-wide event notifications to provide file
>> systems with a common way of reporting any potential
>> issues as they emerge.
>>
>> The notifications are to be issued through generic
>> netlink interface, by a dedicated, for file system
>> events, multicast group. The file systems might as
>> well use this group to send their own custom messages.
>>
>> The events have been split into four base categories:
>> information, warnings, errors and threshold notifications,
>> with some very basic event types like running out of space
>> or file system being remounted as read-only.
>>
>> Threshold notifications have been included to allow
>> triggering an event whenever the amount of free space
>> drops below a certain level - or levels to be more precise
>> as two of them are being supported: the lower and the upper
>> range. The notifications work both ways: once the threshold
>> level has been reached, an event shall be generated whenever
>> the number of available blocks goes up again re-activating
>> the threshold.
>>
>> The interface has been exposed through a vfs. Once mounted,
>> it serves as an entry point for the set-up where one can
>> register for particular file system events.
>>
>> Signed-off-by: Beata Michalska <b.michalska@xxxxxxxxxxx>
>   Thanks for the patches! Some comments are below.
> 
>> ---
>>  Documentation/filesystems/events.txt |  254 +++++++++++
>>  fs/Makefile                          |    1 +
>>  fs/events/Makefile                   |    6 +
>>  fs/events/fs_event.c                 |  775 ++++++++++++++++++++++++++++++++++
>>  fs/events/fs_event.h                 |   27 ++
>>  fs/events/fs_event_netlink.c         |   94 +++++
>>  fs/namespace.c                       |    1 +
>>  include/linux/fs.h                   |    6 +-
>>  include/linux/fs_event.h             |   69 +++
>>  include/uapi/linux/fs_event.h        |   62 +++
>>  include/uapi/linux/genetlink.h       |    1 +
>>  net/netlink/genetlink.c              |    7 +-
>>  12 files changed, 1301 insertions(+), 2 deletions(-)
>>  create mode 100644 Documentation/filesystems/events.txt
>>  create mode 100644 fs/events/Makefile
>>  create mode 100644 fs/events/fs_event.c
>>  create mode 100644 fs/events/fs_event.h
>>  create mode 100644 fs/events/fs_event_netlink.c
>>  create mode 100644 include/linux/fs_event.h
>>  create mode 100644 include/uapi/linux/fs_event.h
>>
>> diff --git a/Documentation/filesystems/events.txt b/Documentation/filesystems/events.txt
>> new file mode 100644
>> index 0000000..c85dd88
>> --- /dev/null
>> +++ b/Documentation/filesystems/events.txt
>> @@ -0,0 +1,254 @@
>> +
>> +	Generic file system event notification interface
>> +
>> +Document created 09 April 2015 by Beata Michalska <b.michalska@xxxxxxxxxxx>
>> +
>> +1. The reason behind:
>> +=====================
>> +
>> +There are many corner cases when things might get messy with the filesystems.
>> +And it is not always obvious what and when went wrong. Sometimes you might
>> +get some subtle hints that there is something going on - but by the time
>> +you realise it, it might be too late as you are already out-of-space
>> +or the filesystem has been remounted as read-only (i.e.). The generic
>> +interface for the filesystem events fills the gap by providing a rather
>> +easy way of real-time notifications triggered whenever something intreseting
>> +happens, allowing filesystems to report events in a common way, as they occur.
>> +
>> +2. How does it work:
>> +====================
>> +
>> +The interface itself has been exposed as fstrace-type Virtual File System,
>> +primarily to ease the process of setting up the configuration for the file
>> +system notifications. So for starters it needs to get mounted (obviously):
>> +
>> +	mount -t fstrace none /sys/fs/events
>> +
>> +This will unveil the single fstrace filesystem entry - the 'config' file,
>> +through which the notification are being set-up.
>> +
>> +Activating notifications for particular filesystem is as straightforward
>> +as writing into the 'config' file. Note that by default all events despite
>> +the actual filesystem type are being disregarded.
>   Is there a reason to have a special filesystem for this? Do you expect
> extending it by (many) more files? Why not just creating a file in sysfs or
> something like that?

No particular reason here - just for possible future extension if needed.
I'm totally fine with having a single sysfs entry.

> 
>> +Synopsis of config:
>> +------------------
>> +
>> +	MOUNT EVENT_TYPE [L1] [L2]
>> +
>> + MOUNT      : the filesystem's mount point
>   I'm not quite decided but is mountpoint really the right thing to pass
> via the interface? They aren't unique (filesystem can be mounted in
> multiple places) and more importantly can change over time. So won't it be
> better to pass major:minor over the interface? These are stable, unique to
> the filesystem, and userspace can easily get them by calling stat(2) on the
> desired path (or directly from /proc/self/mountinfo). That could be also
> used as an fs identifier instead of assigned ID (and thus we won't need
> those events about creation of new trace which look somewhat strange to
> me).
> 
Even if a given filesystem is being mounted in many places this will come
down to single super_block - the interface will add trace for the first mount
point. This is just to ease the usage: internally a particular trace is 
associated with a super_block. 

> OTOH using major:minor may have issues in container world where processes
> could watch events from filesystems inaccessible to the container if they
> guess the device number. So maybe we could use 'path' when creating new
> trace but I'd still like to use the device number internally and for all
> outgoing communication because of above mentioned problems with
> mountpoints.

Alright then, so dropping the idea of announcing new trace (with assigned id)
and switching to using the major:minor numbers. Sounds OK to me.

> 
>> + EVENT_TYPE : type of events to be enabled: info,warn,err,thr;
>> +              at least one type needs to be specified;
>> +              note the comma delimiter and lack of spaces between
>> +	      those options
>> + L1         : the threshold limit - lower range
>> + L2         : the threshold limit - upper range
>> + 	      case enabling threshold notifications the lower level is
>> +	      mandatory, whereas the upper one remains optional;
>> +	      note though, that as those refer to the number of available
>> +	      blocks, the lower level needs to be higher than the upper one
>> +
>> +Sample request could look like the follwoing:
>> +
>> + echo /sample/mount/point warn,err,thr 710000 500000 > /sys/fs/events/config
>> +
>> +Multiple request might be specified provided they are separated with semicolon.
>   Is this necessary? It somewhat complicates syntax and parsing in kernel
> and I don't see a need for that. I'd prefer to keep the interface as simple
> as possible.
> 

This is not necessary but could ease the usage - i.e. through scripting: to specify
multiple traces and register them in one go. 

> Also I think that we should make it clear that each event type has
> different set of arguments. For threshold events they'll be L1 & L2, for
> other events there may be no arguments, for other events maybe something
> else...
> 

Currently only the threshold events use arguments -  not sure what arguments
could be used for the remaining notifications. But any suggestions are welcomed.

> ...
>> +static const match_table_t fs_etypes = {
>> +	{ FS_EVENT_INFO,    "info"  },
>> +	{ FS_EVENT_WARN,    "warn"  },
>> +	{ FS_EVENT_THRESH,  "thr"   },
>> +	{ FS_EVENT_ERR,     "err"   },
>> +	{ 0, NULL },
>> +};
>   Why are there these generic message types? Threshold messages make good
> sense to me. But not so much the rest. If they don't have a clear meaning,
> it will be a mess. So I also agree with a message like - "filesystem has
> trouble, you should probably unmount and run fsck" - that's fine. But
> generic "info" or "warning" doesn't really carry any meaning on its own and
> thus seems pretty useless to me. To explain a bit more, AFAIU this
> shouldn't be a generic logging interface where something like severity
> makes sense but rather a relatively specific interface notifying about
> events in filesystem userspace should know about so I expect relatively low
> number of types of events, not tens or even hundreds...
> 
> 								Honza

Getting rid of those would simplify the configuration part, indeed.
So we would be left with 'generic' and threshold events.
I guess I've overdone this part.

Thanks for Your comments so far.

BR
Beata





--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]