Re: [PATCH 06/25] fsnotify: Attach marks to object via dedicated head structure

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Feb 6, 2017 at 9:44 PM, Miklos Szeredi <miklos@xxxxxxxxxx> wrote:
> On Wed, Feb 1, 2017 at 11:44 AM, Jan Kara <jack@xxxxxxx> wrote:
>> Currently notification marks are attached to object (inode or vfsmnt) by
>> a hlist_head in the object. The list is also protected by a spinlock in
>> the object. So while there is any mark attached to the list of marks,
>> the object must be pinned in memory (and thus e.g. last iput() deleting
>> inode cannot happen). Also for list iteration in fsnotify() to work, we
>> must hold fsnotify_mark_srcu lock so that mark itself and
>> mark->obj_list.next cannot get freed. Thus we are required to wait for
>> response to fanotify events from userspace process with
>> fsnotify_mark_srcu lock held. That causes issues when userspace process
>> is buggy and does not reply to some event - basically the whole
>> notification subsystem gets eventually stuck.
>>
>> So to be able to drop fsnotify_mark_srcu lock while waiting for
>> response, we have to pin the mark in memory and make sure it stays in
>> the object list (as removing the mark waiting for response could lead to
>> lost notification events for groups later in the list). However we don't
>> want inode reclaim to block on such mark as that would lead to system
>> just locking up elsewhere.
>>
>> This commit tries to pave a way towards solving these conflicting
>> lifetime needs. Instead of anchoring the list of marks directly in the
>> object, we anchor it in a dedicated structure (fsnotify_mark_connector)
>> and just point to that structure from the object. In the following patch
>> we will also protect the list by a spinlock contained in that structure.
>> With this, we will be able to detach notification marks from object
>> without having to modify the list itself.
>
> This is still a bit big to review easily.  I'd suggest further splitup into:
>
> - move hlist_head from inode/mnt into connector
> - move inode/mnt ptr from mark into connector (no refcounting changes)
> - inode refcounting cleanup
>
> More comments inline.
>

...

>> diff --git a/fs/notify/fsnotify.c b/fs/notify/fsnotify.c
>> index b41515d3f081..421c7431a16e 100644
>> --- a/fs/notify/fsnotify.c
>> +++ b/fs/notify/fsnotify.c
>> @@ -193,6 +193,7 @@ int fsnotify(struct inode *to_tell, __u32 mask, const void *data, int data_is,
>>         struct hlist_node *inode_node = NULL, *vfsmount_node = NULL;
>>         struct fsnotify_mark *inode_mark = NULL, *vfsmount_mark = NULL;
>>         struct fsnotify_group *inode_group, *vfsmount_group;
>> +       struct fsnotify_mark_connector *inode_conn, *vfsmount_conn;
>>         struct mount *mnt;
>>         int idx, ret = 0;
>>         /* global tests shouldn't care about events on child only the specific event */
>> @@ -210,8 +211,8 @@ int fsnotify(struct inode *to_tell, __u32 mask, const void *data, int data_is,
>>          * SRCU because we have no references to any objects and do not
>>          * need SRCU to keep them "alive".
>>          */
>> -       if (hlist_empty(&to_tell->i_fsnotify_marks) &&
>> -           (!mnt || hlist_empty(&mnt->mnt_fsnotify_marks)))
>> +       if (!to_tell->i_fsnotify_marks &&
>> +           (!mnt || !mnt->mnt_fsnotify_marks))
>
> The checks against hlist_empty() could still be useful to optimize
> away the memory barrier imposed by srcu_read_lock() in case there were
> marks on the object but not anymore (i.e. the connector is non-NULL,
> but the hlist is empty).
>

Is this really a case worth optimizing?
Can an empty connector stay alive more then a few grace periods?

Not that it is hard to check hlist_empty(). I'm just wondering, if this would be
code with no visible gain.



[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]
  Powered by Linux