Re: [GIT PULL] Fsnotify cleanups

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sat, Jun 9, 2018 at 9:57 AM, Amir Goldstein <amir73il@xxxxxxxxx> wrote:
> On Fri, Jun 8, 2018 at 11:34 PM, Linus Torvalds
> <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>> On Fri, Jun 8, 2018 at 6:27 AM Jan Kara <jack@xxxxxxx> wrote:
>>>
>>> The alignment could be fixed by having
>>>
>>> struct inode {
>>>         ...
>>>         struct fsnotify_obj i_fsnotify __attribute__ ((aligned(sizeof(void *))));
>>>         ...
>>> }
>>>
>>> but that's too ugly even for my twisted taste. And before someone asks,
>>> adding aligned attribute to the definition of fsnotify_obj structure makes
>>> it grow the padding at the end again.
>>
>> Yeah, I don't think there's any way to say "this can't be an array
>> member, don't pad the size out to the alignment".
>>
>> I do wonder if perhaps "struct fsnotify_obj" could actually have a
>> mask and a 32-bit object ID instead, which would (a) avoid the
>> alignment issue and (b) actually shrink the structure onm 64-bit.
>>
>> Obviously you'd then have to look up the pointer from the object ID
>> (presumably using a hash, but maybe it would use the radix tree of
>> idr_alloc() or something).
>>
>> I haven't looked at how often those pointers actually get looked up.
>> If the 'mask' makes it fairly rare, then the extra indirection might
>> be ok. I suspect it's not. But I thought I'd mention it as a possible
>> approach.
>
> As a matter of fact, the pointer is checked to be non zero before
> accessing the mask:
>
> 7c49b8616460 fs/notify: optimize inotify/fsnotify code for unwatched files
>
> So the mask is not really needed to optimize away unwatched inodes.
> I guess the mask is needed to optimize away use cases like frequent
> fsnotify_access() calls on inodes that are watched for modification only,
> but not sure that is worth optimizing.
>
>>
>> The other approach is - as you say - to move 'mask' to be behind the
>> pointer. But if that causes a lot of extra pointer chasing that mask
>> would normally avoid, that could be really expensive too. We see that
>> with the security objects: the cost of looking up whatever is behind
>> i_security is really nasty, because every inode gets allocated a
>> private security structure, so it's absolutely horrendous for cache
>> effects (no sharing).
>
> Besides pointer chasing, there is also srcu_read_lock.
>
>>
>> But I suspect that the behavior of i_fsnotify_marks is very different
>> from i_security, in that most inodes probably don't have marks at all.
>> No? Yes? So you might not have the nasty case of "almost all inode
>> access ends up following that pointer and it's horrible in the cache".
>>
>
> Yes and No :-)
> Yes, most inodes don't have marks at all,
> but, every fsnotify() call may need to follow mnt->mnt_fsnotify.marks
> if mount is watched and there is no mnt->mnt_fsnotify.mask in fsnotify_obj.
>
> Sure, we don't need to worry about 32bits in struct mount, so we could
> have a different attach method for inodes and mounts, but the whole
> purpose of this cleanup is to make the code more generic in handing
> and attaching of marks to inode/mounts.
>
> How about if we move mask into connector, but also leave a shadow mask,
> in an "inherited" struct i.e.:
>
> /* struct to embed in objects, which marks can be attached to */
> struct fsnotify_obj {
>        struct fsnotify_mark_connector __rcu *marks;
> };
>
> /* struct to embed in objects, which marks can be attached to with
> shadow mask */
> struct fsnotify_obj_mask {
>        struct fsnotify_obj obj;
>        /* all events this object cares about */
>        __u32 mask;
> };
>
> We embed fsnotify_obj in struct inode and fsnotify_obj_mask in struct mount.
>
> The only code that needs to test the shadow mask before following into
> connector is the optimization code in start of fsnotify() which was not
> generalized anyway and where the mask optimization is more important
> for mount watches.
>

No. That's not true, there is also fsnotify_inode_watches_children() and
__fsnotify_parent(). But I think the only bits in the mask that really matter
for optimizing inode watches are FS_ACCESS and FS_EVENT_ON_CHILD
and those could be encoded in lower bits of connector pointer.
connector is pointer is always dereferenced via fsnotify_grab_connector(),
so it's easy to mask out those bits.

> The only code that sets the mask is __fsnotify_recalc_mask(), needs to
> set the shadow mask, if conn->type != FSNOTIFY_OBJ_TYPE_INODE.
> We already special case FSNOTIFY_OBJ_TYPE_INODE for pinning inode
> in several places, so there is nothing new here.
>

So instead, __fsnotify_recalc_mask() needs to set the bits on connector
pointer in FSNOTIFY_OBJ_TYPE_INODE case.

I'll try to sketch some patches.

Thanks,
Amir.



[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux