On Fri, Jun 8, 2018 at 11:34 PM, Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote: > On Fri, Jun 8, 2018 at 6:27 AM Jan Kara <jack@xxxxxxx> wrote: >> >> The alignment could be fixed by having >> >> struct inode { >> ... >> struct fsnotify_obj i_fsnotify __attribute__ ((aligned(sizeof(void *)))); >> ... >> } >> >> but that's too ugly even for my twisted taste. And before someone asks, >> adding aligned attribute to the definition of fsnotify_obj structure makes >> it grow the padding at the end again. > > Yeah, I don't think there's any way to say "this can't be an array > member, don't pad the size out to the alignment". > > I do wonder if perhaps "struct fsnotify_obj" could actually have a > mask and a 32-bit object ID instead, which would (a) avoid the > alignment issue and (b) actually shrink the structure onm 64-bit. > > Obviously you'd then have to look up the pointer from the object ID > (presumably using a hash, but maybe it would use the radix tree of > idr_alloc() or something). > > I haven't looked at how often those pointers actually get looked up. > If the 'mask' makes it fairly rare, then the extra indirection might > be ok. I suspect it's not. But I thought I'd mention it as a possible > approach. As a matter of fact, the pointer is checked to be non zero before accessing the mask: 7c49b8616460 fs/notify: optimize inotify/fsnotify code for unwatched files So the mask is not really needed to optimize away unwatched inodes. I guess the mask is needed to optimize away use cases like frequent fsnotify_access() calls on inodes that are watched for modification only, but not sure that is worth optimizing. > > The other approach is - as you say - to move 'mask' to be behind the > pointer. But if that causes a lot of extra pointer chasing that mask > would normally avoid, that could be really expensive too. We see that > with the security objects: the cost of looking up whatever is behind > i_security is really nasty, because every inode gets allocated a > private security structure, so it's absolutely horrendous for cache > effects (no sharing). Besides pointer chasing, there is also srcu_read_lock. > > But I suspect that the behavior of i_fsnotify_marks is very different > from i_security, in that most inodes probably don't have marks at all. > No? Yes? So you might not have the nasty case of "almost all inode > access ends up following that pointer and it's horrible in the cache". > Yes and No :-) Yes, most inodes don't have marks at all, but, every fsnotify() call may need to follow mnt->mnt_fsnotify.marks if mount is watched and there is no mnt->mnt_fsnotify.mask in fsnotify_obj. Sure, we don't need to worry about 32bits in struct mount, so we could have a different attach method for inodes and mounts, but the whole purpose of this cleanup is to make the code more generic in handing and attaching of marks to inode/mounts. How about if we move mask into connector, but also leave a shadow mask, in an "inherited" struct i.e.: /* struct to embed in objects, which marks can be attached to */ struct fsnotify_obj { struct fsnotify_mark_connector __rcu *marks; }; /* struct to embed in objects, which marks can be attached to with shadow mask */ struct fsnotify_obj_mask { struct fsnotify_obj obj; /* all events this object cares about */ __u32 mask; }; We embed fsnotify_obj in struct inode and fsnotify_obj_mask in struct mount. The only code that needs to test the shadow mask before following into connector is the optimization code in start of fsnotify() which was not generalized anyway and where the mask optimization is more important for mount watches. The only code that sets the mask is __fsnotify_recalc_mask(), needs to set the shadow mask, if conn->type != FSNOTIFY_OBJ_TYPE_INODE. We already special case FSNOTIFY_OBJ_TYPE_INODE for pinning inode in several places, so there is nothing new here. Thanks, Amir.