On Sat, Dec 19, 2020 at 12:11 AM Shakeel Butt <shakeelb@xxxxxxxxxx> wrote: > > Currently the fs sysctl inotify/max_user_instances is used to limit the > number of inotify instances on the system. For systems running multiple > workloads, the per-user namespace sysctl max_inotify_instances can be > used to further partition inotify instances. However there is no easy > way to set a sensible system level max limit on inotify instances and > further partition it between the workloads. It is much easier to charge > the underlying resource (i.e. memory) behind the inotify instances to > the memcg of the workload and let their memory limits limit the number > of inotify instances they can create. Not that I have a problem with this patch, but what problem does it try to solve? Are you concerned of users depleting system memory by creating userns's and allocating 128 * (struct fsnotify_group) at a time? IMO, that is not what max_user_instances was meant to protect against. There are two reasons I can think of to limit user instances: 1. Pre-memgc, user allocation of events is limited to <max_user_instances>*<max_queued_events> 2. Performance penalty. User can place <max_user_instances> watches on the same "hot" directory, that will cause any access to that directory by any task on the system to pay the penalty of traversing <max_user_instances> marks and attempt to queue <max_user_instances> events. That cost, including <max_user_instances> inotify_merge() loops could be significant #1 is not a problem anymore, since you already took care of accounting events to the user's memcg. #2 is not addressed by your patch. > > Signed-off-by: Shakeel Butt <shakeelb@xxxxxxxxxx> > --- > fs/notify/group.c | 14 ++++++++++++-- > fs/notify/inotify/inotify_user.c | 5 +++-- > include/linux/fsnotify_backend.h | 2 ++ > 3 files changed, 17 insertions(+), 4 deletions(-) > > diff --git a/fs/notify/group.c b/fs/notify/group.c > index a4a4b1c64d32..fab3cfdb4d9e 100644 > --- a/fs/notify/group.c > +++ b/fs/notify/group.c > @@ -114,11 +114,12 @@ EXPORT_SYMBOL_GPL(fsnotify_put_group); > /* > * Create a new fsnotify_group and hold a reference for the group returned. > */ > -struct fsnotify_group *fsnotify_alloc_group(const struct fsnotify_ops *ops) > +struct fsnotify_group *fsnotify_alloc_group_gfp(const struct fsnotify_ops *ops, > + gfp_t gfp) > { > struct fsnotify_group *group; > > - group = kzalloc(sizeof(struct fsnotify_group), GFP_KERNEL); > + group = kzalloc(sizeof(struct fsnotify_group), gfp); > if (!group) > return ERR_PTR(-ENOMEM); > > @@ -139,6 +140,15 @@ struct fsnotify_group *fsnotify_alloc_group(const struct fsnotify_ops *ops) > > return group; > } > +EXPORT_SYMBOL_GPL(fsnotify_alloc_group_gfp); > + > +/* > + * Create a new fsnotify_group and hold a reference for the group returned. > + */ > +struct fsnotify_group *fsnotify_alloc_group(const struct fsnotify_ops *ops) > +{ > + return fsnotify_alloc_group_gfp(ops, GFP_KERNEL); > +} > EXPORT_SYMBOL_GPL(fsnotify_alloc_group); > > int fsnotify_fasync(int fd, struct file *file, int on) > diff --git a/fs/notify/inotify/inotify_user.c b/fs/notify/inotify/inotify_user.c > index 59c177011a0f..7cb528c6154c 100644 > --- a/fs/notify/inotify/inotify_user.c > +++ b/fs/notify/inotify/inotify_user.c > @@ -632,11 +632,12 @@ static struct fsnotify_group *inotify_new_group(unsigned int max_events) > struct fsnotify_group *group; > struct inotify_event_info *oevent; > > - group = fsnotify_alloc_group(&inotify_fsnotify_ops); > + group = fsnotify_alloc_group_gfp(&inotify_fsnotify_ops, > + GFP_KERNEL_ACCOUNT); > if (IS_ERR(group)) > return group; > > - oevent = kmalloc(sizeof(struct inotify_event_info), GFP_KERNEL); > + oevent = kmalloc(sizeof(struct inotify_event_info), GFP_KERNEL_ACCOUNT); > if (unlikely(!oevent)) { > fsnotify_destroy_group(group); > return ERR_PTR(-ENOMEM); Any reason why you did not include fanotify in this patch? Thanks, Amir.