On Tue, Jun 19, 2018 at 12:20 AM Amir Goldstein <amir73il@xxxxxxxxx> wrote: > > On Tue, Jun 19, 2018 at 8:13 AM, Shakeel Butt <shakeelb@xxxxxxxxxx> wrote: > > A lot of memory can be consumed by the events generated for the huge or > > unlimited queues if there is either no or slow listener. This can cause > > system level memory pressure or OOMs. So, it's better to account the > > fsnotify kmem caches to the memcg of the listener. > > > > There are seven fsnotify kmem caches and among them allocations from > > dnotify_struct_cache, dnotify_mark_cache, fanotify_mark_cache and > > inotify_inode_mark_cachep happens in the context of syscall from the > > listener. So, SLAB_ACCOUNT is enough for these caches. > > > > The objects from fsnotify_mark_connector_cachep are not accounted as they > > are small compared to the notification mark or events and it is unclear > > whom to account connector to since it is shared by all events attached to > > the inode. > > > > The allocations from the event caches happen in the context of the event > > producer. For such caches we will need to remote charge the allocations > > to the listener's memcg. Thus we save the memcg reference in the > > fsnotify_group structure of the listener. > > > > This patch has also moved the members of fsnotify_group to keep the size > > same, at least for 64 bit build, even with additional member by filling > > the holes. > > > > Signed-off-by: Shakeel Butt <shakeelb@xxxxxxxxxx> > > Acked-by: Jan Kara <jack@xxxxxxx> > > Cc: Michal Hocko <mhocko@xxxxxxxxxx> > > Cc: Amir Goldstein <amir73il@xxxxxxxxx> > > Cc: Christoph Lameter <cl@xxxxxxxxx> > > Cc: Pekka Enberg <penberg@xxxxxxxxxx> > > Cc: David Rientjes <rientjes@xxxxxxxxxx> > > Cc: Joonsoo Kim <iamjoonsoo.kim@xxxxxxx> > > Cc: Greg Thelen <gthelen@xxxxxxxxxx> > > Cc: Johannes Weiner <hannes@xxxxxxxxxxx> > > Cc: Vladimir Davydov <vdavydov.dev@xxxxxxxxx> > > Cc: Mel Gorman <mgorman@xxxxxxx> > > Cc: Vlastimil Babka <vbabka@xxxxxxx> > > Cc: Alexander Viro <viro@xxxxxxxxxxxxxxxxxx> > > Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> > > --- > > Changelog since v5: > > - None > > > > Changelog since v4: > > - Fixed the build for CONFIG_MEMCG=n > > > > Changelog since v3: > > - Rebased over Jan's patches. > > - Some cleanup based on Amir's comments. > > > > Changelog since v2: > > - None > > > > Changelog since v1: > > - no more charging fsnotify_mark_connector objects > > - Fixed the build for SLOB > > > > fs/notify/dnotify/dnotify.c | 5 +++-- > > fs/notify/fanotify/fanotify.c | 6 ++++-- > > fs/notify/fanotify/fanotify_user.c | 5 ++++- > > fs/notify/group.c | 6 ++++++ > > fs/notify/inotify/inotify_fsnotify.c | 2 +- > > fs/notify/inotify/inotify_user.c | 5 ++++- > > include/linux/fsnotify_backend.h | 12 ++++++++---- > > include/linux/memcontrol.h | 7 +++++++ > > mm/memcontrol.c | 15 +++++++++++++-- > > 9 files changed, 50 insertions(+), 13 deletions(-) > > > > diff --git a/fs/notify/dnotify/dnotify.c b/fs/notify/dnotify/dnotify.c > > index e2bea2ac5dfb..a6365e6bc047 100644 > > --- a/fs/notify/dnotify/dnotify.c > > +++ b/fs/notify/dnotify/dnotify.c > > @@ -384,8 +384,9 @@ int fcntl_dirnotify(int fd, struct file *filp, unsigned long arg) > > > > static int __init dnotify_init(void) > > { > > - dnotify_struct_cache = KMEM_CACHE(dnotify_struct, SLAB_PANIC); > > - dnotify_mark_cache = KMEM_CACHE(dnotify_mark, SLAB_PANIC); > > + dnotify_struct_cache = KMEM_CACHE(dnotify_struct, > > + SLAB_PANIC|SLAB_ACCOUNT); > > + dnotify_mark_cache = KMEM_CACHE(dnotify_mark, SLAB_PANIC|SLAB_ACCOUNT); > > > > dnotify_group = fsnotify_alloc_group(&dnotify_fsnotify_ops); > > if (IS_ERR(dnotify_group)) > > diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c > > index f90842efea13..c8d6e37a4855 100644 > > --- a/fs/notify/fanotify/fanotify.c > > +++ b/fs/notify/fanotify/fanotify.c > > @@ -154,14 +154,16 @@ struct fanotify_event_info *fanotify_alloc_event(struct fsnotify_group *group, > > if (fanotify_is_perm_event(mask)) { > > struct fanotify_perm_event_info *pevent; > > > > - pevent = kmem_cache_alloc(fanotify_perm_event_cachep, gfp); > > + pevent = kmem_cache_alloc_memcg(fanotify_perm_event_cachep, gfp, > > + group->memcg); > > if (!pevent) > > return NULL; > > event = &pevent->fae; > > pevent->response = 0; > > goto init; > > } > > - event = kmem_cache_alloc(fanotify_event_cachep, gfp); > > + event = kmem_cache_alloc_memcg(fanotify_event_cachep, gfp, > > + group->memcg); > > if (!event) > > return NULL; > > init: __maybe_unused > > diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c > > index ec4d8c59d0e3..0cf45041dc32 100644 > > --- a/fs/notify/fanotify/fanotify_user.c > > +++ b/fs/notify/fanotify/fanotify_user.c > > @@ -16,6 +16,7 @@ > > #include <linux/uaccess.h> > > #include <linux/compat.h> > > #include <linux/sched/signal.h> > > +#include <linux/memcontrol.h> > > > > #include <asm/ioctls.h> > > > > @@ -756,6 +757,7 @@ SYSCALL_DEFINE2(fanotify_init, unsigned int, flags, unsigned int, event_f_flags) > > > > group->fanotify_data.user = user; > > atomic_inc(&user->fanotify_listeners); > > + group->memcg = get_mem_cgroup_from_mm(current->mm); > > > > It seem to me that you should export a wrapper to modules with > the mem_cgroup_ prefix and not sure that need to pass current->mm > to this wrapper. > Thanks, I will test with fsnotify as module. > > oevent = fanotify_alloc_event(group, NULL, FS_Q_OVERFLOW, NULL); > > if (unlikely(!oevent)) { > > @@ -957,7 +959,8 @@ COMPAT_SYSCALL_DEFINE6(fanotify_mark, > > */ > > static int __init fanotify_user_setup(void) > > { > > - fanotify_mark_cache = KMEM_CACHE(fsnotify_mark, SLAB_PANIC); > > + fanotify_mark_cache = KMEM_CACHE(fsnotify_mark, > > + SLAB_PANIC|SLAB_ACCOUNT); > > fanotify_event_cachep = KMEM_CACHE(fanotify_event_info, SLAB_PANIC); > > if (IS_ENABLED(CONFIG_FANOTIFY_ACCESS_PERMISSIONS)) { > > fanotify_perm_event_cachep = > > diff --git a/fs/notify/group.c b/fs/notify/group.c > > index aa5468f23e45..300fc0f62115 100644 > > --- a/fs/notify/group.c > > +++ b/fs/notify/group.c > > @@ -22,6 +22,7 @@ > > #include <linux/srcu.h> > > #include <linux/rculist.h> > > #include <linux/wait.h> > > +#include <linux/memcontrol.h> > > > > #include <linux/fsnotify_backend.h> > > #include "fsnotify.h" > > @@ -36,6 +37,11 @@ static void fsnotify_final_destroy_group(struct fsnotify_group *group) > > if (group->ops->free_group_priv) > > group->ops->free_group_priv(group); > > > > +#ifdef CONFIG_MEMCG > > + if (group->memcg) > > + css_put(&group->memcg->css); > > +#endif > > + > > This looks very much like an internal implementation detail that has no > business in an external module. I see evidence that you have used > mem_cgroup_put() in the patch at some point and I agree that you > need to export a pair of matching helpers to external modules. > Do not worry. Andrew will change this to mem_cgroup_put(). Basically mem_cgroup_put() was defined by still-in-review patch series by Roman. There were build failure reports where someone was taking this series either without Roman's series or applying series out of order. thanks, Shakeel