On Thu, Nov 16, 2017 at 7:09 PM, Yafang Shao <laoar.shao@xxxxxxxxx> wrote:
> Currently the default tmpfs size is totalram_pages / 2 if mount tmpfs
> without "-o size=XXX".
> When we mount tmpfs in a container(i.e. docker), it is also
> totalram_pages / 2 regardless of the memory limit on this container.
> That may easily cause OOM if tmpfs occupied too much memory when swap is
> off.
> So when we mount tmpfs in a memcg, the default size should be limited by
> the memcg memory.limit.
>

The pages of tmpfs files are charged to the memcg of the task that
allocates them, which can differ from the memcg in which the mount
operation happened. So tying the size of a tmpfs mount to the memcg
where it was mounted does not make much sense.

Also, the mount operation, which requires CAP_SYS_ADMIN, is usually
performed by a node controller (or job loader), which doesn't
necessarily run in the memcg of the actual job.

> Signed-off-by: Yafang Shao <laoar.shao@xxxxxxxxx>
> ---
>  include/linux/memcontrol.h |  1 +
>  mm/memcontrol.c            |  2 +-
>  mm/shmem.c                 | 20 +++++++++++++++++++-
>  3 files changed, 21 insertions(+), 2 deletions(-)
>
> diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> index 69966c4..79c6709 100644
> --- a/include/linux/memcontrol.h
> +++ b/include/linux/memcontrol.h
> @@ -265,6 +265,7 @@ struct mem_cgroup {
>         /* WARNING: nodeinfo must be the last member here */
>  };
>
> +extern struct mutex memcg_limit_mutex;
>  extern struct mem_cgroup *root_mem_cgroup;
>
>  static inline bool mem_cgroup_disabled(void)
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 661f046..ad32f3c 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -2464,7 +2464,7 @@ static inline int mem_cgroup_move_swap_account(swp_entry_t entry,
>  }
>  #endif
>
> -static DEFINE_MUTEX(memcg_limit_mutex);
> +DEFINE_MUTEX(memcg_limit_mutex);

This mutex is only needed for updating the limit, not for reading it.
>
>  static int mem_cgroup_resize_limit(struct mem_cgroup *memcg,
>                                    unsigned long limit)
> diff --git a/mm/shmem.c b/mm/shmem.c
> index 07a1d22..1c320dd 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -35,6 +35,7 @@
>  #include <linux/uio.h>
>  #include <linux/khugepaged.h>
>  #include <linux/hugetlb.h>
> +#include <linux/memcontrol.h>
>
>  #include <asm/tlbflush.h> /* for arch/microblaze update_mmu_cache() */
>
> @@ -108,7 +109,24 @@ struct shmem_falloc {
>  #ifdef CONFIG_TMPFS
>  static unsigned long shmem_default_max_blocks(void)
>  {
> -       return totalram_pages / 2;
> +       unsigned long size;
> +
> +#ifdef CONFIG_MEMCG
> +       struct mem_cgroup *memcg = mem_cgroup_from_task(current);
> +
> +       if (memcg == NULL || memcg == root_mem_cgroup)
> +               size = totalram_pages / 2;
> +       else {
> +               mutex_lock(&memcg_limit_mutex);
> +               size = memcg->memory.limit > totalram_pages ?
> +                       totalram_pages / 2 : memcg->memory.limit / 2;
> +               mutex_unlock(&memcg_limit_mutex);
> +       }
> +#else
> +       size = totalram_pages / 2;
> +#endif
> +
> +       return size;
>  }
>
>  static unsigned long shmem_default_max_inodes(void)
> --
> 1.8.3.1
>
> --
> To unsubscribe from this list: send the line "unsubscribe cgroups" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
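
To make the objection about the mounting memcg concrete, here is a
minimal user-space sketch (not kernel code) of the sizing rule in the
quoted patch. The names default_max_blocks and NO_LIMIT are
hypothetical and exist only for this illustration; all values are in
pages. It assumes an unlimited memcg reports a limit larger than
totalram_pages. The point is that the result depends only on the limit
of the memcg that performs the mount, not on the memcg whose tasks
later allocate the tmpfs pages.

```c
#include <limits.h>

/* Hypothetical stand-in for the limit of an unlimited (e.g. root) memcg. */
#define NO_LIMIT ULONG_MAX

/*
 * User-space model of the sizing rule in the quoted patch, in pages.
 * mounter_limit is the memory limit of the memcg doing the mount,
 * which is not necessarily the memcg of the job that writes to the
 * tmpfs files afterwards.
 */
unsigned long default_max_blocks(unsigned long totalram_pages,
				 unsigned long mounter_limit)
{
	/* A limit above total RAM falls back to the old default. */
	if (mounter_limit > totalram_pages)
		return totalram_pages / 2;

	/* Otherwise the default is half of the mounter's limit. */
	return mounter_limit / 2;
}
```

With totalram_pages = 1UL << 20 and a job limit of 1UL << 16 pages, a
mount done inside the job's memcg gets 32768 pages, while the same
mount done by a node controller in an unlimited memcg
(default_max_blocks(1UL << 20, NO_LIMIT)) still gets 524288 pages, so
the job's limit has no effect on the default size in that case.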