On Tue 11-06-13 10:35:01, Piotr Nowojski wrote:
> On 07.06.2013 17:36, Michal Hocko wrote:
> >On Fri 07-06-13 17:13:55, Piotr Nowojski wrote:
> >>On 06.06.2013 17:57, Michal Hocko wrote:
> >>>>>In our system we have hit a very annoying situation (bug?) with
> >>>>>cgroups. I'm writing to you because I have found your posts on
> >>>>>mailing lists on a similar topic. Maybe you could help us or point
> >>>>>us in some direction where to look/ask.
> >>>>>
> >>>>>We have a system with ~15GB RAM (+2GB SWAP), and we are running ~10
> >>>>>heavy IO processes. Each process constantly uses 200-210MB RAM
> >>>>>(RSS) and a lot of page cache. All processes are in a cgroup with
> >>>>>the following limits:
> >>>>>
> >>>>>/sys/fs/cgroup/taskell2 $ cat memory.limit_in_bytes
> >>>>>memory.memsw.limit_in_bytes
> >>>>>14183038976
> >>>>>15601344512
> >>>I assume that memory.use_hierarchy is 1, right?
> >>The system has been rebooted since the last test, so I cannot
> >>guarantee 100% that it was set, but it should have been. Currently I'm
> >>rerunning the scenario that led to the described problem with:
> >>
> >>/sys/fs/cgroup/taskell2# cat memory.use_hierarchy ../memory.use_hierarchy
> >>1
> >>0
> >OK, good. Your numbers suggest that the hierarchy _is_ in use. I just
> >wanted to be 100% sure.
> >
>
> I don't know what has solved this problem, but we weren't able to
> reproduce it during the whole weekend. Most likely there was
> some problem with our code initializing the cgroups configuration
> regarding use_hierarchy (can writing 1 to memory.use_hierarchy
> silently fail?).

No, it complains with EINVAL or EBUSY, but maybe you have tripped over
the bash built-in echo, which doesn't return error codes properly AFAIR.
Always make sure you use /bin/echo. If you are doing initialization in
parallel then this indeed might race, and use_hierarchy might fail to be
set to 1 if any children have been created in the meantime.
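
A minimal sketch of the checked write described above, assuming a POSIX shell and that /bin/echo exists on the system. The helper name set_knob is hypothetical and not part of any cgroup tooling; it writes the value with the external echo, checks the exit status, and then reads the file back to confirm the setting took effect.

```shell
#!/bin/sh
# set_knob is a hypothetical helper: write a value to a control file
# and verify that the file really contains it afterwards.
set_knob() {
    knob=$1
    want=$2

    # Use the external /bin/echo so that a failed write (for example
    # EINVAL or EBUSY from the kernel) is reflected in the exit status.
    if ! /bin/echo "$want" > "$knob"; then
        echo "write to $knob failed" >&2
        return 1
    fi

    # Read the value back; a racing initializer may have left it unset.
    got=$(cat "$knob") || return 1
    if [ "$got" != "$want" ]; then
        echo "$knob is '$got', expected '$want'" >&2
        return 1
    fi
    return 0
}
```

With the paths from this thread it could be called as `set_knob /sys/fs/cgroup/taskell2/memory.use_hierarchy 1` before any child groups are created, aborting initialization if it returns non-zero.
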
But again, your numbers suggested that the parent group collected
charges from children, so this would be rather unexpected.

> I have added assertions for checking this parameter before starting
> and after initialization of our application. If the problem reoccurs, I
> will proceed as you suggested before - trying the latest kernels.
>
> Thanks, Piotr Nowojski

-- 
Michal Hocko
SUSE Labs