On Mon 17-06-13 12:21:34, azurIt wrote:
> >Here we go. I hope I didn't screw anything (Johannes might double check)
> >because there were quite some changes in the area since 3.2. Nothing
> >earth shattering though. Please note that I have only compile tested
> >this. Also make sure you remove the previous patches you have from me.
>
>
> Hi Michal,
>
> it, unfortunately, didn't work. Everything was working fine but
> original problem is still occurring.

This would be more than surprising because tasks blocked at memcg OOM
don't hold any locks anymore. Maybe I have messed something up during
backport but I cannot spot anything.

> I'm unable to send you stacks or more info because problem is taking
> down the whole server for some time now (don't know what exactly
> caused it to start happening, maybe newer versions of 3.2.x).

So you are not testing with the same kernel with just the old patch
replaced by the new one?

> But i'm sure of one thing - when problem occurs, nothing is able to
> access hard drives (every process which tries it is frozen until
> problem is resolved or server is rebooted).

It would be really interesting to see what those tasks are blocked on.

> Problem is fixed after killing processes from cgroup which
> caused it and everything immediately starts to work normally. I
> found this out by keeping terminal opened from another server to one
> where my problem is occurring quite often and running several apps
> there (htop, iotop, etc.). When problem occurs, all apps which weren't
> working with HDD were ok. The htop proved to be very useful here
> because it's only reading proc filesystem and is also able to send
> KILL signals - i was able to resolve the problem with it
> without rebooting the server.

sysrq+t will give you the list of all tasks and their traces.

> I created a special daemon (about month ago) which is able to detect
> and fix the problem so i'm not having server outages now. The point
> was to NOT access anything which is stored on HDDs, the daemon is
> only reading info from cgroup filesystem and sending KILL signals to
> processes. Maybe i should be able to also read stack files before
> killing, i will try it.
>
> Btw, which vanilla kernel includes this patch?

None yet. But I hope it will be merged to 3.11 and backported to the
stable trees.

> Thank you and everyone involved very much for time and help.
>
> azur

--
Michal Hocko
SUSE Labs
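
The step azurIt mentions above, reading the stack files of the cgroup's
processes before killing them, could look roughly like the sketch below.
It is only an illustration, not the actual daemon: the function names are
hypothetical, it assumes a cgroup v1 hierarchy whose directory contains a
`cgroup.procs` file, and it assumes `/proc/<pid>/stack` is readable (this
normally needs root and a kernel built with stack trace support). Only the
main thread's stack of each process is captured, and the result is kept in
memory, so nothing stored on the hard drives is touched.

```python
import os
import signal
from pathlib import Path


def read_cgroup_pids(cgroup_dir):
    """Return the process IDs listed in the cgroup's `cgroup.procs` file."""
    text = Path(cgroup_dir, "cgroup.procs").read_text()
    return [int(tok) for tok in text.split()]


def read_stack(pid, proc_root="/proc"):
    """Return the kernel stack of `pid` as text, or None if unreadable."""
    try:
        return Path(proc_root, str(pid), "stack").read_text()
    except OSError:
        # The process may have exited already, or access may be denied.
        return None


def dump_and_kill(cgroup_dir, proc_root="/proc", kill=os.kill):
    """Collect stacks for every process in the cgroup, then SIGKILL them.

    Stacks are collected first so the traces show what the tasks were
    blocked on. They are returned in memory (pid -> stack text or None).
    """
    stacks = {}
    for pid in read_cgroup_pids(cgroup_dir):
        stacks[pid] = read_stack(pid, proc_root)
    for pid in stacks:
        try:
            kill(pid, signal.SIGKILL)
        except ProcessLookupError:
            pass  # already gone
    return stacks
```

The returned dictionary could then be sent over the network or printed to
an already open remote terminal, for example
`dump_and_kill("/sys/fs/cgroup/memory/<group>")`, where the mount point and
group name are placeholders for whatever the affected server uses.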