On Fri, Nov 2, 2018 at 3:59 PM Vlastimil Babka <vbabka@xxxxxxx> wrote:
>
> Forgot to answer this:
>
> On 10/31/18 3:53 PM, Marinko Catovic wrote:
> > Well caching of any operations with find/du is not necessary imho
> > anyway, since walking over all these millions of files in that time
> > period is really not worth caching at all - if there is a way you
> > mentioned to limit the commands there, that would be great.
> > Also I want to mention that these operations were in use with 3.x
> > kernels as well, for years, with absolutely zero issues.
>
> Yep, something had to change at some point. Possibly the
> reclaim/compaction loop. Probably not the way dentries/inodes are being
> cached though.
>
> > 2 > drop_caches right after that is something I considered, I just had
> > some bad experience with this, since I tried it around 5:00 AM in the
> > first place to give it enough spare time to finish, since
> > sync; echo 2 > drop_caches can take some time, hence my question about
> > lowering the limits in mm/vmscan.c, void drop_slab_node(int nid)
> >
> > I could do this effectively right after find/du at 07:45, just hoping
> > that this is finished soon enough - in one worst case it took over 2
> > hours (from 05:00 AM to 07:00 AM), since the host was busy during that
> > time with find/du, never having freed enough caches to continue, hence
>
> Dropping caches while find/du is still running would be
> counter-productive. If done after it's already finished, it shouldn't be
> so disruptive.
>
> > my question to let it stop earlier with the modification of
> > drop_slab_node ... it was just an idea, nevermind if you believe that
> > it was a bad one :)
>
> Finding a universally "correct" threshold could easily be impossible. I
> guess the proper solution would be to drop the while loop and
> restructure the shrinking so that it would do a single pass through all
> objects.

Well, after a few weeks to make sure, the results look very promising:
there have been no issues at all since setting up the cgroup with the
memory limit.

The workaround is a good idea anyway, since it keeps the nightly
processes from eating up all the caches/buffers, which become useless by
the morning anyway, so performance even improved a bit - although the
underlying issue is of course not fixed by it. Since other people will
sooner or later be affected as well imho, hopefully you'll figure out a
fix soon.
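For reference, the setup looks roughly like this (just a sketch - the
group name A and the /cgpath mount point are the ones I use below, and
the 4G limit is only a placeholder, not the actual value):

  # create the memory-limited group under the cgroup v1 memory
  # controller mounted at /cgpath, and give it a limit
  mkdir /cgpath/A
  echo 4G > /cgpath/A/memory.limit_in_bytes

  # the shell that runs the nightly jobs adds itself to the group, so
  # find/du and everything else it spawns inherit the limit
  echo $$ > /cgpath/A/tasks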
Nevertheless I also ran into a new problem there. While writing the PID
into the tasks file (echo $$ > ../tasks), or directly from C with
fprintf(tasks_fp, "%d", getpid()), works very well, I had problems with
daemons that I wanted to start from within that cgroup-controlled
binary (e.g. an SQL server): the SQL server gets killed as soon as the
memory limit is exceeded.

I would not like to set memory.limit_in_bytes to something huge like
30G just to be safe; I'd rather handle it with a wrapper script, for
example:

1) the cgroup-controlled instance starts the wrapper script
2) the wrapper script removes its own PID from the tasks list, so it is
   no longer controlled by the cgroup
3) it then starts whatever needs to continue running normally, without
   the memory restriction

(a rough sketch of this is at the end of this mail)

Currently I fail at step 2: echo $PID > tasks writes into the file and
adds the PID, but how would one remove the wrapper script's PID from it
again?

I came up with:

  cat /cgpath/A/tasks | sed "/$$/d" | cat > /cgpath/A/tasks

which produces the list without the current PID, but writing it back
fails with "cat: write error: Invalid argument", since tasks is not a
regular file.
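To make that concrete, the wrapper I have in mind would look roughly
like this (only a sketch; the SQL server start command is just an
example, and the commented step 2 is exactly the part I am missing):

  #!/bin/sh
  # started from inside the cgroup-controlled instance (step 1)

  # step 2: remove this script's own PID ($$) from /cgpath/A/tasks
  # here, so that everything started below no longer counts against
  # the memory limit - this is the part I do not know how to do

  # step 3: start whatever should keep running without the limit
  /etc/init.d/mysql start    # just an example for the SQL server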