On 08/24/2018 10:11 AM, Marinko Catovic wrote:
> 1. Send the current value of /sys/kernel/mm/transparent_hugepage/defrag
> 2. Unless it's 'defer' or 'never' already, try changing it to 'defer'.
>
>
> /sys/kernel/mm/transparent_hugepage/defrag is
> always defer defer+madvise [madvise] never

Yeah that's the default.

> I *think* I already played around with these values, as far as I
> remember `never`
> almost caused the system to hang, or at least while I switched back to
> madvise.

That would be unexpected for the 'defrag' file, but maybe possible for
'enabled' file where mm structs are put on/removed from a list
system-wide, AFAIK.

> shall I switch it to defer and observe (all hosts are running fine by
> just now) or
> switch to defer while it is in the bad state?

You could do it immediately and see if no problems appear for long
enough, OTOH...

> and when doing this, should improvement be measurable immediately?

I would expect that. It would be a more direct proof that that was the
cause.

> I need to know how long to hold this, before dropping caches becomes
> necessary.

If it keeps oscillating and doesn't start growing, it means it didn't
help. Few minutes should be enough.

>> Ah, checked the trace and it seems to be "php-cgi". Interesting that
>> they use madvise(MADV_HUGEPAGE). Anyway the above still applies.
>
> you know, that's at least an interesting hint. look at this:
> https://ckon.wordpress.com/2015/09/18/php7-opcache-performance/
>
> this was experimental there, but a more recent version seems to have it on
> by default, since I need to disable it on request (implies to me that it
> is on by default).
> it is however *disabled* in the runtime configuration (and not in
> effect, I just confirmed that)
>
> It would be interesting to know whether madvise(MADV_HUGEPAGE) is then
> active
> somewhere else, since it is in the dump as you observed.

The trace points to php-cgi so either disabling it doesn't work, or they
started using the madvise also for other stuff than opcache. But that
doesn't matter, it would be kernel's fault if a program using the
madvise would effectively kill the system like this. Let's just stick
with the global 'defrag'='defer' change and not tweak several things at
once.

> Please note that `killing` php-cgi would not make any difference then,
> since these processes
> are started by request for every user and killed after whatever script
> is finished. this may
> invoke about 10-50 forks, depending on load, (with different system
> users) every second.

Yep.
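
For reference, the two steps quoted at the top (read the current 'defrag'
value, then switch to 'defer' unless it is already 'defer' or 'never')
could be sketched roughly as below. This is only an illustration: the
helper names active_mode() and switch_to_defer() are made up for this
example, the script assumes the kernel marks the active value with square
brackets as in the output quoted above, and the write needs root.

```python
from pathlib import Path

# sysfs file discussed in this thread
DEFRAG = Path("/sys/kernel/mm/transparent_hugepage/defrag")


def active_mode(contents: str) -> str:
    """Return the bracketed (currently selected) word, e.g. 'madvise'."""
    for word in contents.split():
        if word.startswith("[") and word.endswith("]"):
            return word[1:-1]
    raise ValueError(f"no active mode found in {contents!r}")


def switch_to_defer(path: Path = DEFRAG) -> str:
    """Write 'defer' unless the mode is already 'defer' or 'never'.

    Returns the mode that was active before the call. Hypothetical helper,
    not part of any kernel tooling; must run as root to write to sysfs.
    """
    before = active_mode(path.read_text())
    if before not in ("defer", "never"):
        path.write_text("defer")
    return before
```

Calling active_mode(DEFRAG.read_text()) before and after switch_to_defer()
shows whether the change took effect. Writing 'madvise' back the same way
should restore the default mentioned above.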