Re: Caching/buffers become useless after some time

On 08/24/2018 10:11 AM, Marinko Catovic wrote:
>     1. Send the current value of /sys/kernel/mm/transparent_hugepage/defrag
>     2. Unless it's 'defer' or 'never' already, try changing it to 'defer'.
> 
> 
>  /sys/kernel/mm/transparent_hugepage/defrag is
> always defer defer+madvise [madvise] never

Yeah that's the default.
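
For reference, the active setting is the bracketed word in that file.
A minimal sketch of picking it out programmatically, assuming Python 3
is available on the host (the helper names selected_option and
current_defrag are made up for illustration):

```python
import re
from pathlib import Path

# Path quoted earlier in this thread; reading it needs no privileges.
DEFRAG_PATH = Path("/sys/kernel/mm/transparent_hugepage/defrag")


def selected_option(line: str) -> str:
    """Return the bracketed (currently active) option of a sysfs choice line."""
    match = re.search(r"\[([^\]]+)\]", line)
    if match is None:
        raise ValueError(f"no bracketed option in {line!r}")
    return match.group(1)


def current_defrag(path: Path = DEFRAG_PATH) -> str:
    """Read the sysfs file and return its active option (hypothetical helper)."""
    return selected_option(path.read_text())
```

With the output quoted above, selected_option() would return
"madvise"; calling current_defrag() on the host reads the live value.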

> I *think* I already played around with these values, as far as I
> remember `never` almost caused the system to hang, or at least while
> I switched back to madvise.

That would be unexpected for the 'defrag' file, but might be possible
for the 'enabled' file, where mm structs are put on or removed from a
system-wide list, AFAIK.

> shall I switch it to defer and observe (all hosts are running fine by
> just now) or switch to defer while it is in the bad state?

You could do it immediately and see whether problems stay away for long
enough, OTOH...

> and when doing this, should improvement be measurable immediately?

I would expect that. It would be a more direct proof that that was the
cause.

> I need to know how long to hold this, before dropping caches becomes
> necessary.

If the buffer/cache usage keeps oscillating and doesn't start growing,
it means the change didn't help. A few minutes should be enough.
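
One way to watch this over a few minutes is to sample /proc/meminfo
periodically. A rough sketch, assuming that "Buffers" plus "Cached" is
the number of interest here; the function names and the 300 s / 10 s
defaults are arbitrary choices for illustration, not a recommendation:

```python
import time
from pathlib import Path


def cache_kb(meminfo_text: str) -> int:
    """Sum the Buffers and Cached fields (in kB) from /proc/meminfo text."""
    fields = {}
    for line in meminfo_text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            # First token is the value; a trailing "kB" unit, if any, is ignored.
            fields[name.strip()] = int(parts[0])
    return fields["Buffers"] + fields["Cached"]


def sample(duration_s: int = 300, interval_s: int = 10) -> list:
    """Collect Buffers+Cached readings for roughly duration_s seconds."""
    samples = []
    for _ in range(duration_s // interval_s):
        samples.append(cache_kb(Path("/proc/meminfo").read_text()))
        time.sleep(interval_s)
    return samples
```

If the values returned by sample() climb steadily after the switch, the
cache is growing again; if they keep bouncing around the same level, it
is not.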

>> Ah, checked the trace and it seems to be "php-cgi". Interesting that
>> they use madvise(MADV_HUGEPAGE). Anyway the above still applies.
> 
> you know, that's at least an interesting hint. look at this:
> https://ckon.wordpress.com/2015/09/18/php7-opcache-performance/
> 
> this was experimental there, but a more recent version seems to have it on
> by default, since I need to disable it on request (implies to me that it
> is on by default).
> it is however *disabled* in the runtime configuration (and not in
> effect, I just confirmed that)
> 
> It would be interesting to know whether madvise(MADV_HUGEPAGE) is then
> active somewhere else, since it is in the dump as you observed.

The trace points to php-cgi, so either disabling it doesn't work, or
they started using the madvise for other things besides opcache. But
that doesn't matter: it would be the kernel's fault if a program using
the madvise could effectively kill the system like this. Let's just
stick with the global 'defrag'='defer' change and not tweak several
things at once.
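
The change itself is just a write to the sysfs file as root. A small
sketch, assuming Python 3; set_defrag and ALLOWED are illustrative
names, and the list of values is simply what the file printed earlier
in this thread:

```python
from pathlib import Path

# Path and option names as quoted earlier in this thread.
DEFRAG_PATH = Path("/sys/kernel/mm/transparent_hugepage/defrag")
ALLOWED = ("always", "defer", "defer+madvise", "madvise", "never")


def set_defrag(value: str = "defer", path: Path = DEFRAG_PATH) -> None:
    """Write a defrag mode to the sysfs file (needs root on a real system)."""
    if value not in ALLOWED:
        raise ValueError(f"unknown defrag mode: {value!r}")
    path.write_text(value)
```

Note that a value written this way does not survive a reboot, so it is
easy to back out if needed.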

> Please note that `killing` php-cgi would not make any difference then,
> since these processes are started by request for every user and killed
> after whatever script is finished. this may invoke about 10-50 forks,
> depending on load, (with different system users) every second.

Yep.



