mpm_worker main child thread stuck in ap_mpm_pod_check/read() after reaching MaxRequestsPerChild

Jiří Farták <jfartak@xxxxxx> · Wed, 12 Dec 2018 15:58:09 +0000

Hello,

I am reporting this apparent bug on this mailing list because it does not concern the newest Apache 2.4.x release (which the developers require for bug reports) but the last release of the 2.2.x branch, and perhaps others have met the same problem as we did. However, since 2.2 and 2.4 implement the communication with child processes similarly (POD plus signal handling), we cannot rule out that it also occurs with the newest 2.4.x releases on our platform.

Our server, which handles approx. 3.5 million requests per day, suffers from occasional OOM killer events caused by Apache processes that did not exit properly after reaching the MaxRequestsPerChild limit and therefore keep holding large amounts of RAM. After a short investigation we found that the cause is "half-dead" lingering Apache processes consisting of a single thread (the "child main thread" in the mpm_worker implementation) waiting indefinitely in the read() syscall.

Here is the stack obtained by gstack <pid>:

#0  0x00007fbbaaa0f3fd in read () from /lib64/libpthread.so.0
#1  0x0000000000454e30 in ap_mpm_pod_check ()
#2  0x0000000000429e60 in child_main ()
#3  0x0000000000453902 in make_child ()
#4  0x000000000045398b in startup_children ()
#5  0x0000000000454201 in ap_mpm_run ()
#6  0x000000000042a9c0 in main ()

All other threads (workers, listener, etc.) are gone, but NOT this one, so the process's resources are still held in memory.
If I understand the communication between the child's threads correctly, the listener thread wakes up the main thread (which is blocked in ap_mpm_pod_check()/read(), waiting for messages from the parent process) when the MaxRequestsPerChild limit is reached.
As the worker.c source code shows, the listener thread tries to notify the child main thread by sending SIGTERM:

..(excerpt)
...  ap_close_listeners();
    ap_queue_term(worker_queue);
    dying = 1;
    ap_scoreboard_image->parent[process_slot].quiescing = 1;

    /* wake up the main thread */
    kill(ap_my_pid, SIGTERM);   /* <-- my note: in our case this does not have the desired effect; the main thread stays stuck in ap_mpm_pod_check()/read() */

    apr_thread_exit(thd, APR_SUCCESS);
    return NULL;
}

So kill(ap_my_pid, SIGTERM) fails to interrupt the read() syscall. read() should return with EINTR, so that ap_mpm_pod_check() returns, the loop in child_main() ends and the process finally exits.
But this does not happen. It should, since the child main thread is the only thread that has SIGTERM unblocked and should therefore receive it. I do not know why it does not.
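
To make the expectation above concrete, here is a minimal standalone sketch (plain POSIX, not Apache code) of the same pattern: the main thread blocks in read() on a pipe, and a second thread that has SIGTERM blocked sends a process-directed SIGTERM. All names (fake_listener, on_sigterm, blocked_read_errno) are hypothetical, and the sketch assumes the handler is installed without SA_RESTART; with SA_RESTART the kernel would restart the read() instead of failing it with EINTR.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

static volatile sig_atomic_t got_sigterm = 0;

/* Hypothetical handler: only records that the signal arrived. */
static void on_sigterm(int sig)
{
    (void)sig;
    got_sigterm = 1;
}

static void sleep_ms(long ms)
{
    struct timespec ts = { ms / 1000, (ms % 1000) * 1000000L };
    nanosleep(&ts, NULL);
}

/* Stand-in for the listener thread (SIGTERM is blocked in this thread). */
static void *fake_listener(void *arg)
{
    int wfd = *(int *)arg;
    sleep_ms(200);            /* give the main thread time to block in read() */
    kill(getpid(), SIGTERM);  /* process-directed, like kill(ap_my_pid, SIGTERM) */
    sleep_ms(300);
    close(wfd);               /* safety net: EOF ends read() if the signal was missed */
    return NULL;
}

/* Returns the errno reported by the blocked read(), 0 if read() returned
 * without error (EOF from the safety net), or -1 if setup failed. */
int blocked_read_errno(void)
{
    int fds[2], result;
    char c;
    ssize_t n;
    pthread_t tid;
    sigset_t term, oldmask;
    struct sigaction sa, oldsa;

    if (pipe(fds) == -1)
        return -1;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigterm;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;          /* assumption: no SA_RESTART */
    sigaction(SIGTERM, &sa, &oldsa);

    /* Block SIGTERM before creating the thread so that it inherits the
     * blocked mask, then unblock it in the main thread only. */
    sigemptyset(&term);
    sigaddset(&term, SIGTERM);
    pthread_sigmask(SIG_BLOCK, &term, &oldmask);
    if (pthread_create(&tid, NULL, fake_listener, &fds[1]) != 0) {
        close(fds[0]);
        close(fds[1]);
        pthread_sigmask(SIG_SETMASK, &oldmask, NULL);
        sigaction(SIGTERM, &oldsa, NULL);
        return -1;
    }
    pthread_sigmask(SIG_UNBLOCK, &term, NULL);

    n = read(fds[0], &c, 1);  /* blocks until signal or EOF */
    result = (n == -1) ? errno : 0;

    pthread_join(tid, NULL);
    close(fds[0]);
    pthread_sigmask(SIG_SETMASK, &oldmask, NULL);
    sigaction(SIGTERM, &oldsa, NULL);
    return result;
}

#ifdef DEMO_MAIN
int main(void)
{
    int e = blocked_read_errno();
    printf("read() errno: %d (%s)\n", e, e > 0 ? strerror(e) : "none");
    return 0;
}
#endif
```

Built with something like cc -pthread -DDEMO_MAIN, this should report EINTR on a system that behaves as described above. Note that the sketch also shows a window in which the wakeup can be lost: a signal delivered before the main thread has entered read() runs the handler but leaves the later read() blocked, which is why the sketch closes the write end as a safety net.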

Maybe this is a bug specific to a particular glibc/Linux kernel combination?

We had to apply a dirty patch: we made the read end of the POD pipe non-blocking and inserted a short sleep into the loop inside child_main() so that it does not hog the CPU:

child_main()
...
  while (1) {
            rv = ap_mpm_pod_check(pod);
            if (rv == AP_NORESTART) {
                /* see if termination was triggered while we slept */
                switch(terminate_mode) {
                case ST_GRACEFUL:
                    rv = AP_GRACEFUL;
                    break;
                case ST_UNGRACEFUL:
                    rv = AP_RESTART;
                    break;
                }
            }
            if (rv == AP_GRACEFUL || rv == AP_RESTART) {
                /* make sure the start thread has finished;
                 * signal_threads() and join_workers depend on that
                 */
                join_start_thread(start_thread_id);
                signal_threads(rv == AP_GRACEFUL ? ST_GRACEFUL : ST_UNGRACEFUL);
                break;
            }
            sleep(1);   /* sleep for a while; any unblocked signal wakes us up early */
        } 

Yes, it is ugly and dirty, but we needed to restore stable server behavior. When the parent process writes to the POD, the child main thread may now react with a delay of up to one second because of the sleep.
With this patch Apache runs as expected and recycles the child processes after MaxRequestsPerChild requests.
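
The non-blocking half of the patch is not shown above, so here is a rough standalone sketch of the idea in plain POSIX. It is not the actual change to httpd: set_nonblocking, pod_poll_once and demo_nonblocking_pipe are hypothetical names, and inside httpd one would presumably set a zero timeout on the pipe through APR (for example with apr_file_pipe_timeout_set()) rather than call fcntl() directly.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Put fd into non-blocking mode; returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* One non-blocking check of the pipe.
 * Returns 1 if a byte was read into *out, 0 if nothing is pending
 * (or the call was interrupted), -1 on EOF or any other error. */
int pod_poll_once(int fd, char *out)
{
    ssize_t n = read(fd, out, 1);
    if (n == 1)
        return 1;
    if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK || errno == EINTR))
        return 0;
    return -1;
}

/* Self-check: an empty non-blocking pipe reports "nothing pending",
 * and a byte written afterwards is picked up by the next check.
 * Returns 1 if both steps behave that way, 0 otherwise. */
int demo_nonblocking_pipe(void)
{
    int fds[2], ok;
    char c = 0;

    if (pipe(fds) == -1)
        return 0;
    ok = set_nonblocking(fds[0]) == 0
      && pod_poll_once(fds[0], &c) == 0   /* empty pipe: returns at once */
      && write(fds[1], "!", 1) == 1
      && pod_poll_once(fds[0], &c) == 1   /* byte is now available */
      && c == '!';
    close(fds[0]);
    close(fds[1]);
    return ok;
}

#ifdef DEMO_MAIN
int main(void)
{
    printf("non-blocking pipe check: %s\n",
           demo_nonblocking_pipe() ? "ok" : "failed");
    return 0;
}
#endif
```

With the read end in this mode, each pass of the loop returns immediately when nothing has been written, so the termination flag set by the signal handler is re-checked at least once per sleep interval instead of only after a successful read().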

Our MPM configuration:

<IfModule worker.c>
ServerLimit 4
ThreadLimit     256
StartServers         2
MinSpareThreads      128
MaxSpareThreads      384
MaxClients          1024
ThreadsPerChild      256
MaxRequestsPerChild  10000
MaxMemFree 2048
</IfModule>

We do not use any Apache modules other than the bundled ones and PHP (libphp5.so).

Kernel version: 3.10.63-1
Glibc: glibc-2.17-4.4.1.x86_64
Apache: 2.2.34, with bundled APR/APR-util.

Has anybody had the same experience, or any suggestions as to what could be wrong?

Jiri Fartak
