On 22.09.2018 at 22:17, Jerry Martinez wrote:
Hello! Apache has been randomly crashing (for a few months now) and I cannot seem to understand why. I cannot replicate the crash even when hitting the server with 4,000 requests @ a concurrency of 500. This is a production server and I am willing to compensate someone for their efforts resolving this. Below is a sample of one of the many error messages:

[Fri Sep 21 11:27:24 2018] [mpm_event:alert] (11)Resource temporarily unavailable: AH03104: apr_thread_create: unable to create worker thread

apr_thread_create() on Linux/Unix is mostly a wrapper around pthread_create(). The man page for that on SLES 12 tells us:

=== SNIP ===
EAGAIN Insufficient resources to create another thread.

EAGAIN A system-imposed limit on the number of threads was encountered. There are a number of limits that may trigger this error: the RLIMIT_NPROC soft resource limit (set via setrlimit(2)), which limits the number of processes and threads for a real user ID, was reached; the kernel's system-wide limit on the number of processes and threads, /proc/sys/kernel/threads-max, was reached (see proc(5)); or the maximum number of PIDs, /proc/sys/kernel/pid_max, was reached (see proc(5)).

EAGAIN The system lacked the necessary resources to create another thread, or the system-imposed limit on the total number of threads in a process {PTHREAD_THREADS_MAX} would be exceeded.
=== SNIP ===

Since your system seems to have lots of free memory, I don't expect a memory shortage, unless there is a memory leak and the memory numbers you showed below look very different when the crash actually happens. Each thread needs a thread stack in memory.
What could happen is that the number of threads your user has created (summed over all of its processes) hits the nproc limit. Note that although it is called nproc (number of processes), what it limits on Linux is actually the (much bigger) number of threads per user.
Other limits could be total number of threads or processes and number of file descriptors per process.
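
To compare those limits with actual usage, something like the following Python sketch could be run as the user httpd runs under. It is only an illustration: the helper names (read_int, count_threads_for_uid, thread_limit_report) are made up for this example, it relies on the Linux /proc layout, and resource.getrlimit() reports the limits of the process running the script, which may differ from what the httpd processes actually got at startup.

```python
import os
import resource
from pathlib import Path


def read_int(path):
    """Return the first integer stored in a /proc file, or None if unreadable."""
    try:
        return int(Path(path).read_text().split()[0])
    except (OSError, ValueError, IndexError):
        return None


def count_threads_for_uid(uid):
    """Sum the 'Threads:' field of every process whose real UID is uid."""
    try:
        entries = list(Path("/proc").iterdir())
    except OSError:
        return 0  # no /proc available (not Linux)
    total = 0
    for entry in entries:
        if not entry.name.isdigit():
            continue  # not a process directory
        try:
            fields = {}
            for line in (entry / "status").read_text().splitlines():
                key, _, value = line.partition(":")
                fields[key] = value.split()
            # First value on the Uid line is the real UID.
            if int(fields["Uid"][0]) == uid:
                total += int(fields["Threads"][0])
        except (OSError, ValueError, KeyError, IndexError):
            continue  # process exited meanwhile or is not readable
    return total


def thread_limit_report(uid):
    """Collect the limits named in pthread_create(3) next to current usage."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NPROC)
    return {
        "nproc_soft": soft,  # resource.RLIM_INFINITY means unlimited
        "nproc_hard": hard,
        "threads_max": read_int("/proc/sys/kernel/threads-max"),
        "pid_max": read_int("/proc/sys/kernel/pid_max"),
        "threads_in_use": count_threads_for_uid(uid),
    }


if __name__ == "__main__":
    for name, value in thread_limit_report(os.getuid()).items():
        print(f"{name}: {value}")
```

If threads_in_use gets close to nproc_soft around the time of the error, the per-user limit is the likely culprit.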
What is a bit strange though: typically Apache httpd does not start single threads. When it needs more concurrency it starts new processes, each process having ThreadsPerChild worker threads. So it seems that due to increased load (or, more likely if it is a reverse proxy, due to a temporary slowness of the backend) your web server needs to start new processes. The maximum number of processes is set in your MPM config.
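
As a rough illustration of that arithmetic, here is a small Python sketch. The numbers in EVENT_DEFAULTS are an assumption taken from the documented event MPM defaults for httpd 2.4 (you wrote that all MPM settings are default); the function names are made up for this example, and the result counts worker threads only, not the extra helper threads (such as the listener) that each child process also runs.

```python
import math

# Assumed values: documented defaults of the event MPM in httpd 2.4.
EVENT_DEFAULTS = {
    "ThreadsPerChild": 25,
    "MaxRequestWorkers": 400,
    "ServerLimit": 16,
}


def max_child_processes(threads_per_child, max_request_workers, server_limit):
    """Upper bound on child processes needed to reach MaxRequestWorkers."""
    needed = math.ceil(max_request_workers / threads_per_child)
    return min(needed, server_limit)


def max_worker_threads(threads_per_child, max_request_workers, server_limit):
    """Worker threads httpd could have at full scale-out."""
    processes = max_child_processes(
        threads_per_child, max_request_workers, server_limit
    )
    return processes * threads_per_child


if __name__ == "__main__":
    d = EVENT_DEFAULTS
    args = (d["ThreadsPerChild"], d["MaxRequestWorkers"], d["ServerLimit"])
    print("child processes:", max_child_processes(*args))
    print("worker threads:", max_worker_threads(*args))
```

With the assumed defaults this gives 16 processes and 400 worker threads, so a per-user thread limit of a few hundred could already be too low once other processes of the same user are counted.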
So even if you find the reason for not being able to create more threads and you can get rid of it, the next problem might be that your httpd ends up with all worker threads busy. You then need to find out why the load is so high or, more likely, why some backend gets slow.
BTW: if you want to get a better idea of which processes and threads get used, you can add %P (process id) and %{tid}P (thread id) to your access log format. And regularly retrieving the number of busy and idle workers from server-status (mod_status) can tell you when exactly the increase in threads starts and how quickly it goes up.
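
For the regular polling, a minimal Python sketch could look like this. It assumes mod_status is reachable at http://localhost/server-status (that URL and the 10 second interval are assumptions for the example, adjust them to your setup) and uses the machine-readable ?auto output, which contains BusyWorkers and IdleWorkers lines. The function names are made up for this example.

```python
import time
import urllib.request

# Assumed URL: depends on where the server-status handler is mapped.
STATUS_URL = "http://localhost/server-status?auto"


def parse_status_auto(text):
    """Extract BusyWorkers and IdleWorkers from mod_status '?auto' output."""
    counts = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        key = key.strip()
        if sep and key in ("BusyWorkers", "IdleWorkers"):
            counts[key] = int(value.strip())
    return counts


def fetch_worker_counts(url=STATUS_URL, timeout=5):
    """Fetch the status page once and return the parsed worker counts."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return parse_status_auto(resp.read().decode("ascii", "replace"))


if __name__ == "__main__":
    # Print one timestamped sample every 10 seconds until interrupted.
    while True:
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        try:
            print(stamp, fetch_worker_counts())
        except OSError as exc:
            print(stamp, "status request failed:", exc)
        time.sleep(10)
```

Redirecting the output to a file gives a timeline that can be matched against the timestamp of the AH03104 alert in the error log.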
Regards, Rainer

Below is more information that might be useful:

cat /etc/SuSE-release

  SUSE Linux Enterprise Server 12 (x86_64)
  VERSION = 12
  PATCHLEVEL = 2
  # This file is deprecated and will be removed in a future service pack or release.
  # Please check /etc/os-release for details about this release.

cat /etc/os-release

  NAME="SLES"
  VERSION="12-SP2"
  VERSION_ID="12.2"
  PRETTY_NAME="SUSE Linux Enterprise Server 12 SP2"
  ID="sles"
  ANSI_COLOR="0;32"
  CPE_NAME="cpe:/o:suse:sles:12:sp2"

lscpu

  Architecture:          x86_64
  CPU op-mode(s):        32-bit, 64-bit
  Byte Order:            Little Endian
  CPU(s):                12
  On-line CPU(s) list:   0-11
  Thread(s) per core:    2
  Core(s) per socket:    6
  Socket(s):             1
  NUMA node(s):          1
  Vendor ID:             GenuineIntel
  CPU family:            6
  Model:                 63
  Model name:            Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz
  Stepping:              2
  CPU MHz:               1200.199
  CPU max MHz:           3200.0000
  CPU min MHz:           1200.0000
  BogoMIPS:              4794.82
  Virtualization:        VT-x
  L1d cache:             32K
  L1i cache:             32K
  L2 cache:              256K
  L3 cache:              15360K
  NUMA node0 CPU(s):     0-11
  Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm ida arat epb pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid cqm xsaveopt cqm_llc cqm_occup_llc

free -m

                       total    used    free  shared  buffers  cached
  Mem:                  7547    2691    4856     365        6    1965
  -/+ buffers/cache:             719    6828
  Swap:                 2062       0    2062

Apache information

  Server Version: Apache/2.4.34 (Unix) OpenSSL/1.0.2l
  Server MPM: event

**All MPM event settings are default.**

Should I enable some type of core dump settings?
I do have the scoreboard enabled (mod_status) and, if it helps at all, this is where the error is being triggered:

https://github.com/apache/httpd/blob/571b20fb11ae3eb1498b2e279423b2d53eda7e4b/server/mpm/event/event.c#L2620

Thank you so much in advance!

Jerry Martinez