Re: Fwd: apache2 / httpd graceful/reload failures on Ubuntu 21.04


 



Reading the source code:

From mod_slotmem_shm:
...
                apr_shm_remove(fname, pool);
                rv = apr_shm_create(&shm, size, fname, gpool);
...
        ap_log_error(APLOG_MARK, rv == APR_SUCCESS ? APLOG_DEBUG : APLOG_ERR,
                     rv, ap_server_conf, APLOGNO(02611)
                     "create: apr_shm_%s(%s) %s",
                     fbased && is_child_process() ? "attach" : "create",
                     fname, rv == APR_SUCCESS ? "succeeded" : "failed");


When APR is built, autoconf defines APR_USE_SHMEM_SHMGET for name-based shared memory allocation:
...
decision on anonymous shared memory allocation method... 4.4BSD-style mmap() via MAP_ANON
decision on namebased memory allocation method... SysV IPC shmget()


In APR, shm.c, apr_shm_create():
        if ((new_m->shmid = shmget(new_m->shmkey, new_m->realsize,
                                   SHM_R | SHM_W | IPC_CREAT | IPC_EXCL)) < 0) {
            apr_file_close(file);
            return errno;
        }


From the shmget() manual page, possible errors in errno:
       ENOSPC: All possible shared memory IDs have been taken (SHMMNI), or
       allocating a segment of the requested size would cause the system to
       exceed the system-wide limit on shared memory (SHMALL).
      
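So your "No space left on device" does not refer to your filesystem but to System V (SysV) shared memory: shmget() returned ENOSPC (errno 28), and Apache logged the generic message for that errno.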
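
To look at the SysV side rather than the filesystem, the kernel's segment table is readable in /proc/sysvipc/shm (one line per segment after a header line; `ipcs -m` shows the same data). Below is a minimal sketch that counts segments nobody is attached to any more. The function name is made up for illustration, and whether leftover unattached segments are what fills the table on your host is only a guess on my part.

```shell
#!/bin/sh
# Print the number of SysV shared memory segments whose attach count
# (nattch) is zero. The nattch column is located by its header name
# rather than by a fixed position. The path argument is optional and
# only exists so the function can be pointed at a sample file.
count_unattached_segments() {
    awk '
        NR == 1 { for (i = 1; i <= NF; i++) if ($i == "nattch") col = i; next }
        col && $col == 0 { n++ }
        END { print n + 0 }
    ' "${1:-/proc/sysvipc/shm}"
}

# Only report when the table is actually readable on this system.
if [ -r /proc/sysvipc/shm ]; then
    echo "unattached SysV shm segments: $(count_unattached_segments)"
fi
```

If that number keeps growing after each failed graceful restart, it would point at segments being left behind; if it stays near zero, the segments are all in active use.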
 
Since SHMALL on a 64-bit Linux system is very large (please check with: cat /proc/sys/kernel/shmall), I would bet on a low SHMMNI value on your system (please check with: cat /proc/sys/kernel/shmmni).

SHMMNI is the system-wide maximum number of shared memory segments.
The default on Ubuntu should be 4096. Please try increasing it to 8192 or more (as root: echo 8192 > /proc/sys/kernel/shmmni).
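
Before raising the limit it may be worth checking how close you actually are to it. The sketch below compares the number of existing segments with SHMMNI; the function names and the two overridable variables are my own invention, not anything shipped with Apache or Ubuntu.

```shell
#!/bin/sh
# Default locations on Linux; both can be overridden, e.g. to run the
# functions against sample files.
SHM_TABLE=${SHM_TABLE:-/proc/sysvipc/shm}
SHMMNI_FILE=${SHMMNI_FILE:-/proc/sys/kernel/shmmni}

# Count segments: every line after the header is one segment.
count_shm_segments() {
    awk 'NR > 1 { n++ } END { print n + 0 }' "${1:-$SHM_TABLE}"
}

# Print "used/limit" for the SysV shared memory segment table.
shm_usage() {
    used=$(count_shm_segments "${1:-$SHM_TABLE}")
    limit=$(cat "${2:-$SHMMNI_FILE}")
    echo "$used/$limit"
}

# Only report when both files are readable on this system.
if [ -r "$SHM_TABLE" ] && [ -r "$SHMMNI_FILE" ]; then
    echo "SysV shm segments used/limit: $(shm_usage)"
fi
```

Note that writing to /proc only changes the running kernel. To keep the value across reboots, the usual route is a line such as `kernel.shmmni = 8192` in a file under /etc/sysctl.d/ (file name of your choice), loaded with `sysctl --system`.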


HTH.
Ciao, Dino.



27 August 2021 11:43, "Spil Oss" <spil.oss@xxxxxxxxx> wrote:

> Hi,
> 
> I've been experiencing a failed apache2 service on Ubuntu 21.04 when
> performing a reload using the `systemctl reload apache2` command. The
> command does not always fail, but seems to be failing more often as
> the number of vhosts increases (currently ca 120). My
> 
> The `systemctl reload apache2` command exits without error, but the
> service ends up in a failed state. Running `systemctl start apache2`
> after this failure starts the service without issues.
> I had taken to running `systemctl reload apache2; systemctl status
> apache2` to validate that I have a running service, but this would
> report "success" even when the service is "failed".
> 
> Expecting some timing issue, I increased the "RestartSec" systemd
> parameter to 500ms using
> `/etc/systemd/system/apache2.service.d/override.conf`
> [Service]
> RestartSec=500ms
> 
> This has not fixed the issue either.
> 
> Testing the reload using `apachectl -k graceful` can also trigger the
> "failed" state of the process.
> 
> The consistent error is with the persistence of shared memory
> segments. The indicated error is incorrect, there's plenty of space on
> the filesystem. The configuration has been kept as close as possible
> to the default Ubuntu config.
> 
> My gut feeling is some weird interaction between graceful and systemd
> as seen in the logs. The RestartSec change not solving the problem
> kind of goes against that.
> 
> Any help appreciated! Thanks, Bernard Spil.
> 
> $ df -h /var/run/apache2/
> Filesystem Size Used Avail Use% Mounted on
> tmpfs 1.6G 6.6M 1.6G 1% /run
> 
> $ ls mods-enabled/*.load | sed 's/mods-enabled\///;s/\.load//'
> access_compat
> alias
> auth_mellon
> authn_core
> authn_file
> authz_core
> authz_host
> authz_user
> brotli
> deflate
> dir
> env
> filter
> headers
> http2
> lbmethod_byrequests
> mime
> mpm_event
> negotiation
> proxy
> proxy_balancer
> proxy_http
> proxy_http2
> proxy_wstunnel
> remoteip
> reqtimeout
> rewrite
> setenvif
> slotmem_shm
> socache_shmcb
> ssl
> status
> 
> /var/log/apache2/error.log:
> [Fri Aug 27 00:00:18.881934 2021] [mpm_event:notice] [pid 138928:tid
> 140168396681856] AH00493: SIGUSR1 received. Doing graceful restart
> [Fri Aug 27 00:00:19.155640 2021] [slotmem_shm:error] [pid 138928:tid
> 140168396681856] (28)No space left on device: AH02611: create:
> apr_shm_create(/var/run/apache2/slotmem-shm-pd38fd8d0_acc_example_org_6.shm)
> failed
> [Fri Aug 27 00:00:19.155679 2021] [:emerg] [pid 138928:tid
> 140168396681856] AH00020: Configuration Failed, exiting
> [Fri Aug 27 00:05:01.645184 2021] [core:warn] [pid 166984:tid
> 140602886914688] AH00098: pid file /var/run/apache2/apache2.pid
> overwritten -- Unclean shutdown of previous Apache run?
> [Fri Aug 27 00:05:01.656692 2021] [mpm_event:notice] [pid 166984:tid
> 140602886914688] AH00489: Apache/2.4.46 (Ubuntu) OpenSSL/1.1.1j
> configured -- resuming normal operations
> [Fri Aug 27 00:05:01.656748 2021] [core:notice] [pid 166984:tid
> 140602886914688] AH00094: Command line: '/usr/sbin/apache2'
> 
> journalctl:
> Aug 27 02:00:13 web01.example.org systemd[1]: Starting Rotate log files...
> Aug 27 02:00:18 web01.example.org systemd[1]: Reloading The Apache HTTP Server.
> Aug 27 02:00:18 web01.example.org systemd[1]: Reloaded The Apache HTTP Server.
> Aug 27 02:00:18 web01.example.org systemd[1]: logrotate.service: Succeeded.
> Aug 27 02:00:18 web01.example.org systemd[1]: Finished Rotate log files.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Main
> process exited, code=exited, status=1/FAILURE
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164444 (apache2) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 166489 (apache2) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164446 (apache2) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164448 (apache2) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164450 (apache2) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164452 (apache2) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164454 (apache2) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164456 (apache2) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164458 (apache2) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164459 (n/a) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164460 (apache2) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164461 (n/a) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164462 (apache2) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164463 (n/a) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164464 (apache2) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164465 (n/a) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164466 (apache2) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164467 (n/a) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164468 (n/a) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164469 (n/a) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164470 (n/a) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164471 (n/a) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164472 (n/a) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164473 (n/a) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164474 (apache2) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164475 (n/a) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164476 (n/a) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164477 (n/a) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164478 (n/a) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164479 (n/a) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164480 (apache2) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164481 (n/a) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164482 (n/a) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164483 (n/a) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164484 (n/a) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164485 (n/a) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164486 (n/a) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164487 (n/a) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164488 (n/a) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164489 (n/a) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164490 (n/a) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164491 (n/a) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164490 (n/a) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164491 (n/a) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164492 (n/a) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164493 (n/a) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164494 (n/a) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164495 (n/a) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164496 (n/a) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164497 (n/a) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164498 (n/a) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164499 (n/a) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164500 (n/a) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164501 (n/a) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164502 (n/a) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164503 (n/a) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164504 (n/a) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164505 (n/a) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164506 (n/a) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164507 (n/a) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164508 (n/a) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164509 (n/a) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164510 (n/a) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164511 (n/a) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164512 (n/a) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164513 (n/a) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164514 (n/a) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164515 (apache2) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164516 (apache2) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164517 (apache2) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164518 (apache2) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164519 (apache2) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164520 (apache2) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164521 (n/a) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164522 (n/a) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164523 (apache2) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164524 (apache2) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164525 (apache2) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164526 (apache2) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164527 (apache2) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164528 (apache2) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164529 (n/a) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164530 (n/a) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164531 (n/a) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164532 (apache2) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164533 (n/a) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164534 (n/a) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164535 (n/a) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164536 (n/a) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164537 (n/a) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164538 (apache2) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164539 (apache2) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164540 (apache2) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164573 (apache2) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Killing
> process 164606 (n/a) with signal SIGKILL.
> Aug 27 02:00:19 web01.example.org systemd[1]: apache2.service: Failed
> with result 'exit-code'.
> Aug 27 02:05:01 web01.example.org CRON[166975]:
> pam_unix(cron:session): session opened for user root by (uid=0)
> Aug 27 02:05:01 web01.example.org CRON[166976]: (root) CMD
> (/root/bin/apache-workaround)
> Aug 27 02:05:01 web01.example.org systemd[1]: Starting The Apache HTTP Server...
> Aug 27 02:05:01 web01.example.org systemd[1]: Started The Apache HTTP Server.
> 
> When running a vanilla `apachectl -k graceful`:
> Aug 27 10:47:18 web01.sias.intra.lighting.com sudo[176897]: sysadmin:
> TTY=pts/0 ; PWD=/etc/apache2 ; USER=root ; COMMAND=/usr/sbin/apachectl
> -k graceful
> Aug 27 10:47:18 web01.sias.intra.lighting.com sudo[176897]:
> pam_unix(sudo:session): session opened for user root by
> sysadmin(uid=1000)
> Aug 27 10:47:18 web01.sias.intra.lighting.com sudo[176897]:
> pam_unix(sudo:session): session closed for user root
> Aug 27 10:47:19 web01.sias.intra.lighting.com systemd[1]:
> apache2.service: Main process exited, code=exited, status=1/FAILURE
> Aug 27 10:47:19 web01.sias.intra.lighting.com systemd[1]:
> apache2.service: Killing process 173239 (apache2) with signal SIGKILL.
> Aug 27 10:47:19 web01.sias.intra.lighting.com systemd[1]:
> apache2.service: Killing process 174924 (apache2) with signal SIGKILL.
> ...snip...
> Aug 27 10:47:19 web01.sias.intra.lighting.com systemd[1]:
> apache2.service: Killing process 176797 (apache2) with signal SIGKILL.
> Aug 27 10:47:19 web01.sias.intra.lighting.com systemd[1]:
> apache2.service: Killing process 176838 (n/a) with signal SIGKILL.
> Aug 27 10:47:19 web01.sias.intra.lighting.com systemd[1]:
> apache2.service: Failed with result 'exit-code'.
> Aug 27 10:47:20 web01.sias.intra.lighting.com sudo[176903]: sysadmin:
> TTY=pts/0 ; PWD=/etc/apache2 ; USER=root ; COMMAND=/usr/bin/systemctl
> status apache2
> Aug 27 10:47:20 web01.sias.intra.lighting.com sudo[176903]:
> pam_unix(sudo:session): session opened for user root by
> sysadmin(uid=1000)
> Aug 27 10:47:20 web01.sias.intra.lighting.com sudo[176903]:
> pam_unix(sudo:session): session closed for user root
> Aug 27 10:47:34 web01.sias.intra.lighting.com sudo[176906]: sysadmin:
> TTY=pts/0 ; PWD=/etc/apache2 ; USER=root ; COMMAND=/usr/bin/systemctl
> start apache2
> Aug 27 10:47:34 web01.sias.intra.lighting.com sudo[176906]:
> pam_unix(sudo:session): session opened for user root by
> sysadmin(uid=1000)
> Aug 27 10:47:34 web01.sias.intra.lighting.com systemd[1]: Starting The
> Apache HTTP Server...
> Aug 27 10:47:34 web01.sias.intra.lighting.com systemd[1]: Started The
> Apache HTTP Server.
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe@xxxxxxxxxxxxxxxx
> For additional commands, e-mail: users-help@xxxxxxxxxxxxxxxx





