I'm seeing "systemctl stop <service>" for several services taking a long time because it goes through the timeout process, even though all relevant processes have exited. I'll give 2 examples. Both examples are running inside a privileged Rocky Linux 8.8 Docker container on a Rocky Linux 8.8 host. The systemd version, reported by "systemctl --version" in the container is: systemd 239 (239-74.el8_8.5) Here is ctdb.system: [Unit] Description=CTDB Documentation=man:ctdbd(1) man:ctdb(7) After=network-online.target time-sync.target ConditionFileNotEmpty=/etc/ctdb/nodes [Service] Type=forking LimitCORE=infinity LimitNOFILE=1048576 TasksMax=4096 PIDFile=/var/run/ctdb/ctdbd.pid ExecStart=/usr/sbin/ctdbd ExecStop=/usr/bin/ctdb shutdown KillMode=control-group Restart=no [Install] WantedBy=multi-user.target "/usr/bin/ctdb shutdown" causes a controlled shutdown. In many cases, starting and then stopping using systemctl works fine. However, many times it takes >90s to stop, as per TimeoutStopSec. If I reduce that value then the duration reduces accordingly. I can confirm using both "ps auxfww" and "systemd-cgls" that within the container there are no relevant processes a moment after "systemctl stop ctdb" is run. In particular, in systemd-cgls ctdb.service is gone but "systemctl stop ctdb" is still waiting. Before attempting to stop, the service is successfully started: Oct 11 00:56:44 rocky1 systemd[710741]: ctdb.service: Executing: /usr/sbin/ctdbd Oct 11 00:56:44 rocky1 ctdbd[710741]: CTDB logging to location file:/var/log/log.ctdb Oct 11 00:56:44 rocky1 systemd[1]: Received SIGCHLD from PID 710741 (ctdbd). Oct 11 00:56:44 rocky1 systemd[1]: Child 710741 (ctdbd) died (code=exited, status=0/SUCCESS) Oct 11 00:56:44 rocky1 systemd[1]: ctdb.service: Child 710741 belongs to ctdb.service. Oct 11 00:56:44 rocky1 systemd[1]: ctdb.service: Control process exited, code=exited status=0 Oct 11 00:56:44 rocky1 systemd[1]: ctdb.service: Got final SIGCHLD for state start. Oct 11 00:56:44 rocky1 systemd[1]: ctdb.service: New main PID 710742 belongs to service, we are happy. Oct 11 00:56:44 rocky1 systemd[1]: ctdb.service: Main PID loaded: 710742 Oct 11 00:56:44 rocky1 systemd[1]: ctdb.service: Changed start -> running Oct 11 00:56:44 rocky1 systemd[1]: ctdb.service: Job ctdb.service/start finished, result=done Oct 11 00:56:44 rocky1 systemd[1]: Started CTDB. -- Subject: Unit ctdb.service has finished start-up -- Defined-By: systemd -- Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel -- -- Unit ctdb.service has finished starting up. -- -- The start-up result is done. The relevant part of the log while stopping seems to be: Oct 11 00:56:47 rocky1 systemd[1]: Received SIGCHLD from PID 710743 (ctdb-eventd). Oct 11 00:56:47 rocky1 systemd[1]: Child 710742 (ctdbd) died (code=exited, status=0/SUCCESS) Oct 11 00:56:47 rocky1 systemd[1]: ctdb.service: Child 710742 belongs to ctdb.service. Oct 11 00:56:47 rocky1 systemd[1]: ctdb.service: Can't open PID file /var/run/ctdb/ctdbd.pid (yet?) after stop: No such file or directory Oct 11 00:56:47 rocky1 systemd[1]: ctdb.service: Main process exited, code=exited, status=0/SUCCESS Oct 11 00:56:47 rocky1 systemd[1]: Sent message type=signal sender=org.freedesktop.systemd1 destination=n/a path=/org/freedesktop/systemd1/unit/ctdb_2eservice interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=54 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a Oct 11 00:56:47 rocky1 systemd[1]: Sent message type=signal sender=org.freedesktop.systemd1 destination=n/a path=/org/freedesktop/systemd1/unit/ctdb_2eservice interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=55 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a Oct 11 00:56:47 rocky1 systemd[1]: Child 710743 (ctdb-eventd) died (code=exited, status=0/SUCCESS) Oct 11 00:56:47 rocky1 systemd[1]: ctdb.service: Child 710743 belongs to ctdb.service. Oct 11 00:56:47 rocky1 systemd[1]: systemd-journald.service: Received EPOLLHUP on stored fd 18 (stored), closing. Oct 11 00:56:47 rocky1 systemd[1]: Received SIGCHLD from PID 710860 (ctdb). Oct 11 00:56:47 rocky1 systemd[1]: Child 710860 (ctdb) died (code=exited, status=0/SUCCESS) Oct 11 00:56:47 rocky1 systemd[1]: ctdb.service: Child 710860 belongs to ctdb.service. Oct 11 00:56:47 rocky1 systemd[1]: ctdb.service: Control process exited, code=exited status=0 Oct 11 00:56:47 rocky1 systemd[1]: ctdb.service: Got final SIGCHLD for state stop. Oct 11 00:56:47 rocky1 systemd[1]: ctdb.service: Changed stop -> stop-sigterm Oct 11 00:56:47 rocky1 systemd[1]: Sent message type=signal sender=org.freedesktop.systemd1 destination=n/a path=/org/freedesktop/systemd1/unit/ctdb_2eservice interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=56 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a Oct 11 00:56:47 rocky1 systemd[1]: Sent message type=signal sender=org.freedesktop.systemd1 destination=n/a path=/org/freedesktop/systemd1/unit/ctdb_2eservice interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=57 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a Oct 11 00:56:47 rocky1 systemd[1]: Received SIGCHLD from PID 710834 (ctdb_lock_helpe). Oct 11 00:56:47 rocky1 systemd[1]: Child 710834 (ctdb_lock_helpe) died (code=exited, status=0/SUCCESS) Oct 11 00:56:47 rocky1 systemd[1]: ctdb.service: Child 710834 belongs to ctdb.service. Oct 11 00:56:47 rocky1 systemd[1]: Received SIGCHLD from PID 710835 (ctdb_lock_helpe). Oct 11 00:56:47 rocky1 systemd[1]: Child 710832 (ctdb_lock_helpe) died (code=exited, status=0/SUCCESS) Oct 11 00:56:47 rocky1 systemd[1]: ctdb.service: Child 710832 belongs to ctdb.service. Oct 11 00:56:47 rocky1 systemd[1]: Child 710835 (ctdb_lock_helpe) died (code=exited, status=0/SUCCESS) Oct 11 00:56:47 rocky1 systemd[1]: ctdb.service: Child 710835 belongs to ctdb.service. Oct 11 00:56:47 rocky1 systemd[1]: Child 710833 (ctdb_lock_helpe) died (code=exited, status=0/SUCCESS) Oct 11 00:56:47 rocky1 systemd[1]: ctdb.service: Child 710833 belongs to ctdb.service. Oct 11 00:56:47 rocky1 systemd[1]: Child 710836 (ctdb_lock_helpe) died (code=exited, status=0/SUCCESS) Oct 11 00:56:47 rocky1 systemd[1]: ctdb.service: Child 710836 belongs to ctdb.service. Oct 11 00:56:47 rocky1 systemd[1]: Child 710837 (ctdb_lock_helpe) died (code=exited, status=0/SUCCESS) Oct 11 00:56:47 rocky1 systemd[1]: ctdb.service: Child 710837 belongs to ctdb.service. Oct 11 00:56:47 rocky1 systemd[1]: Child 710838 (ctdb_lock_helpe) died (code=exited, status=0/SUCCESS) Oct 11 00:56:47 rocky1 systemd[1]: ctdb.service: Child 710838 belongs to ctdb.service. Oct 11 00:56:47 rocky1 systemd[1]: Child 710839 (ctdb_lock_helpe) died (code=exited, status=0/SUCCESS) Oct 11 00:56:47 rocky1 systemd[1]: ctdb.service: Child 710839 belongs to ctdb.service. Oct 11 00:56:47 rocky1 systemd[1]: Received SIGCHLD from PID 710832 (n/a). Oct 11 00:57:44 rocky1 systemd[1]: systemd-journald.service: Got notification message from PID 840 (WATCHDOG=1) Oct 11 00:58:17 rocky1 systemd[1]: ctdb.service: State 'stop-sigterm' timed out. Killing. Oct 11 00:58:17 rocky1 systemd[1]: ctdb.service: Failed with result 'timeout'. -- Subject: Unit failed -- Defined-By: systemd -- Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel -- -- The unit ctdb.service has entered the 'failed' state with result 'timeout'. Oct 11 00:58:17 rocky1 systemd[1]: ctdb.service: Changed stop-sigterm -> failed Oct 11 00:58:17 rocky1 systemd[1]: Sent message type=signal sender=org.freedesktop.systemd1 destination=n/a path=/org/freedesktop/systemd1 interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=58 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a Oct 11 00:58:17 rocky1 systemd[1]: ctdb.service: Job ctdb.service/stop finished, result=done Oct 11 00:58:17 rocky1 systemd[1]: Stopped CTDB. -- Subject: Unit ctdb.service has finished shutting down -- Defined-By: systemd -- Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel -- -- Unit ctdb.service has finished shutting down. Oct 11 00:58:17 rocky1 systemd[1]: Sent message type=signal sender=org.freedesktop.systemd1 destination=n/a path=/org/freedesktop/systemd1 interface=org.freedesktop.systemd1.Manager member=JobRemoved cookie=59 reply_cookie=0 signature=uoss error-name=n/a error-message=n/a Oct 11 00:58:17 rocky1 systemd[1]: ctdb.service: Unit entered failed state. It would be very useful if systemd could log what it is still waiting for when it times out. Note that during start and stop, CTDB runs a lot of subprocesses, including some that use systemctl to start and stop various services that it, in turn, manages. The full debug level log, after running: systemd-analyze log-level debug is uploaded it to: https://meltin.net/uploads/systemd/ctdb-stop.log I'm happy to reply and attach it, but it is 48KB. The only theory I can come up with is some sort of race where processes are created during shutdown and systemd gets confused. I see a similar thing for a much simpler service, winbind: Here is winbind.service: [Unit] Description=Samba Winbind Daemon Documentation=man:winbindd(8) man:samba(7) man:smb.conf(5) After=network.target nmb.service [Service] Type=notify PIDFile=/var/run/winbindd.pid EnvironmentFile=-/etc/sysconfig/samba ExecStart=/usr/sbin/winbindd --foreground --no-process-group $WINBINDOPTIONS ExecReload=/bin/kill -HUP $MAINPID LimitCORE=infinity [Install] WantedBy=multi-user.target Yesterday I watched it do the same thing as CTDB. I could start the service by hand but it would time out during stop, nearly every time, even though there were no relevant processes running anymore. winbindd sends a READY=1 notification after successfully starting. It does not send STOPPING=1. winbindd is much simpler during shutdown. I can get logs for this one too if necessary. Thanks for any help. peace & happiness, martin