> Simon Matter via CentOS wrote:
>>
>>> We are seeing a problem that occurs ~5% of the time when rebooting
>>
>> I see such issues on a quite large multi user system but when this
>> happens, after forced restarts for kernel updates, I usually don't have
>> the time to analyze and play doctor on it. My "solution" now is to
>> simply reboot the server again in such a case, AKA the systemd way :-)
>>
>>> CentOS 7.7 where systemd gets a 'Connection timed out' to D-Bus just
>>> after the D-Bus service starts - from 'journalctl -x' :
>>>
>>> ...
>>> Jan 21 16:09:59 linux7-7.mpc.local systemd[1]: Started D-Bus System
>>> Message Bus.
>>> -- Subject: Unit dbus.service has finished start-up
>>> -- Defined-By: systemd
>>> -- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
>>> --
>>> -- Unit dbus.service has finished starting up.
>>> --
>>> -- The start-up result is done.
>>> Jan 21 16:10:24 linux7-7.mpc.local systemd[1]: Failed to register match
>>> for Disconnected message: Connection timed out
>>> Jan 21 16:10:24 linux7-7.mpc.local systemd[1]: Failed to initialize
>>> D-Bus connection: Connection timed out
>>> ...
>>>
>>> This then has a knock-on effect that causes other services to fail - e.g.
>>>
>>> -- Unit gdm.service has begun starting up.
>>> Jan 21 16:10:39 linux7-7.mpc.local dbus[817]: [system] Activating
>>> systemd to hand-off: service name='org.freedesktop.login1'
>>> unit='dbus-org.freedesktop.login1.service'
>>> Jan 21 16:10:50 linux7-7.mpc.local dbus[817]: [system] Failed to
>>> activate service 'org.freedesktop.systemd1': timed out
>>> Jan 21 16:10:50 linux7-7.mpc.local systemd-logind[1221]: Failed to
>>> enable subscription: Failed to activate service
>>> 'org.freedesktop.systemd1': timed out
>>> Jan 21 16:10:50 linux7-7.mpc.local systemd-logind[1221]: Failed to
>>> fully start up daemon: Connection timed out
>>> Jan 21 16:10:50 linux7-7.mpc.local systemd[1]: systemd-logind.service:
>>> main process exited, code=exited, status=1/FAILURE
>>> Jan 21 16:10:50 linux7-7.mpc.local systemd[1]: Failed to start Login
>>> Service.
>>> -- Subject: Unit systemd-logind.service has failed
>>> -- Defined-By: systemd
>>> -- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
>>> --
>>> -- Unit systemd-logind.service has failed.
>>> --
>>> -- The result is failed.
>>>
>>> Whatever the issue is, it appears that polkit might be involved - if we
>>> restart the polkit service, things appear to return to normal (e.g. gdm
>>> starts up etc)
>>>
>>> We can't find any similar reports of this happening elsewhere with
>>> CentOS 7.7 - but we were wondering if anyone else had come across a
>>> problem like this?
>>
>> I think the root of the problem is that there are missing definitions in
>> some of the systemd scripts. They allow things to work in 95% or greater
>> of the cases but this happens by chance, not because of perfect process
>> handling and system control. Small delays somewhere or uncommon system
>> environments then lead to intermittent failures which are difficult to
>> diagnose - at least for me.
>>
>> The good news is that you can just fiddle with the systemd scripts the
>> same way we fiddled with init scripts in the past.
>> That way you can try and error until you find a solution. Doesn't sound
>> like being in full control of things but better than not finding a
>> solution at all.
>
> Yeah, we found that by introducing a small delay before the ExecStart in
> the dbus.service unit - even a delay of just 0.01 seconds (via
> 'ExecStartPre=/usr/bin/sleep 0.01') _seems_ to workaround the issue ...

Nice that you found at least a workaround. I think I remember that dbus is
quite special here because systemd starts it but also depends on it. At
least I remember cases where dbus went crazy for whatever reason: the
result was that systemd became completely unresponsive and unmanageable,
and the whole system went down the drain, slowly but steadily. Ever tried
to shut down a box when systemd doesn't listen to you anymore? The perfect
Windows experience on Linux ;-)

> However, we would still like to know what the issue is and get a 'real'
> fix - I guess we could try creating a bug report with Redhat ...

By bug report, do you mean a BZ or a support request as a paying RHEL
customer? Unfortunately I'm not too happy anymore with how BZs are handled
these days. Am I alone in this feeling?

Regards,
Simon

_______________________________________________
CentOS mailing list
CentOS@xxxxxxxxxx
https://lists.centos.org/mailman/listinfo/centos
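
For reference, the delay workaround quoted above could be applied as a drop-in rather than by editing the packaged dbus.service file. This is only a minimal sketch: the 'ExecStartPre=/usr/bin/sleep 0.01' line is taken from the thread, while the drop-in path and the file name delay.conf are assumptions chosen for illustration, and nothing here explains or fixes the underlying race.

```ini
# Assumed location (hypothetical file name):
#   /etc/systemd/system/dbus.service.d/delay.conf
#
# A drop-in is merged with the packaged dbus.service, so the original
# ExecStart line stays untouched and package updates do not overwrite this.

[Service]
# Workaround from the thread: wait briefly before dbus-daemon is started.
# The 0.01 second value is the one reported to _seem_ to help; it is not
# a tuned or verified number.
ExecStartPre=/usr/bin/sleep 0.01
```

After creating the file, 'systemctl daemon-reload' makes systemd pick it up, and 'systemctl cat dbus.service' should show the drop-in appended to the unit. The delay only takes effect the next time dbus.service is started, which in practice means the next boot.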