On Sun, 04.02.24 00:06, David Timber (dxdt@xxxxxxxxxxxx) wrote:

> Systemd crashed on me the other day. I was writing up some Systemd units and
> testing them out by daemon-reload every time I wanted to test them out. Not
> the best way to go on about, I know. My bad abusing Systemd to the point of
> crashing. Perhaps it was just a bit flip that caused this.
>
> systemd[2368]: Assertion 'path_is_absolute(p)' failed at
> src/basic/chase.c:628, function chase(). Aborting.
> systemd[1]: Assertion 'path_is_absolute(p)' failed at
> src/basic/chase.c:628, function chase(). Aborting.
> systemd[1]: Caught <ABRT> from our own process.
> systemd-coredump[32497]: Due to PID 1 having crashed coredump
> collection will now be turned off.
> systemd-coredump[32497]: [🡕] Process 32496 (systemd) of user 0
> dumped core.
> systemd[1]: Caught <ABRT>, dumped core as pid 32496.
> systemd[1]: Freezing execution.
>
> ...
>
> systemd-journald[871]: Failed to send stream file descriptor to
> service manager: Transport endpoint is not connected
>
> I didn't even bother trying producing stack trace. I can get on that if
> anyone wants it. My machine started doing some weird things like
> Firefox not

If this is a current systemd version (v255), please generate a stack
trace and submit it as a GitHub issue to us, and we'll look into it. If
it's older, please report to your distro first.

> being able to do Ajax properly whilst being able to go to a new page,
> Chromium not being able to create a new tab whilst all the text editors
> worked just fine, all the systemctl commands timing out. So basically, I was
> using Linux without fork(). Anyway.

> Well, I think any software can crash for any reason whatsoever. The
> problem

Yeah, an assert like the above is an error we need to fix in systemd.

> with Systemd I realised from this incident is that I had no way of knowing
> that Systemd had crashed until I opened up the journal and kernel logs and
> saw that Systemd had crashed some time ago.
> In this particular incident,
> Systemd caught the signal and decided to just freeze. No idea why you'd want
> that because if it had just crashed, the kernel would have just panicked and
> I would have realised something went wrong.
>
> 1: So I decided that I need a some sort of "watchdog" that warns me when
> something like this happens. Using dbus to poll the status of the Systemd
> process, it could be a GUI app running under a seat, just a daemon that
> writes a warning message using `wall` or just send mail using a primed up
> MUA process. I wonder if someone already had the same idea and went on to
> make one.

You can just use the usual hw watchdog. If PID 1 dies, it will no longer
ping the hw watchdog, and thus a reset is triggered automatically. In
fact, we actually configure the hw watchdog by default these days on hw
that has it (which is most PCs).

> 2: How do I get Systemd to freeze to test such program? I mean, if I kill
> Systemd, the kernel would crash so I have to somehow tell Systemd to freeze?

Not really, the kernel blocks SIGSTOP for PID 1.

Lennart

--
Lennart Poettering, Berlin
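
For reference, the userspace "watchdog" idea from question 1 could look
roughly like the sketch below. It is only an illustration, not something
systemd ships: it assumes that a frozen PID 1 shows up as systemctl calls
hanging (as described in the report), and the names manager_responds(),
warn() and watch(), the timeout and the interval are all made up for this
example. A hang can also have other causes (a broken D-Bus connection, for
instance), so a warning is a hint rather than proof.

```python
import subprocess
import time


def manager_responds(cmd=("systemctl", "is-system-running"), timeout=5.0):
    """Return True if cmd finishes within timeout seconds.

    The exit status is deliberately ignored: a "degraded" system still
    answers. Only a hang, or a command that cannot be started, is
    reported as False.
    """
    try:
        subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run() kills the child before raising this.
        return False
    except OSError:
        # Command missing or not executable.
        return False
    return True


def warn(message, notify_cmd=("wall",)):
    """Send message to logged-in users; wall(1) is an assumed default."""
    subprocess.run([*notify_cmd, message], check=False)


def watch(interval=30.0, timeout=5.0):
    """Poll forever and warn whenever the service manager does not answer."""
    while True:
        if not manager_responds(timeout=timeout):
            warn("systemd (PID 1) is not responding; check the journal and kernel log")
        time.sleep(interval)
```

Calling watch() from a terminal or from a small user service would start
the polling loop. Since it spawns a process for every check, it relies on
fork() still working, which the report suggests is only partly true once
PID 1 has frozen. The hw watchdog does not have that limitation.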