On Tuesday, 24 October 2017 17:10:39 CEST, Jan Kundrát wrote:
> Hi,
> is it possible to change systemd's global settings for
> RuntimeWatchdogSec at runtime? I would like to have the early
> boot "guarded" by the HW watchdog started by my platform code,
> and for systemd to take over only after a certain target has
> been reached. I was thinking about an extra unit which simply
> writes an appropriate config file, but the docs for `systemctl
> daemon-reload` or `daemon-reexec` do not talk about these
> top-level settings. How do I tell systemd to notice a new value?
>
> Context: I'm using systemd on an embedded ARM box with reliable
> network connectivity. The system has two fully separate
> rootfs/kernel/devicetree instances, A and B. The bootloader
> starts a HW watchdog timer, and the bootloader keeps a counter
> tracking how many times a particular A/B "boot slot" has
> attempted to boot. The kernel ignores the watchdog, and once
> systemd gets launched and checks its system.conf file, it
> proceeds to re-start the WD timer periodically. Finally, a unit
> which is pulled in by my default target updates the bootloader's
> environment, resetting the boot counter.
>
> My goal is to be able to boot a possibly broken image (but not
> a malicious one, of course) without fearing that it's going to
> lock me out of my device. If the new image "fails" for some
> reason, I expect the HW watchdog to reset the system, the boot
> attempt counter to eventually reach zero, and the whole system
> to roll back to the previous image, eventually. In my scenario,
> it's preferred to make the decision to reboot rather than
> waiting for human interaction to solve the actual problem.
> The once-failed slot can be re-flashed very cheaply, and an
> updated version can be re-tried during the next update attempt.
>
> During my testing, I was able to unplug the system's SD card at
> a "wrong" moment which resulted in systemd trying to boot into
> emergency.target and ultimately failing due to a missing rootfs.
> I ended up with an unusable system which did not reboot
> automatically because systemd was periodically pinging the HW
> watchdog timer. [1]
>
> I got a suggestion to adjust the important units so that they
> specify a FailureAction. I do not like that solution because it
> is additional work (identifying which units might fail, coming
> up with various possible failing scenarios, being hard to test
> and get "right" in the face of future systemd updates, etc). It
> also feels like I am attacking the wrong problem. I already *have*
> a watchdog which will shoot the system in the head if
> something wrong happens. Wouldn't it make more sense to rely on
> this piece of infrastructure and start telling the watchdog
> "hey, I'm OK" only after the system has fully booted and my
> ultimate target has been *reached*?
>
> Suggestions which offer additional possibilities are welcome. I
> like systemd's feature set, and I won't pretend that I know all
> of it :).
>
> With kind regards,
> Jan
>
> [1] https://github.com/systemd/systemd/issues/7063

I more or less solved this by *not* configuring systemd to start
pinging the watchdog on its own. Then I added another unit, depending
on and wanted by multi-user.target, which checks whether everything
is OK so far:

  [Unit]
  Description=Pinging the HW watchdog
  Requires=multi-user.target
  After=multi-user.target

  [Service]
  Type=oneshot
  ExecStartPre=/bin/sh -c '[ "$(/bin/systemctl list-units --failed --all --no-legend --no-pager)" = "" ]'
  ExecStart=/bin/busctl set-property org.freedesktop.systemd1 /org/freedesktop/systemd1 org.freedesktop.systemd1.Manager RuntimeWatchdogUSec t 30000000

If any unit has failed, ExecStartPre fails and ExecStart never runs,
so nobody pings the watchdog started by the bootloader and it resets
the box. Otherwise systemd takes over with a timeout of 30000000
microseconds, i.e. 30 seconds.

For more details, see the original bug report at
https://github.com/systemd/systemd/issues/7063 .

Cheers,
Jan
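
For anyone who prefers a script over two Exec lines, the same check
and hand-over could be folded into a small POSIX shell helper. This
is only a sketch under the assumptions of the unit above: the function
names (sec_to_usec, no_failed_units, enable_runtime_watchdog,
arm_watchdog_if_healthy) are made up for illustration, and the
systemctl and busctl invocations are the ones already used in the
unit.

```shell
#!/bin/sh
# Sketch: let systemd take over the HW watchdog only if no unit has failed.
# All function names here are hypothetical helpers, not part of systemd.

# Convert whole seconds to the microseconds RuntimeWatchdogUSec expects.
sec_to_usec() {
    echo "$(( $1 * 1000000 ))"
}

# Succeed only when the given "list-units --failed" output is empty.
no_failed_units() {
    [ -z "$1" ]
}

# Ask PID 1 over D-Bus to start pinging the watchdog with this timeout (s).
enable_runtime_watchdog() {
    busctl set-property org.freedesktop.systemd1 \
        /org/freedesktop/systemd1 \
        org.freedesktop.systemd1.Manager \
        RuntimeWatchdogUSec t "$(sec_to_usec "$1")"
}

# Entry point: check for failed units, then hand over or bail out.
arm_watchdog_if_healthy() {
    failed=$(systemctl list-units --failed --all --no-legend --no-pager)
    if no_failed_units "$failed"; then
        enable_runtime_watchdog "$1"
    else
        # Leave the watchdog alone so that it expires and resets the box.
        echo "failed units present, not arming the watchdog:" >&2
        echo "$failed" >&2
        return 1
    fi
}
```

The block only defines functions; adding a final line such as
`arm_watchdog_if_healthy 30` and pointing a single ExecStart= at the
script would be one way to use it. Keeping the conversion and the
emptiness check in separate functions makes them easy to try out on a
machine without a watchdog.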