Re: Oneshot killed by timeout

On Tue, Jan 28, 2025 at 4:42 PM Henti Smith <henti@xxxxxxxxxxxxxxxxx> wrote:
Good day all.

I'm having some timeouts on a oneshot service and I cannot explain the failure based on the documentation.

We have a service that runs a script that checks for a valid upstream NTP server before dependent services can start to

Systemd itself has a "systemd-time-wait-sync.service" for that purpose. It waits for the NTP daemon to set the 'Clock in sync' kernel flag via adjtimex (or, really, *unset* the 'Clock out of sync' flag) so it should be compatible with both ntpd and chrony (with rtcsync on) – and with systemd-timesyncd of course.
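
For illustration, here is a rough sketch of how one might check the same 'in sync' state by hand. The function name clock_is_synced is made up for this example; it assumes a systemd new enough to have `timedatectl show`. Units that need a synchronized clock would normally declare Wants=time-sync.target and After=time-sync.target instead of polling like this.

```shell
# Hypothetical helper (not part of systemd): succeeds when timedatectl
# reports that the clock is synchronized.
clock_is_synced() {
    [ "$(timedatectl show -p NTPSynchronized --value)" = "yes" ]
}

# To have time-sync.target wait for synchronization at boot, the wait
# service has to be enabled once (as root):
#   systemctl enable systemd-time-wait-sync.service
```

Something like `clock_is_synced && echo "clock is in sync"` then gives a quick manual check after boot.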
 
set PTP master for other slaves on the network to use to set system time. The service file looks like this:

[Unit]
Description=NTP boot-time synch

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/opt/timesync-master/ntpsync_start.sh
TimeoutStartSec=infinity

[Install]
WantedBy=multi-user.target

For the most part this seems to work, but we're seeing failures like this:
root@server:~# systemctl status custom.ntpsync.service
* oxbotica.ntpsync.service - NTP boot-time synch
     Loaded: loaded (/lib/systemd/system/custom.ntpsync.service; enabled; vendor preset: enabled)
     Active: failed (Result: timeout) since Mon 2024-06-17 20:31:23 UTC; 4 days ago
    Process: 1695 ExecStart=/opt/timesync-master/ntpsync_start.sh (code=killed, signal=TERM)
   Main PID: 1695 (code=killed, signal=TERM)

This seems to indicate that this was a timeout error and that systemd sent TERM to the script, but according to the docs, since we don't set TimeoutStartSec and Type=oneshot is used, the timeout is disabled by default.

I'm not sure how the process got killed if it's a oneshot and the timeout is disabled, but the error seems to indicate it was a timeout TERM?

Run a `systemctl show` on the unit and check what settings are in effect – it might be that a default timeout is set globally in systemd/system.conf or something like that.
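
A minimal sketch of that check, wrapped in small helper functions. The function names are made up for this example and the unit name is the one from your status output; the property names are the ones `systemctl show` prints for service units.

```shell
# Hypothetical helpers for inspecting the effective timeout settings.

# Print the timeout-related properties systemd has in effect for a unit.
show_timeouts() {
    systemctl show -p Type -p TimeoutStartUSec -p TimeoutStopUSec \
        -p RuntimeMaxUSec -p Result "$1"
}

# Print the unit file together with any drop-ins that override it.
show_unit_sources() {
    systemctl cat "$1"
}

# List global default timeouts set in system.conf or its drop-in directory.
show_global_timeouts() {
    grep -rs '^DefaultTimeout' /etc/systemd/system.conf /etc/systemd/system.conf.d/
}
```

Running `show_timeouts custom.ntpsync.service` should show whether TimeoutStartUSec really is infinity, and whether RuntimeMaxUSec or a stop timeout is set instead.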

I'd also try booting with "systemd.log_level=debug" in the kernel command line, in case it adds anything more useful to the journal whenever this happens.
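
Once it has happened again, something along these lines could pull out the relevant messages. This is only a sketch and the function name is made up; monotonic timestamps are used here on the assumption that the wall clock jumps during boot on these machines, which makes the regular timestamps hard to follow.

```shell
# Hypothetical helper: show the current boot's journal entries for one unit,
# using monotonic timestamps that are unaffected by wall-clock jumps.
unit_boot_log() {
    journalctl -b -o short-monotonic -u "$1"
}
```

For example: `unit_boot_log custom.ntpsync.service`.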
 

To paint a fuller picture, these computers are ARM machines without hardware clocks, hence the need for NTP from an external source. On a default boot they revert to the epoch, and during boot the clock gets set to a more up-to-date time, which changes with each firmware update from the vendor.

It might very well be systemd itself doing this; on startup it bumps the clock either to its build timestamp or to the timestamp of "/usr/lib/clock-epoch" or "/var/lib/systemd/timesync/clock", whichever is more recent. (The latter file is periodically touched by systemd-timesyncd.)
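
To see which of those files exist on the device and how recent they are, a small sketch like the following could help. The function name is made up for this example, and it assumes GNU date (for `date -r FILE`); the build timestamp of systemd itself is not covered by it.

```shell
# Hypothetical helper: print the modification time (UTC) of each given file
# that exists, skipping the ones that do not.
clock_floor_candidates() {
    for f in "$@"; do
        if [ -e "$f" ]; then
            printf '%s  %s\n' "$(date -u -r "$f" '+%Y-%m-%d %H:%M:%S')" "$f"
        fi
    done
}
```

Usage would be `clock_floor_candidates /usr/lib/clock-epoch /var/lib/systemd/timesync/clock`; comparing the newest of these against the time the system shows early in boot would tell you whether systemd is the one bumping the clock.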

--
Mantas Mikulėnas
