08.03.2018 02:37, Jakob Schürz wrote:
> Hi there!
>
> I build a test-unit
>
> # cat test@.service
> [Unit]
> Description=Testservice notification
> OnFailure=notification-telegram@%n.service
>
> [Service]
> Type=simple
> Restart=on-failure
> #RestartSec=2
> ExecStart=/bin/%i
> SyslogIdentifier=test@%i.service
> StartLimitBurst=5
> StartLimitInterval=10
>
>
> And the notification-Unit notification-telegram@%n.service
>
> # cat notification-telegram@.service
> [Unit]
> Description=Send failure-notification about %i to telegram
>
> [Service]
> User=jakob
> ExecStart=/bin/bash -c "/usr/local/bin/ntfy -b telegram send \"FAILED\n$(systemctl status %i)\""
>
> When i start the Test-Unit with systemctl start test@false i get 5
> Messages in telegram...
>
> The log is:
> Mär 08 00:31:53 aldebaran systemd[1]: Started Testservice notification.
> Mär 08 00:31:53 aldebaran systemd[1]: test@false.service: Main process exited, code=exited, status=1/FAILURE
> Mär 08 00:31:53 aldebaran systemd[1]: test@false.service: Failed with result 'exit-code'.
> Mär 08 00:31:53 aldebaran systemd[1]: test@false.service: Triggering OnFailure= dependencies.
> Mär 08 00:31:54 aldebaran systemd[1]: test@false.service: Service hold-off time over, scheduling restart.
> Mär 08 00:31:54 aldebaran systemd[1]: test@false.service: Scheduled restart job, restart counter is at 1.
> Mär 08 00:31:54 aldebaran systemd[1]: Stopped Testservice notification.
> Mär 08 00:31:54 aldebaran systemd[1]: Started Testservice notification.
> Mär 08 00:31:54 aldebaran systemd[1]: test@false.service: Main process exited, code=exited, status=1/FAILURE
> Mär 08 00:31:54 aldebaran systemd[1]: test@false.service: Failed with result 'exit-code'.
> Mär 08 00:31:54 aldebaran systemd[1]: test@false.service: Triggering OnFailure= dependencies.
> Mär 08 00:31:54 aldebaran systemd[1]: test@false.service: Service hold-off time over, scheduling restart.
> Mär 08 00:31:54 aldebaran systemd[1]: test@false.service: Scheduled restart job, restart counter is at 2.
> Mär 08 00:31:54 aldebaran systemd[1]: Stopped Testservice notification.
> Mär 08 00:31:54 aldebaran systemd[1]: Started Testservice notification.
> Mär 08 00:31:54 aldebaran systemd[1]: test@false.service: Main process exited, code=exited, status=1/FAILURE
> Mär 08 00:31:54 aldebaran systemd[1]: test@false.service: Failed with result 'exit-code'.
> Mär 08 00:31:54 aldebaran systemd[1]: test@false.service: Triggering OnFailure= dependencies.
> Mär 08 00:31:54 aldebaran systemd[1]: test@false.service: Service hold-off time over, scheduling restart.
> Mär 08 00:31:54 aldebaran systemd[1]: test@false.service: Scheduled restart job, restart counter is at 3.
> Mär 08 00:31:54 aldebaran systemd[1]: Stopped Testservice notification.
> Mär 08 00:31:54 aldebaran systemd[1]: Started Testservice notification.
> Mär 08 00:31:54 aldebaran systemd[1]: test@false.service: Main process exited, code=exited, status=1/FAILURE
> Mär 08 00:31:54 aldebaran systemd[1]: test@false.service: Failed with result 'exit-code'.
> Mär 08 00:31:54 aldebaran systemd[1]: test@false.service: Triggering OnFailure= dependencies.
> Mär 08 00:31:54 aldebaran systemd[1]: test@false.service: Service hold-off time over, scheduling restart.
> Mär 08 00:31:54 aldebaran systemd[1]: test@false.service: Scheduled restart job, restart counter is at 4.
> Mär 08 00:31:54 aldebaran systemd[1]: Stopped Testservice notification.
> Mär 08 00:31:54 aldebaran systemd[1]: test@false.service: Start request repeated too quickly.
> Mär 08 00:31:54 aldebaran systemd[1]: test@false.service: Failed with result 'exit-code'.
> Mär 08 00:31:54 aldebaran systemd[1]: Failed to start Testservice notification.
> Mär 08 00:31:54 aldebaran systemd[1]: test@false.service: Triggering OnFailure= dependencies.
>
>
> You see, the unit from OnFailure= is called 5 times, not only at the
> "Failed to start Testservice notification" time.
>
> The man page says:
>
> OnFailure=
>     A space-separated list of one or more units that are activated
>     when this unit enters the "failed" state. A service unit using
>     Restart= enters the failed state only after the start limits
>     are reached.
>

This is apparently wrong: the service briefly passes through the "failed"
state every time it fails. It is true that if Restart= is set, the
"activating" state follows immediately, but the OnFailure= actions are
still taken. So from the end-user perspective the unit indeed remains
"failed" only once the limits are reached, but internally it transitions
through the "failed" state on every failure.

> But in this test case, the unit listed in OnFailure= is called every
> time the unit fails, restarts and fails again, and after 5 times
> (=StartLimitBurst) the unit falls into the failed state... This should
> be the only time "OnFailure=" is hit...
>
> My systemd version is 237-3 from Debian.
>
> Should I file a bug at bugs.freedesktop.org?
>

You should create an issue on GitHub; that is where the primary bug
tracker is today: https://github.com/systemd/systemd/
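
If you attach a journal excerpt to the issue, it may help to state how often the OnFailure= dependencies actually fired. Below is a minimal sketch in Python that counts the "Triggering OnFailure= dependencies" messages for one unit. The helper name count_onfailure_triggers and the SAMPLE text are illustrative assumptions (SAMPLE is a shortened excerpt adapted from the log quoted in this thread, with the unit name written as test@false.service); it only parses saved text and does not talk to systemd.

```python
import re

# Hypothetical helper, not part of systemd: it scans saved journal text
# for the message systemd logs when it triggers OnFailure= dependencies.
TRIGGER = re.compile(
    r"systemd\[\d+\]: (?P<unit>\S+): Triggering OnFailure= dependencies\."
)


def count_onfailure_triggers(journal_text, unit):
    """Return how many times OnFailure= was triggered for `unit`."""
    return sum(
        1 for match in TRIGGER.finditer(journal_text)
        if match.group("unit") == unit
    )


# Shortened excerpt adapted from the journal output quoted in this thread.
SAMPLE = """\
Mär 08 00:31:53 aldebaran systemd[1]: test@false.service: Failed with result 'exit-code'.
Mär 08 00:31:53 aldebaran systemd[1]: test@false.service: Triggering OnFailure= dependencies.
Mär 08 00:31:54 aldebaran systemd[1]: test@false.service: Scheduled restart job, restart counter is at 1.
Mär 08 00:31:54 aldebaran systemd[1]: test@false.service: Failed with result 'exit-code'.
Mär 08 00:31:54 aldebaran systemd[1]: test@false.service: Triggering OnFailure= dependencies.
"""

if __name__ == "__main__":
    print(count_onfailure_triggers(SAMPLE, "test@false.service"))
```

Run against the full log from the original report, the same count should match the number of Telegram messages received.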