Hello Lennart and Michal,
Thank you for your replies. The cgroup file is below; could you please advise which part is relevant to check?
The problem is most likely that systemd thinks the program has stopped, because "systemctl status" reports:
Aug 10 03:57:32 myhost systemd[1]: product_routed.service: Main process exited, code=exited, status=1/FAILURE
Aug 10 03:57:32 myhost systemd[1]: product_routed.service: Failed with result 'exit-code'.
We will look into that, thank you.
# cat /proc/17824/cgroup
12:memory:/
11:pids:/user.slice/user-0.slice/session-623.scope
10:rdma:/
9:hugetlb:/
8:blkio:/
7:devices:/user.slice
6:cpuset:/
5:net_cls,net_prio:/
4:freezer:/
3:perf_event:/
2:cpu,cpuacct:/user.slice/user-0.slice/session-623.scope
1:name=systemd:/user.slice/user-0.slice/session-623.scope
0::/user.slice/user-0.slice/session-623.scope
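For reference, the entry that shows which unit systemd associates a process with is the `name=systemd` line (or the `0::` line on the unified hierarchy). Below is a minimal Python sketch of that check. The helper names `systemd_cgroup_path` and `belongs_to_unit` are hypothetical, and the sketch assumes that a process started by the service manager would sit under a cgroup path containing the unit name (for example `/system.slice/product_routed.service`, which is an assumption about the default slice, not something taken from this host).

```python
def systemd_cgroup_path(cgroup_text):
    """Return the cgroup path systemd tracks, given /proc/<pid>/cgroup text.

    Prefers the legacy `name=systemd` hierarchy and falls back to the
    unified (`0::`) entry. Returns None if neither line is present.
    """
    unified = None
    for line in cgroup_text.splitlines():
        # Each line has the form "<hierarchy-id>:<controllers>:<path>".
        parts = line.strip().split(":", 2)
        if len(parts) != 3:
            continue
        hierarchy_id, controllers, path = parts
        if controllers == "name=systemd":
            return path
        if hierarchy_id == "0" and controllers == "":
            unified = path
    return unified


def belongs_to_unit(cgroup_text, unit_name):
    """True if the tracked cgroup path has unit_name as one of its components."""
    path = systemd_cgroup_path(cgroup_text)
    if path is None:
        return False
    return unit_name in path.strip("/").split("/")


# The output pasted above, used here as sample input.
SAMPLE = """\
12:memory:/
11:pids:/user.slice/user-0.slice/session-623.scope
10:rdma:/
9:hugetlb:/
8:blkio:/
7:devices:/user.slice
6:cpuset:/
5:net_cls,net_prio:/
4:freezer:/
3:perf_event:/
2:cpu,cpuacct:/user.slice/user-0.slice/session-623.scope
1:name=systemd:/user.slice/user-0.slice/session-623.scope
0::/user.slice/user-0.slice/session-623.scope
"""

if __name__ == "__main__":
    print(systemd_cgroup_path(SAMPLE))
    print(belongs_to_unit(SAMPLE, "product_routed.service"))
```

For the output above, the tracked path is the login session scope `session-623.scope` rather than a path containing `product_routed.service`, which would be consistent with the process having been started from a root login session instead of by systemd.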
On Tue, 11 Aug 2020 at 03:08, Lennart Poettering <lennart@xxxxxxxxxxxxxx> wrote:
On Do, 06.08.20 13:59, David Cunningham (dcunningham@xxxxxxxxxxxxx) wrote:
> Hello,
>
> I'm developing a service called product_routed which is managed by systemd.
> The service can normally be stopped with "service product_routed stop" or
> "systemctl stop product_routed", however for some reason after the service
> has been running for a while (a few days or more) the stop command no
> longer works. Can anyone help me find why?
>
> When the application stop works initially (for the first day or two) we see
> a TERM signal sent to the application, as confirmed by logging in the
> application itself (which is written in perl), and is reported by "strace
> -p <pid> -e 'trace=!all'". However once the problem starts no signal is
> sent to the application at all when "service product_routed stop" or
> "systemctl stop product_routed" is run.
Note that in systemd, issuing another "systemctl stop" for a unit that
is already stopped is a NOP and doesn't result in another SIGTERM
being sent.
So, when you issue your second "systemctl stop", is the service
actually running in systemd's eyes? (i.e. what does "systemctl status"
say about the service?)
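The check described above can be sketched roughly as follows in Python. This is only an illustration: the helper names `parse_show_output`, `systemd_considers_running` and `query_unit` are hypothetical, and the sketch assumes that a unit whose `ActiveState` is `inactive` or `failed` is one for which a further "systemctl stop" sends no signal. It reads the `ActiveState`, `SubState` and `MainPID` properties via "systemctl show".

```python
import subprocess


def parse_show_output(text):
    """Parse the KEY=VALUE lines printed by `systemctl show` into a dict."""
    props = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            props[key] = value
    return props


def systemd_considers_running(props):
    """True unless systemd already regards the unit as stopped.

    Assumption: `inactive` and `failed` are the states in which a
    further stop request is a no-op.
    """
    state = props.get("ActiveState")
    return state is not None and state not in ("inactive", "failed")


def query_unit(unit):
    """Ask systemd for the unit's state (requires systemctl on the host)."""
    result = subprocess.run(
        ["systemctl", "show", "-p", "ActiveState", "-p", "SubState",
         "-p", "MainPID", unit],
        check=True, capture_output=True, text=True,
    )
    return parse_show_output(result.stdout)


if __name__ == "__main__":
    props = query_unit("product_routed.service")
    print(props)
    if not systemd_considers_running(props):
        print("systemd regards the unit as stopped; 'stop' would be a no-op")
```

If the unit is reported as failed or inactive while the daemon is still alive, the process is presumably running outside systemd's view of the service.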
> The systemd file is as below, and we've confirmed that the PIDFile contains
> the correct PID when the stop is attempted. Would anyone have any
> suggestions on how to debug this? Thank you in advance.
>
> # cat /etc/systemd/system/product_routed.service
> [Unit]
> Description=Product routing daemon
> After=syslog.target network.target mysql.service
>
> [Service]
> Type=forking
> ExecStart=/opt/product/current/bin/routed
> PIDFile=/var/run/product/routed.pid
> Restart=on-abnormal
> RestartSec=1
> LimitSTACK=infinity
> LimitNOFILE=65535
> LimitNPROC=65535
>
> [Install]
> WantedBy=multi-user.target
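A PID file holding the right PID only shows that the daemon wrote it; it does not by itself show that systemd still tracks that process. A rough Python sketch for checking the PID file side is below. The helper names `read_pidfile` and `process_alive` are hypothetical, and the path is the one from the unit file above. The resulting PID could then be compared by hand with the "Main PID" that "systemctl status" prints.

```python
import os


def read_pidfile(path):
    """Return the PID stored in a PID file as an integer."""
    with open(path) as handle:
        return int(handle.read().strip())


def process_alive(pid):
    """True if a process with this PID currently exists.

    Signal 0 performs the existence and permission checks without
    delivering a signal.
    """
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        # The process exists but belongs to another user.
        return True
    return True


if __name__ == "__main__":
    pid = read_pidfile("/var/run/product/routed.pid")
    print(pid, "alive" if process_alive(pid) else "not running")
```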
Please provide the "systemctl status" output when this happens.
Lennart
--
Lennart Poettering, Berlin
--
David Cunningham, Voisonics Limited
http://voisonics.com/
USA: +1 213 221 1092
New Zealand: +64 (0)28 2558 3782
_______________________________________________ systemd-devel mailing list systemd-devel@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/systemd-devel