Pipelines somewhat rely on the kernel delivering SIGPIPE to the writer when it writes to a pipe whose read end has been closed. So if you have `foo | head -1`, then once head has read enough and exits, foo gets killed via SIGPIPE on its next write. But as most systemd-managed services aren't shell interpreters, systemd marks SIGPIPE as "ignored" when starting the service process, so that if the service is somehow tricked into opening a pipe that a user has mkfifo'd, at least the kernel can't be tricked into killing the service. The ignored disposition is inherited by everything the service spawns, including your shell pipeline. You can opt out of this by setting IgnoreSIGPIPE=no in the unit's [Service] section.
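As a rough sketch of that opt-out, applied to the unit quoted below: only the IgnoreSIGPIPE= line and its comment are new, and the remaining [Service] lines from the original unit are omitted for brevity. This is untested against systemd 228 on SLES12.

```ini
[Service]
Type=oneshot
RemainAfterExit=yes
# Assumption: with this set, SIGPIPE keeps its default disposition in the
# service process and its children, so fold is killed once head exits.
IgnoreSIGPIPE=no
ExecStart=/root/random_str.pl
```

After editing the unit, run `systemctl daemon-reload` before starting it again.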
(Though even if there's no signal, I believe the writer should still get an EPIPE error from every write attempt, but not all tools pay attention to it. Some just completely ignore the write() result, like apparently `fold` does in your case...)
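To see the two behaviours side by side outside of systemd, here is a small sketch in POSIX shell. It uses `yes` as a stand-in writer (an assumption on my part: unlike `fold` in your report, `yes` does check its write() result), and `writer_status` is just a hypothetical helper name.

```shell
#!/bin/sh
# Print the exit status of the writer in `yes | head -n 1`.
# fd 3 is a copy of the function's stdout, so the status can be
# reported from inside the pipeline without going through the pipe.
writer_status() {
    { { yes 2>/dev/null; echo "$?" >&3; } | head -n 1 >/dev/null; } 3>&1
}

# Default disposition: yes is killed by SIGPIPE once head has exited.
# The shell reports that as 128 + signal number (141 on Linux).
default_status=$(writer_status)

# Ignored disposition, which is what a systemd service inherits by default:
# no signal is delivered, write() fails with EPIPE instead, and yes exits
# on its own because it checks for the error.
ignored_status=$(trap '' PIPE; writer_status)

echo "default SIGPIPE: yes exited with status $default_status"
echo "ignored SIGPIPE: yes exited with status $ignored_status"
```

A writer that never checks the write() result has no reason to stop in the second case, which would match the spinning you describe.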
On Mon, Sep 19, 2022, 20:18 Brian Reichert <reichert@xxxxxxxxxxx> wrote:
I apologize for the vague subject.
The background: I've inherited some legacy software to manage.
This is on SLES12 SP5, running:
systemd-228-157.40.1.x86_64
One element is a systemd-managed service, written in Perl, that in
turn, is using bash to generate random numbers (don't ask me why
this tactic was adopted).
Here's an isolation of that logic:
pheonix:~ # cat /root/random_str.pl
#!/usr/bin/perl
print "$0 start ".time."\n";
my $randStr = `cat /dev/urandom|tr -dc "a-zA-Z0-9"|fold -w 64|head -1`;
print "$0 end ".time."\n";
You can run this from the command-line, to see how quickly it
nominally operates.
What I can reproduce in my environment, very reliably, is that when
this is invoked as a service:
- the 'head' command exits very quickly (to be expected)
- the shell does not exit (maybe missed a SIGCHLD?)
- 'fold' chews a CPU core
- A kernel trace shows that 'fold' is spinning on SIGPIPEs, as its
STDOUT is no longer connected to another process.
My service unit:
pheonix:~ # cat /etc/systemd/system/random_str.service
[Unit]
Description=gernate random number
After=network.target local-fs.target
[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/root/random_str.pl
ExecStop=/usr/bin/true
#TimeoutSec=infinity
TimeoutSec=900
[Install]
WantedBy=multi-user.target
Easy to repro; this hangs forever, instead of exiting quickly.
pheonix:~ # systemctl daemon-reload
pheonix:~ # systemctl start random_str
Let me know if there are any other details of my environment that
would be helpful here.
--
Brian Reichert <reichert@xxxxxxxxxxx>
BSD admin/developer at large