>>> Mantas Mikulenas <grawity@xxxxxxxxx> wrote on 19.09.2022 at 19:25 in message
<CAPWNY8XBcbT8EwDV9RvGb9u9=52AJ6EcUFYWxrL6-6i8P8480A@xxxxxxxxxxxxxx>:
> Pipelines somewhat rely on the kernel delivering SIGPIPE to the writer on
> its first write after the read end is closed. So if you have
> `foo | head -1`, then as soon as head reads enough and exits, foo gets
> killed via SIGPIPE. But as most systemd-managed services aren't shell
> interpreters, systemd marks SIGPIPE as "ignored" when starting the service
> process, so that if the service is somehow tricked into opening a pipe that
> a user has mkfifo'd, at least the kernel can't be tricked into killing the
> service. You can opt out of this using IgnoreSIGPIPE=.
>
> (Though even if there's no signal, I believe the writer should also get an
> -EPIPE out of every write attempt, but not all tools pay attention to it –
> some just completely ignore the write() result, like apparently `fold` does
> in your case...)

Out of curiosity I tried an strace; maybe try that for diagnosing, too:

:~> strace -e trace=process -f perl /tmp/junk.pl
execve("/usr/bin/perl", ["perl", "/tmp/junk.pl"], [/* 72 vars */]) = 0
arch_prctl(ARCH_SET_FS, 0x7fbf22d8d700) = 0
/tmp/junk.pl start 1663658370
clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fbf22d8d9d0) = 13875
Process 13875 attached
[pid 13875] execve("/bin/sh", ["sh", "-c", "cat /dev/urandom|tr -dc \"a-zA-Z0"...], [/* 72 vars */]) = 0
[pid 13875] arch_prctl(ARCH_SET_FS, 0x7f0f782c4700) = 0
[pid 13875] clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f0f782c49d0) = 13876
Process 13876 attached
[pid 13875] clone(Process 13877 attached
child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f0f782c49d0) = 13877
[pid 13875] clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f0f782c49d0) = 13878
Process 13878 attached
[pid 13875] clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f0f782c49d0) = 13879
[pid 13875] wait4(-1, <unfinished ...>
[pid 13876] execve("/usr/bin/cat", ["cat", "/dev/urandom"], [/* 72 vars */]) = 0
[pid 13877] execve("/usr/bin/tr", ["tr", "-dc", "a-zA-Z0-9"], [/* 72 vars */]) = 0
[pid 13878] execve("/usr/bin/fold", ["fold", "-w", "64"], [/* 72 vars */]) = 0
[pid 13876] arch_prctl(ARCH_SET_FS, 0x7fcbd4cff700) = 0
[pid 13877] arch_prctl(ARCH_SET_FS, 0x7f48217fa700) = 0
[pid 13878] arch_prctl(ARCH_SET_FS, 0x7f3d58365700) = 0
Process 13879 attached
[pid 13879] execve("/usr/bin/head", ["head", "-1"], [/* 72 vars */]) = 0
[pid 13879] arch_prctl(ARCH_SET_FS, 0x7f0e2bc01700) = 0
[pid 13878] --- SIGPIPE {si_signo=SIGPIPE, si_code=SI_USER, si_pid=13878, si_uid=1025} ---
[pid 13878] +++ killed by SIGPIPE +++
[pid 13875] <... wait4 resumed> [{WIFSIGNALED(s) && WTERMSIG(s) == SIGPIPE}], 0, NULL) = 13878
[pid 13875] wait4(-1, <unfinished ...>
[pid 13879] exit_group(0) = ?
[pid 13877] --- SIGPIPE {si_signo=SIGPIPE, si_code=SI_USER, si_pid=13877, si_uid=1025} ---
[pid 13879] +++ exited with 0 +++
[pid 13877] +++ killed by SIGPIPE +++
[pid 13875] <... wait4 resumed> [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 13879
[pid 13875] wait4(-1, [{WIFSIGNALED(s) && WTERMSIG(s) == SIGPIPE}], 0, NULL) = 13877
[pid 13875] wait4(-1, <unfinished ...>
[pid 13876] --- SIGPIPE {si_signo=SIGPIPE, si_code=SI_USER, si_pid=13876, si_uid=1025} ---
[pid 13876] +++ killed by SIGPIPE +++
[pid 13875] <... wait4 resumed> [{WIFSIGNALED(s) && WTERMSIG(s) == SIGPIPE}], 0, NULL) = 13876
[pid 13875] --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_KILLED, si_pid=13878, si_uid=1025, si_status=SIGPIPE, si_utime=0, si_stime=0} ---
[pid 13875] wait4(-1, 0x7fff15c17600, WNOHANG, NULL) = -1 ECHILD (No child processes)
[pid 13875] exit_group(0) = ?
[pid 13875] +++ exited with 0 +++
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=13875, si_uid=1025, si_status=0, si_utime=0, si_stime=0} ---
wait4(13875, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 13875
/tmp/junk.pl end 1663658370
exit_group(0) = ?
+++ exited with 0 +++

Regards,
Ulrich

>
> On Mon, Sep 19, 2022, 20:18 Brian Reichert <reichert@xxxxxxxxxxx> wrote:
>
>> I apologize for the vague subject.
>>
>> The background: I've inherited some legacy software to manage.
>>
>> This is on SLES12 SP5, running:
>>
>> systemd-228-157.40.1.x86_64
>>
>> One element is a systemd-managed service, written in Perl, that, in
>> turn, is using bash to generate random numbers (don't ask me why
>> this tactic was adopted).
>>
>> Here's an isolation of that logic:
>>
>> pheonix:~ # cat /root/random_str.pl
>> #!/usr/bin/perl
>> print "$0 start ".time."\n";
>> my $randStr = `cat /dev/urandom|tr -dc "a-zA-Z0-9"|fold -w 64|head -1`;
>> print "$0 end ".time."\n";
>>
>> You can run this from the command-line, to see how quickly it
>> nominally operates.
>>
>> What I can reproduce in my environment, very reliably, is that when
>> this is invoked as a service:
>>
>> - the 'head' command exits very quickly (to be expected)
>> - the shell does not exit (maybe missed a SIGCHLD?)
>> - 'fold' chews a CPU core
>> - A kernel trace shows that 'fold' is spinning on SIGPIPEs, as its
>>   STDOUT is no longer connected to another process.
>>
>> My service unit:
>>
>> pheonix:~ # cat /etc/systemd/system/random_str.service
>> [Unit]
>> Description=gernate random number
>> After=network.target local-fs.target
>>
>> [Service]
>> Type=oneshot
>> RemainAfterExit=yes
>> ExecStart=/root/random_str.pl
>> ExecStop=/usr/bin/true
>> #TimeoutSec=infinity
>> TimeoutSec=900
>>
>> [Install]
>> WantedBy=multi-user.target
>>
>> Easy to repro; this hangs forever, instead of exiting quickly.
>>
>> pheonix:~ # systemctl daemon-reload
>> pheonix:~ # systemctl start random_str
>>
>> Let me know if there are any other details of my environment that
>> would be helpful here.
>>
>> --
>> Brian Reichert <reichert@xxxxxxxxxxx>
>> BSD admin/developer at large
>>
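
For reference, the opt-out Mantas mentions could look roughly like the
fragment below when added to the [Service] section of the unit from the
original report. This is only a sketch and has not been tested on the
SLES12 SP5 / systemd 228 system in question. It assumes that version
honours IgnoreSIGPIPE= as described in systemd.exec(5), where the setting
is a boolean that defaults to yes.

```ini
[Service]
# Assumption, untested here: restore the default SIGPIPE disposition for
# the service process, so that writers in the pipeline (cat, tr, fold)
# are terminated by the signal once head exits, as in the strace above.
IgnoreSIGPIPE=no
```

The rest of the unit would stay as it is, followed by the same
`systemctl daemon-reload` and `systemctl start random_str` as in the
report.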