Antw: [EXT] Re: systemd service causing bash to miss signals?

>>> Mantas Mikulenas <grawity@xxxxxxxxx> wrote on 19.09.2022 at 19:25 in
message
<CAPWNY8XBcbT8EwDV9RvGb9u9=52AJ6EcUFYWxrL6-6i8P8480A@xxxxxxxxxxxxxx>:
> Pipelines somewhat rely on the kernel delivering SIGPIPE to the writer as
> soon as the read end is closed. So if you have `foo | head -1`, then as
> soon as head reads enough and exits, foo gets killed via SIGPIPE. But as
> most systemd-managed services aren't shell interpreters, systemd marks
> SIGPIPE as "ignored" when starting the service process, so that if the
> service is somehow tricked into opening a pipe that a user has mkfifo'd, at
> least the kernel can't be tricked into killing the service. You can opt out
> of this using IgnoreSIGPIPE=.
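
A minimal sketch of that opt-out, assuming a drop-in is preferred over
editing the unit itself. The helper name write_sigpipe_dropin and the file
name sigpipe.conf are made up for illustration; IgnoreSIGPIPE= is the
setting named above, and random_str.service is the unit from the original
report.

```shell
# Hypothetical helper: write a drop-in that turns off systemd's default
# SIGPIPE ignoring for random_str.service.
# $1 is the unit directory; it defaults to the usual admin location.
write_sigpipe_dropin() {
  unit_dir=${1:-/etc/systemd/system}
  dropin_dir="$unit_dir/random_str.service.d"
  mkdir -p "$dropin_dir" || return 1
  # sigpipe.conf is an arbitrary name; any *.conf file in the directory is read.
  cat > "$dropin_dir/sigpipe.conf" <<'EOF'
[Service]
# Keep SIGPIPE at its default disposition, so that pipeline writers are
# killed once the reader has exited.
IgnoreSIGPIPE=no
EOF
}

# Intended use, as root:
#   write_sigpipe_dropin
#   systemctl daemon-reload
#   systemctl start random_str
```

The same line could instead go directly into the [Service] section of the
unit; the drop-in just keeps the change separate from the inherited file.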
> 
> (Though even if there's no signal, I believe the writer should also get an
> -EPIPE out of every write attempt, but not all tools pay attention to it –
> some just completely ignore the write() result, like apparently `fold` does
> in your case...)
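
The EPIPE behaviour can be tried without systemd. The following is only a
sketch: it ignores SIGPIPE in a subshell with trap, which is my stand-in for
what systemd does to the service process, and uses `yes` as the writer
because the common implementations I know of check their write result. The
function name is made up.

```shell
# Sketch: run a pipeline with SIGPIPE ignored and report how the writer ended.
# writer_status_sigpipe_ignored is a hypothetical name for illustration.
writer_status_sigpipe_ignored() {
  tmp=$(mktemp) || return 1
  (
    # An ignored signal stays ignored in child processes, across exec.
    trap '' PIPE
    # The brace group saves the exit status of `yes` in a file rather than
    # writing it into the pipe, whose reader is gone by then.
    { yes 2>/dev/null; echo "$?" > "$tmp"; } | head -n 1 > /dev/null
  )
  cat "$tmp"
  rm -f "$tmp"
}

echo "writer exit status with SIGPIPE ignored: $(writer_status_sigpipe_ignored)"
```

A writer that notices the failed write should end with an ordinary non-zero
exit status here, instead of the 128+13 = 141 that a shell reports for a
process killed by SIGPIPE. A writer that never looks at the write() result
would keep running, which would match the spinning `fold`.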

Out of curiosity I tried an strace; maybe try the same for diagnosing, too:
:~> strace -e trace=process -f perl /tmp/junk.pl
execve("/usr/bin/perl", ["perl", "/tmp/junk.pl"], [/* 72 vars */]) = 0
arch_prctl(ARCH_SET_FS, 0x7fbf22d8d700) = 0
/tmp/junk.pl start 1663658370
clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD,
child_tidptr=0x7fbf22d8d9d0) = 13875
Process 13875 attached
[pid 13875] execve("/bin/sh", ["sh", "-c", "cat /dev/urandom|tr -dc
\"a-zA-Z0"...], [/* 72 vars */]) = 0
[pid 13875] arch_prctl(ARCH_SET_FS, 0x7f0f782c4700) = 0
[pid 13875] clone(child_stack=0,
flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD,
child_tidptr=0x7f0f782c49d0) = 13876
Process 13876 attached
[pid 13875] clone(Process 13877 attached
child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD,
child_tidptr=0x7f0f782c49d0) = 13877
[pid 13875] clone(child_stack=0,
flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD,
child_tidptr=0x7f0f782c49d0) = 13878
Process 13878 attached
[pid 13875] clone(child_stack=0,
flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD,
child_tidptr=0x7f0f782c49d0) = 13879
[pid 13875] wait4(-1,  <unfinished ...>
[pid 13876] execve("/usr/bin/cat", ["cat", "/dev/urandom"], [/* 72 vars */]) =
0
[pid 13877] execve("/usr/bin/tr", ["tr", "-dc", "a-zA-Z0-9"], [/* 72 vars */])
= 0
[pid 13878] execve("/usr/bin/fold", ["fold", "-w", "64"], [/* 72 vars */]) =
0
[pid 13876] arch_prctl(ARCH_SET_FS, 0x7fcbd4cff700) = 0
[pid 13877] arch_prctl(ARCH_SET_FS, 0x7f48217fa700) = 0
[pid 13878] arch_prctl(ARCH_SET_FS, 0x7f3d58365700) = 0
Process 13879 attached
[pid 13879] execve("/usr/bin/head", ["head", "-1"], [/* 72 vars */]) = 0
[pid 13879] arch_prctl(ARCH_SET_FS, 0x7f0e2bc01700) = 0
[pid 13878] --- SIGPIPE {si_signo=SIGPIPE, si_code=SI_USER, si_pid=13878,
si_uid=1025} ---
[pid 13878] +++ killed by SIGPIPE +++
[pid 13875] <... wait4 resumed> [{WIFSIGNALED(s) && WTERMSIG(s) == SIGPIPE}],
0, NULL) = 13878
[pid 13875] wait4(-1,  <unfinished ...>
[pid 13879] exit_group(0)               = ?
[pid 13877] --- SIGPIPE {si_signo=SIGPIPE, si_code=SI_USER, si_pid=13877,
si_uid=1025} ---
[pid 13879] +++ exited with 0 +++
[pid 13877] +++ killed by SIGPIPE +++
[pid 13875] <... wait4 resumed> [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0,
NULL) = 13879
[pid 13875] wait4(-1, [{WIFSIGNALED(s) && WTERMSIG(s) == SIGPIPE}], 0, NULL) =
13877
[pid 13875] wait4(-1,  <unfinished ...>
[pid 13876] --- SIGPIPE {si_signo=SIGPIPE, si_code=SI_USER, si_pid=13876,
si_uid=1025} ---
[pid 13876] +++ killed by SIGPIPE +++
[pid 13875] <... wait4 resumed> [{WIFSIGNALED(s) && WTERMSIG(s) == SIGPIPE}],
0, NULL) = 13876
[pid 13875] --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_KILLED, si_pid=13878,
si_uid=1025, si_status=SIGPIPE, si_utime=0, si_stime=0} ---
[pid 13875] wait4(-1, 0x7fff15c17600, WNOHANG, NULL) = -1 ECHILD (No child
processes)
[pid 13875] exit_group(0)               = ?
[pid 13875] +++ exited with 0 +++
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=13875, si_uid=1025,
si_status=0, si_utime=0, si_stime=0} ---
wait4(13875, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 13875
/tmp/junk.pl end 1663658370
exit_group(0)                           = ?
+++ exited with 0 +++
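
The run above was started from an interactive shell, where SIGPIPE still
has its default action. To compare that with the service environment, one
could look at the SigIgn mask in /proc/<pid>/status. A rough sketch for
Linux only; the function name sigpipe_ignored is made up, and the bit
position assumes SIGPIPE is signal 13, as it is on x86-64 Linux.

```shell
# Sketch: report whether SIGPIPE is ignored for a PID (Linux /proc only).
# sigpipe_ignored is a hypothetical helper, not an existing tool.
sigpipe_ignored() {
  status_file="/proc/${1:-self}/status"
  mask=
  # The redirection is opened by the shell itself, so "self" means this shell
  # and not some helper program with its own signal setup.
  while read -r key value; do
    case "$key" in
      SigIgn:) mask=$value ;;
    esac
  done < "$status_file"
  [ -n "$mask" ] || return 2
  # SigIgn is a hex bitmask; signal 13 (SIGPIPE) corresponds to bit 12, 0x1000.
  if [ $(( 0x$mask & 0x1000 )) -ne 0 ]; then
    echo "ignored"
  else
    echo "not ignored"
  fi
}

sigpipe_ignored    # without an argument: the disposition of the current shell
```

Called with the PID of the spinning `fold` while the service hangs, this
should show whether that process inherited an ignored SIGPIPE.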

Regards,
Ulrich

> 
> On Mon, Sep 19, 2022, 20:18 Brian Reichert <reichert@xxxxxxxxxxx> wrote:
> 
>> I apologize for the vague subject.
>>
>> The background: I've inherited some legacy software to manage.
>>
>> This is on SLES12 SP5, running:
>>
>>         systemd-228-157.40.1.x86_64
>>
>> One element is a systemd-managed service, written in Perl, that in
>> turn, is using bash to generate random numbers (don't ask me why
>> this tactic was adopted).
>>
>> Here's an isolation of that logic:
>>
>>   pheonix:~ # cat /root/random_str.pl
>>   #!/usr/bin/perl
>>   print "$0 start ".time."\n";
>>   my $randStr = `cat /dev/urandom|tr -dc "a-zA-Z0-9"|fold -w 64|head -1`;
>>   print "$0 end ".time."\n";
>>
>> You can run this from the command-line, to see how quickly it
>> nominally operates.
>>
>> What I can reproduce in my environment, very reliably, is that when
>> this is invoked as a service:
>>
>> - the 'head' command exits very quickly (to be expected)
>> - the shell does not exit (maybe missed a SIGCHLD?)
>> - 'fold' chews a CPU core
>> - A kernel trace shows that 'fold' is spinning on SIGPIPEs, as its
>>   STDOUT is no longer connected to another process.
>>
>> My service unit:
>>
>>   pheonix:~ # cat /etc/systemd/system/random_str.service
>>   [Unit]
>>   Description=gernate random number
>>   After=network.target local-fs.target
>>
>>   [Service]
>>   Type=oneshot
>>   RemainAfterExit=yes
>>   ExecStart=/root/random_str.pl
>>   ExecStop=/usr/bin/true
>>   #TimeoutSec=infinity
>>   TimeoutSec=900
>>
>>   [Install]
>>   WantedBy=multi-user.target
>>
>> Easy to repro; this hangs forever, instead of exiting quickly.
>>
>>   pheonix:~ # systemctl daemon-reload
>>   pheonix:~ # systemctl start random_str
>>
>> Let me know if there are any other details of my environment that
>> would be helpful here.
>>
>> --
>> Brian Reichert                          <reichert@xxxxxxxxxxx>
>> BSD admin/developer at large
>>





