Re: Output from `tee' is not showing up in system journal consistently on some systems

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Oct 27, 2021 at 1:52 AM Colin Guthrie <gmane@xxxxxxxxxxxxxx> wrote:
>
> Mitchel Humpherys wrote on 26/10/2021 21:16:
> > On my Manjaro and Ubuntu systems I'm not seeing all of the "pizza" lines
> > (logs pasted here [3]). On Debian I see the "pizza" and "pie" lines
> > interleaved as expected. It also works as expected when I run the script
> > without systemd on Manjaro.
> >
> > Any ideas what's going on here? Why am I not seeing all of the output
> > from `tee' on some systems?
> >
> > [2] https://github.com/mgalgs/systemd-tee-output-issue
>
> Looking at the output there, the only thing I can see that's slightly
> odd is that the "pie" output comes from the service's main pid (as
> expected) but as tee is forked off the output there that *is* captured
> is from a different pid each time.
>
> So I wonder if this is some kind of cgroup race thing? i.e. the forked
> off process is not (always) detected as being part of the unit in
> question and thus isn't logged against the unit? If this is the case is
> the a bash trick you can use to buffer the output back to the parent
> process - e.g. just surrounding the command in parenthesis or something?
> (I'm not a bash expert so no idea if this would work as a workaround!).
> Appreciate any code change to work around this isn't ideal anyway, but
> just trying to help debug the underlying problem.
>
> This is just a random guess and might not at all be how the logging or
> cgroup handing works!

Interesting thought! I think it's totally expected for `tee' to have a
different PID. Your comment got me curious about the cgroup situation
in the working vs. non-working scenarios, but it does appear that the
`tee' processes are appropriately cgroup'd. I'm pasting some mid-tee
`status' output below (captured with `while :; do systemctl --user
status tee_test.service; done > /tmp/statuses.txt' and some grepping).

First, for reference, here's a mid-tee `status' from my Debian 11
environment where logging is working as expected:

● tee_test.service - Test teeing under systemd
     Loaded: loaded
(/home/mgalgs/.config/systemd/user/tee_test.service; disabled; vendor
preset: enabled)
     Active: active (running) since Tue 2021-10-26 11:44:07 PDT; 21h ago
   Main PID: 487 (tee_test.sh)
      Tasks: 2 (limit: 4679)
     Memory: 3.2M
        CPU: 5min 49.049s
     CGroup: /user.slice/user-1000.slice/user@1000.service/app.slice/tee_test.service
             ├─   487 /bin/bash /usr/local/bin/tee_test.sh
             └─229327 tee -a /tmp/testtee.log

Oct 27 08:57:46 pi-hole-test tee_test.sh[229282]: i shall eat 75937
slices of pizza
Oct 27 08:57:46 pi-hole-test tee_test.sh[487]: i shall eat 75937 slices of pie
Oct 27 08:57:47 pi-hole-test tee_test.sh[229293]: i shall eat 75938
slices of pizza
Oct 27 08:57:47 pi-hole-test tee_test.sh[487]: i shall eat 75938 slices of pie
Oct 27 08:57:48 pi-hole-test tee_test.sh[229304]: i shall eat 75939
slices of pizza
Oct 27 08:57:48 pi-hole-test tee_test.sh[487]: i shall eat 75939 slices of pie
Oct 27 08:57:49 pi-hole-test tee_test.sh[229315]: i shall eat 75940
slices of pizza
Oct 27 08:57:49 pi-hole-test tee_test.sh[487]: i shall eat 75940 slices of pie
Oct 27 08:57:50 pi-hole-test tee_test.sh[229327]: i shall eat 75941
slices of pizza
Oct 27 08:57:50 pi-hole-test tee_test.sh[487]: i shall eat 75941 slices of pie

And here are two `status'es from my Manjaro environment, where logging
isn't working consistently. In the first case the logs were missing in
the system journal. The second case was one of the odd cases where the
logs actually came through.

Case 1 (logs missing):
Status:
● tee_test.service - Test teeing under systemd
     Loaded: loaded
(/home/mgalgs/.config/systemd/user/tee_test.service; disabled; vendor
preset: enabled)
     Active: active (running) since Wed 2021-10-27 08:55:04 PDT; 15min ago
   Main PID: 3791090 (tee_test.sh)
      Tasks: 2 (limit: 38124)
     Memory: 1.0M
        CPU: 3.874s
     CGroup: /user.slice/user-1000.slice/user@1000.service/app.slice/tee_test.service
             ├─3791090 /bin/bash
/home/mgalgs/src/systemd-tee-output-issue/tee_test.sh
             └─3841065 tee -a /tmp/testtee.log

Oct 27 09:10:11 grinchel tee_test.sh[3791090]: i shall eat 906 slices of pie
Oct 27 09:10:12 grinchel tee_test.sh[3791090]: i shall eat 907 slices of pie
Oct 27 09:10:13 grinchel tee_test.sh[3791090]: i shall eat 908 slices of pie
Oct 27 09:10:14 grinchel tee_test.sh[3791090]: i shall eat 909 slices of pie
Oct 27 09:10:15 grinchel tee_test.sh[3791090]: i shall eat 910 slices of pie
Oct 27 09:10:16 grinchel tee_test.sh[3791090]: i shall eat 911 slices of pie
Oct 27 09:10:17 grinchel tee_test.sh[3791090]: i shall eat 912 slices of pie
Oct 27 09:10:18 grinchel tee_test.sh[3791090]: i shall eat 913 slices of pie
Oct 27 09:10:19 grinchel tee_test.sh[3791090]: i shall eat 914 slices of pie
Oct 27 09:10:20 grinchel tee_test.sh[3791090]: i shall eat 915 slices of pie


Case 2 (logs present):
Status:
● tee_test.service - Test teeing under systemd
     Loaded: loaded
(/home/mgalgs/.config/systemd/user/tee_test.service; disabled; vendor
preset: enabled)
     Active: active (running) since Wed 2021-10-27 08:55:04 PDT; 13min ago
   Main PID: 3791090 (tee_test.sh)
      Tasks: 2 (limit: 38124)
     Memory: 768.0K
        CPU: 3.541s
     CGroup: /user.slice/user-1000.slice/user@1000.service/app.slice/tee_test.service
             ├─3791090 /bin/bash
/home/mgalgs/src/systemd-tee-output-issue/tee_test.sh
             ├─3836703 /bin/bash
/home/mgalgs/src/systemd-tee-output-issue/tee_test.sh
             └─3836704 tee -a /tmp/testtee.log

Oct 27 09:08:58 grinchel tee_test.sh[3791090]: i shall eat 833 slices of pie
Oct 27 09:08:59 grinchel tee_test.sh[3791090]: i shall eat 834 slices of pie
Oct 27 09:09:00 grinchel tee_test.sh[3836242]: i shall eat 835 slices of pizza
Oct 27 09:09:00 grinchel tee_test.sh[3791090]: i shall eat 835 slices of pie
Oct 27 09:09:01 grinchel tee_test.sh[3836366]: i shall eat 836 slices of pizza
Oct 27 09:09:01 grinchel tee_test.sh[3791090]: i shall eat 836 slices of pie
Oct 27 09:09:02 grinchel tee_test.sh[3791090]: i shall eat 837 slices of pie
Oct 27 09:09:03 grinchel tee_test.sh[3791090]: i shall eat 838 slices of pie
Oct 27 09:09:04 grinchel tee_test.sh[3836704]: i shall eat 839 slices of pizza
Oct 27 09:09:04 grinchel tee_test.sh[3791090]: i shall eat 839 slices of pie

The extra /bin/bash process in the working Manjaro case is kind of
interesting... This must be the pipe which spawns the `tee' (note the
PID adjacency), although I'm not quite sure why the process is still
there after the `tee' is already live (I assume it uses execv* and
thus would have expected it to be gone by the time `tee' comes up).

In any case, I have a hunch that this may have been simple lucky
timing where I caught the pipe (especially since the working Debian 11
case doesn't have the additional process), but I haven't managed to
catch the pipe in any of the non-working cases on Manjaro thus far...
:thinking_face:




[Index of Archives]     [LARTC]     [Bugtraq]     [Yosemite Forum]     [Photo]

  Powered by Linux