On Thu, Sep 16, 2021 at 5:17 AM Rolf Eike Beer <eb@xxxxxxxxx> wrote:
>
> On Thursday, 16 September 2021, 12:12:48 CEST, Tobias Ulmer wrote:
> > > The redirection seems to be an important part of it. I now did:
> > >
> > > git ... 2>&1 | sha256sum
> >
> > I've tried to reproduce this since yesterday, but couldn't until now:
> >
> > 2>&1 made all the difference, took less than a minute.

So if that redirection is what matters, and what causes problems, I
can almost guarantee that the reason is very simple:

Your git repository (or more likely your upstream) has some problem,
it's getting reported on stderr, and because you mix stdout and stderr
with that '2>&1', you get randomly mixed output.

Then it depends on timing where the mixing happens.

Or rather, it depends on various different factors, like the buffering
done internally by stdio (where stdout generally will be block-buffered
when it goes to a pipe, while stderr is usually unbuffered or
line-buffered, which is why you get odd mixing of the two).

But timing can be an effect particularly with "git ls-remote" and
friends, because you may get errors from the transport asynchronously.

So the different buffering ends up causing the effect of mixing things
in the middle of lines, while the timing differences due to the
asynchronous nature of the remote access pipeline will likely then
cause that odd mixing to be different.

End result: corrupted lines, and different sha256sum every time.

> > Running the same on Archlinux (5.13.13-arch1-1, 2.33.0) doesn't show the
> > problem.
>
> This may well turn out not to be git, but a kernel issue.

Much more likely that the other box just doesn't have the error situation.

> since you have been hacking around in pipe.c recently, I fear this isn't
> entirely impossible. Have you any idea?

Almost certainly not the kernel.

Kernel - and other - differences could affect timing, of course, but
the whole "2>&1" really is fundamentally bogus.

If you don't have any errors, then the "2>&1" doesn't matter.
And if you *do* have errors, then by definition the "2>&1" will mix in
the errors with the output randomly and piping them together is
senseless.

Either way, it's wrong.

So what I'd suggest Tobias should do is

    git ... 2> err | sha256sum

which will send the errors to the "err" file. Take a look at that file
afterwards and see what is in it.

Basically, "2>&1" is almost never the right thing to do, unless you
explicitly don't care about the output and just want to suppress it.
So "> /dev/null 2>&1" is common and natural.

Of course, people also use it when they just want to eyeball the
errors mixed in, so doing that

    ... 2>&1 | less

thing isn't necessarily *wrong*, but it's somewhat dangerous and
confusing. Because when you do it you do need to be very aware of the
fact that the errors and output will be *mixed*. And the mixing will
not necessarily be at all sensible.

Finally: pipes on a low level guarantee certain atomicity constraints,
so if you do low-level "write()" calls of size PIPE_BUF or less, the
contents will not be interleaved randomly.

HOWEVER.

That's only true at that "write()" level. The moment you use <stdio>
for your IO, you have buffering inside of the standard IO libraries,
and if your code isn't explicitly very careful about it, using
setbuf() and fflush() and friends, you'll get that random mixing.

Anyway. That was a long email just to tell people it's almost
certainly user error, not the kernel.

               Linus
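
For anyone who wants to see the buffering effect described above without
involving git at all, here is a minimal sketch. It is an illustration
only, not what git does internally: it uses Python's own I/O layer, which
buffers in a similar way to C stdio, and the child program, the
run_child() helper and the "out N" / "err N" strings are all made up for
the example. The child alternates writes to stdout and stderr; with both
streams merged onto one pipe (the equivalent of 2>&1) the lines should
not arrive in the order they were written, while with separate pipes
each stream stays intact.

```python
import os
import subprocess
import sys

# Hypothetical child program: alternates writes to stdout and stderr.
CHILD = r'''
import sys
for i in range(3):
    print(f"out {i}")                   # stdout: buffered while it is a pipe
    print(f"err {i}", file=sys.stderr)  # stderr: buffered differently
'''


def run_child(merge_stderr):
    """Run the child with stdout on a pipe; optionally merge stderr (like 2>&1)."""
    # Drop PYTHONUNBUFFERED so the child keeps its default buffering.
    env = {k: v for k, v in os.environ.items() if k != "PYTHONUNBUFFERED"}
    proc = subprocess.run(
        [sys.executable, "-c", CHILD],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT if merge_stderr else subprocess.PIPE,
        env=env,
        text=True,
    )
    return proc.stdout, proc.stderr


# Equivalent of "child 2>&1 | ...": one pipe carries both streams.
merged, _ = run_child(merge_stderr=True)
# Equivalent of "child 2> err | ...": each stream has its own pipe.
clean, errors = run_child(merge_stderr=False)

print("merged:", merged.splitlines())
print("stdout only:", clean.splitlines())
print("stderr only:", errors.splitlines())
```

The output here is tiny, so the reordering happens at whole-line
granularity. With larger output and errors arriving asynchronously, the
block-buffered stream can be flushed mid-line, which is the
corrupted-lines case described above.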