On 29/09/2017 00:47, Junio C Hamano wrote: > [jch: a patch that hopefully is applicable is attached at the end; > I'd appreciate input from Paolo, the contributor of the original > upstream] I wasn't involved in upstream commit d42461c3, but Linux is also always overwriting the revents output with zero. Reviewed-by: Paolo Bonzini <pbonzini@xxxxxxxxxx> Paolo > "Randall S. Becker" <rsbecker@xxxxxxxxxxxxx> writes: > >> After a whole lot of investigating, we (it is a large "we") have discovered >> the reason for the hang we occasionally get in git-upload-pack on HPE >> NonStop servers - reported here well over a year ago. This resulted from a >> subtle check that the operating system does on file descriptors. When it >> sees random values in pfd[i].revents,... > > What do you mean by "random values"? Where do these random values > come from? The original code the patch touches is _not_ setting > anything, so it is leaving it uninitialized and that is seen by the > system? If that is the case perhaps should we clear it before we > call into this codepath? > >> .... There is a non-destructive fix >> that I would like to propose for this that I have already tested. > > I am not sure in what sense this is "non-destructive"; we left the > value as it was when we didn't notice anything happening in > compute_revents(). Now we explicitly destroy the value there was > (just like we do in the case where the corresponding file descriptor > was negative). Maybe overwriting with 0 is the right thing, but I > wouldn't call it "non-destructive", though. Puzzling... > >> --- a/compat/poll/poll.c >> +++ b/compat/poll/poll.c >> @@ -438,6 +438,10 @@ poll (struct pollfd *pfd, nfds_t nfd, int timeout) >> pfd[i].revents = happened; >> rc++; >> } >> + else >> + { >> + pfd[i].revents = 0; >> + } >> } >> >> return rc; > > FYI, the upstream gnulib rewrites this part of the code with > d42461c3 ("poll: fixes for large fds", 2015-02-20) quite a bit, and > it reads like this: > > + { > + pfd[i].revents = (pfd[i].fd < 0 > + ? 0 > + : compute_revents (pfd[i].fd, pfd[i].events, > + &rfds, &wfds, &efds)); > + rc += pfd[i].revents != 0; > + } > > The code before your fix (and before d42461c3) used to assign to > pfd[i].revents only when compute_revents() gave us non-zero value, > but with d42461c3 in the upstream, it pfd[i].revents is assigned > the return value from compute_revents() even if the value returned > is 0. And rc is incremented only when that value is non-zero. > > The result of applying your patch is equivalent, so after all, I > tend to think that your patch is the right fix in the context of the > code we have (i.e. the older code we borrowed from them). > > I wonder if we want to refresh the borrowed copy of poll.c to a > newer snapshot someday, but that is totally a separate topic. At > least, I now know that your fix in this patch will not be broken due > to d42461c3 updating the code in a slightly different way ;-) > > Randall, I forged your Sign-off in the patch below, but please say > it is OK, after reading Documentation/SubmittingPatches. > > Thanks. > > -- >8 -- > From: Randall S. Becker <rsbecker@xxxxxxxxxxxxx> > Subject: poll.c: always set revents, even if to zero. > > Match what happens to pfd[i].revents when compute_revents() returns > 0 to the upstream gnulib's commit d42461c3 ("poll: fixes for large > fds", 2015-02-20). The revents field is set to 0, without > incrementing the value rc to be returned from the function. > > This fixes occasional hangs in git-upload-pack on NPE NonStop. > > Signed-off-by: Randall S. Becker <rsbecker@xxxxxxxxxxxxx> > Signed-off-by: Junio C Hamano <gitster@xxxxxxxxx> > --- > compat/poll/poll.c | 4 ++++ > 1 file changed, 4 insertions(+) > > diff --git a/compat/poll/poll.c b/compat/poll/poll.c > index b10adc780f..ae03b74a6f 100644 > --- a/compat/poll/poll.c > +++ b/compat/poll/poll.c > @@ -438,6 +438,10 @@ poll (struct pollfd *pfd, nfds_t nfd, int timeout) > pfd[i].revents = happened; > rc++; > } > + else > + { > + pfd[i].revents = 0; > + } > } > > return rc; >