Jeff King <peff@xxxxxxxx> writes:
> On Thu, Dec 01, 2022 at 11:27:29AM -0800, Jonathan Tan wrote:
>
> > Thanks everyone for your reviews. Here is a reroll with the requested change
> > (just one small one).
>
> Thanks, this looks OK to me. However Junio noted in "What's cooking"
> that it seems to break CI on windows. The problem is in t5318.93:
>
>   2022-12-01T09:26:44.8887018Z ++ cat test_err
>   2022-12-01T09:26:44.8887414Z error: Could not read 0000000000000000000000000000000000000000
>   2022-12-01T09:26:44.8887825Z error: Could not read 0000000000000000000000000000000000000000
>   2022-12-01T09:26:44.8888240Z error: Could not read 0000000000000000000000000000000000000000
>   2022-12-01T09:26:44.8888639Z error: Could not read 0000000000000000000000000000000000000000
>   2022-12-01T09:26:44.8889052Z error: Could not read 0000000000000000000000000000000000000000
>   2022-12-01T09:26:44.8889512Z error: Could not read 0000000000000000000000000000000000000000
>   2022-12-01T09:26:44.8889991Z fatal: failed to read object 0000000000000000000000000000000000000000: Function not implemented
>   2022-12-01T09:26:44.8890401Z ++ return 1
>   2022-12-01T09:26:44.8890761Z error: last command exited with $?=1
>   2022-12-01T09:26:44.8891263Z not ok 93 - corrupt commit-graph write (broken parent)
>
> Looks like the check in die_if_corrupt() is seeing a different errno
> value than ENOENT. I wonder if we need to take more care to preserve it
> across calls. It does look like we hit the same sequence of functions
> that read_object_file_extended() did, but perhaps this was buggy all
> along, and you're now exposing it through a new code path.
>
> In particular I wonder if obj_read_unlock() might be the culprit here,
> and something like this might help:
>
> diff --git a/object-file.c b/object-file.c
> index 8adef99a7c..db2d35519e 100644
> --- a/object-file.c
> +++ b/object-file.c
> @@ -1641,9 +1641,12 @@ int oid_object_info_extended(struct repository *r, const struct object_id *oid,
>                               struct object_info *oi, unsigned flags)
>  {
>          int ret;
> +        int save_errno;
>          obj_read_lock();
>          ret = do_oid_object_info_extended(r, oid, oi, flags);
> +        save_errno = errno;
>          obj_read_unlock();
> +        errno = save_errno;
>          return ret;
>  }

Copying die_if_corrupt() until "failed to read object":

> void die_if_corrupt(struct repository *r,
>                     const struct object_id *oid,
>                     const struct object_id *real_oid)
> {
>         const struct packed_git *p;
>         const char *path;
>         struct stat st;
>
>         obj_read_lock();
>         if (errno && errno != ENOENT)
>                 die_errno(_("failed to read object %s"), oid_to_hex(oid));

I wonder if we could just remove this check. Even as it is, I don't
think that there is any guarantee that obj_read_lock() would not clobber
errno. Removing it makes all tests pass locally, but I haven't tried it
on CI.

(One argument that could be made is that we shouldn't do any
die_if_corrupt() refactoring or other refactoring of the sort, because
previously its contents were part of a function and it could thus rely
on the errno left by whatever happened previously. But I think that even
without my patches, we couldn't rely on it in the first place - looking
at obj_read_lock(), it looks like it could init a mutex, and depending
on the implementation of that, it could clobber errno.)
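
To make the errno concern concrete, here is a minimal standalone sketch. It
is not git code: lookup_missing() and noisy_unlock() are hypothetical
stand-ins for the object lookup and for an unlock that succeeds but leaves
errno modified, and ENOSYS is picked only because its message matches the
"Function not implemented" text in the CI log. Whether the real unlock on
Windows sets that value is an assumption, not something confirmed here.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the object lookup: fails and sets ENOENT. */
static int lookup_missing(void)
{
	errno = ENOENT;
	return -1;
}

/*
 * Hypothetical stand-in for an unlock call that succeeds but leaves
 * errno changed. ENOSYS is an assumption chosen for illustration.
 */
static void noisy_unlock(void)
{
	errno = ENOSYS;
}

/* No protection: the caller ends up seeing the unlock's errno. */
static int lookup_unprotected(void)
{
	int ret = lookup_missing();
	noisy_unlock();
	return ret;
}

/* Save errno before unlocking and restore it afterwards. */
static int lookup_preserving_errno(void)
{
	int ret, saved_errno;

	ret = lookup_missing();
	saved_errno = errno;
	noisy_unlock();
	errno = saved_errno;
	return ret;
}

/* Same shape as the quoted check: nonzero means "would die". */
static int stale_errno_check(void)
{
	return errno && errno != ENOENT;
}

/* Exercises both variants; aborts if either behaves unexpectedly. */
static void demo(void)
{
	errno = 0;
	assert(lookup_unprotected() == -1);
	assert(stale_errno_check());	/* clobbered errno trips the check */

	errno = 0;
	assert(lookup_preserving_errno() == -1);
	assert(!stale_errno_check());	/* ENOENT survives the unlock */
}
```

Calling demo() runs both paths. The save/restore variant only helps while
nothing else touches errno between the lookup and the check, which is why
dropping the check entirely may be the more robust option.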