Re: [PATCH v2 0/4] Don't lazy-fetch commits when parsing them

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Jeff King <peff@xxxxxxxx> writes:
> Yeah, I don't see any difference in the new caller versus what the
> original was doing. The errno we care about comes from inside
> oid_object_info_extended(). So in any case, we'll see at least
> obj_read_unlock() followed by obj_read_lock() between the syscalls of
> interest and this check. And I don't even really see any indication that
> oid_object_info_extended() tries to set or preserve errno itself. The
> likely sequence is:
> 
>   - find_pack_entry() fails to find it; errno isn't set at all
>   - loose_object_info() tries to open it and probably gets ENOENT
>   - we check find_pack_entry() again after reprepare_packed_git()
>   - that fails so we return -1, barring submodule or partial clone
>     tricks
> 
> So it really seems like we're quite likely to get an errno from opening
> or mapping packs. Which implies the original suffers from the same
> issue, but we simply never triggered it meaningfully in a test.

Thanks for checking. I'm still not sure how the current code passes CI, but my
patches don't. 

> I'm not entirely sure on just removing the check. It comes from
> 3ba7a06552 (A loose object is not corrupt if it cannot be read due to
> EMFILE, 2010-10-28), so we'd lose what that commit is trying to do.
> Though I think even back then, I think it would have suffered from the
> same problems (minus the lock/unlock; I'm still unclear which syscall is
> the actual culprit here).

Ah, thanks for the pointer to that commit. Without that, my patch would report
corruption even if the real issue was EMFILE, as the commit message of that
commit describes.

> If we assume that errno from reading the object isn't reliable, I think
> you'd have to actually re-check things. Something like:
> 
>   if (find_pack_entry(...) || !stat_loose_object(...))
>     /* ok, it's not missing */
> 
> but of course we don't have the actual errno that _did_ cause us to
> fail, which makes the error message we'd print a lot less useful. Maybe
> this check should be ditched and we should complain much closer to the
> source of the problem:
> 
> diff --git a/object-file.c b/object-file.c
> index 26290554bb..743ba8210e 100644
> --- a/object-file.c
> +++ b/object-file.c
> @@ -1460,8 +1460,12 @@ static int loose_object_info(struct repository *r,
>  	}
>  
>  	map = map_loose_object(r, oid, &mapsize);
> -	if (!map)
> +	if (!map) {
> +		if (errno != ENOENT)
> +			error_errno("unable to open loose object %s",
> +				    oid_to_hex(oid));
>  		return -1;
> +	}
>  
>  	if (!oi->sizep)
>  		oi->sizep = &size_scratch;
> 
> That might make things more verbose for other code paths, but that kind
> of seems like a good thing. If you have an object file that we can't
> open, we probably _should_ be complaining loudly about it.
> 
> We may need to be a little more careful about preserving errno in
> map_loose_object_1(), though (gee, another place where the existing
> check could run into trouble).

Besides needing to be careful in map_loose_object_1(), I'm not sure if this
fully solves the problem. This is non-fatal, so the EMFILE commit's work would
still remain undone. If this were made fatal, I think this would change the
behavior of too much code, especially those that can tolerate loose objects
being missing.
 
What do you think of not putting any die_if_corrupt() calls in the commit
parsing code at all? The error message printed would then be different (just a
generic message about being unable to parse a commit, versus the specific one
here) but it does pass CI [1]. Also, I don't think that we should be doing errno
diagnostics separate from what causes the errno anyway.

[1] https://github.com/jonathantanmy/git/actions/runs/3624495729



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux