Re: [PATCH 5/5] sha1_file: don't re-scan pack directory for null sha1

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Nov 20, 2017 at 12:47:53PM -0800, Stefan Beller wrote:

> > This is the minimal fix that addresses the performance issues.
> > I'd actually have no problem at all declaring that looking up a null
> > sha1 is insane, and having the object-lookup routines simply return "no
> > such object" without even doing the loose/pack lookup first.
> >
> > I'm also fine with going down the BUG() route and fixing the
> > higher-level callers, but it's a big enough task (with little enough
> > real-world impact) that I think it would be worth applying this in the
> > meantime.
> 
> It would have a lot of impact in the future, when new developers
> are hindered mis-using the API. The (unwritten) series with introducing
> BUG() would help a lot in 'holding it right' and I would expect fewer
> performance
> regressions over time.

Yeah, I agree the BUG() route is the nicest. I started trying to hunt
down every code path causing a test failure, though, and soon got
overwhelmed.

I actually think it's not the worst thing in the world to say "it's OK
to look up the null sha1; the result is well-defined, and it does not
exist". It's perhaps philosophically a little ugly, but I think it's
quite practical.

> The patch is impressively small for such a performance gain.
> Personally I think (1) (which essentially means "making null sha1
> work like a regular sha1") is quite an endeavor at this point in time
> for this code base.

Yeah, it would be a crazy amount of effort to try to support that, for
very little gain. If we did want to just declare that the null sha1 does
not exist (even if you happen to have objects/00/00... in your
repository), the patch is similarly easy:

diff --git a/sha1_file.c b/sha1_file.c
index 8a7c6b7eba..754fe7f03e 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1152,6 +1152,9 @@ int sha1_object_info_extended(const unsigned char *sha1, struct object_info *oi,
 				    lookup_replace_object(sha1) :
 				    sha1;
 
+	if (is_null_sha1(sha1))
+		return -1;
+
 	if (!oi)
 		oi = &blank_oi;
 

> As a tangent, a similar but solved problem in the diff code is how
> NUL in user data is treated in xdiff for example, as there we kept
> being careful since the beginning (though I think we don't have tests
> for it, so it might be broken)

There I think it's nice if we can be tolerant of NULs, because they're a
thing that might actually happen in user input. A real null sha1 has a
2^-160 chance of happening, and the design of sha1 is such that it's
hard for somebody to do it intentionally.

You might be able to get up to some mischief with replace objects,
though.

-Peff



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux