Re: [PATCH] parse_object: try internal cache before reading object db

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Jan 05, 2012 at 01:35:50PM -0800, Junio C Hamano wrote:

> Jeff King <peff@xxxxxxxx> writes:
> 
> > This might seem circular, since step 2a uses the type
> > information determined in step 1 to call the appropriate
> > lookup function. However, we can notice that all of the
> > lookup_* functions are backed by lookup_object. In other
> > words, all of the objects are kept in a master hash table,
> > and we don't actually need the type to do the "do we have
> > it" part of the lookup,...
> 
> The only case that might matter is where you read one object, you have
> written another object of a different type but that happens to hash to the
> same SHA-1 value. The other existing optimizations do not take that into
> account, so I do not think there is any new issue here.

Yeah, I tried to think of issues like that. Even if you protected
against that, you'd still have the issue of reading one object, then
writing another of the _same_ type but with different content. We
wouldn't notice with the current code path (we'd just recreationally
read the data from disk and then throw it away).

The worst potential problem I could come up with is if you somehow had
an object whose "parsed" flag was set, but somehow didn't have its other
fields set (like type). But I think you'd have to be abusing the lookup
functions pretty hard to get into such a state (how would you be parsing
if you didn't know the type?). The parsed flag only gets set by the
type-specific lookup functions.

So I think it is safe short of somebody doing some horrible manual
munging of a "struct object".

> > For example, GitHub's alternates repository for git.git has
> > ~120,000 refs, of which only ~3200 are unique. The time for
> > upload-pack to print its list of advertised refs dropped
> > from 3.4s to 0.76s.
> 
> Nice. I am more impressed by 120k/3.4 than 3.2k/0.76, though ;-)

You can thank optimized zlib for that. We spent 60% of our time there.
:)

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]