Re: [PATCH] check_refname_component: Optimize

On Fri, May 30, 2014 at 7:07 AM, Jeff King <peff@xxxxxxxx> wrote:
> But then we're just trusting that the "trust me" flag on disk is
> correct. Why not just trust that packed-refs is correct in the first
> place?
>
> IOW, consider this progression of changes:
>
>   1. Check refname format when we read packed-refs (the current
>      behavior).
>
>   2. Keep a separate file "packed-refs.stat" with stat information. If
>      the packed-refs file matches that stat information, do not bother
>      checking refname formats.
>
>   3. Put a flag in "packed-refs" that says "trust me, I'm valid". Check
>      the refnames when it is generated.
>
>   4. Realize that we already check the refnames when we write it out.
>      Don't bother writing "trust me, I'm valid"; readers can assume that
>      it is.
>
> What is the scenario that option (2) protects against that options (3)
> and (4) do not?
>
> I could guess something like "the writer has a different idea of what a
> valid refname is than we do". But that applies as well to (2), but just
> as "the reader who wrote packed-refs.stat has a different idea than we
> do".

The reader and the writer have to agree on the same definition of
"valid", or it wouldn't work. I don't expect the packed-refs.stat idea
to spread to implementations other than C git, so we're still good
there. If we could write a "trust me" flag in packed-refs and other
implementations stripped it whenever they update packed-refs, then we'd
be good too.
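
To make option (2) concrete, the check could look roughly like the
sketch below. Everything in it is an assumption for illustration: the
struct, the function names and the choice of fields (size, mtime,
inode) are hypothetical, not existing git code, and how the record
would be stored in packed-refs.stat is left out.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <time.h>

/* Hypothetical contents of a "packed-refs.stat" record. */
struct refs_stat {
	off_t size;
	time_t mtime;
	ino_t ino;
};

/* Record the stat fields of interest; 0 on success, -1 on error. */
static int refs_stat_capture(const char *path, struct refs_stat *out)
{
	struct stat st;

	if (stat(path, &st) < 0)
		return -1;
	out->size = st.st_size;
	out->mtime = st.st_mtime;
	out->ino = st.st_ino;
	return 0;
}

/*
 * Return 1 if the file at path still matches the saved record, in
 * which case a reader could skip the refname format checks, and 0
 * otherwise (including when the file cannot be stat'ed).
 */
static int refs_stat_matches(const char *path, const struct refs_stat *saved)
{
	struct refs_stat cur;

	if (refs_stat_capture(path, &cur) < 0)
		return 0;
	return cur.size == saved->size &&
	       cur.mtime == saved->mtime &&
	       cur.ino == saved->ino;
}
```

A match only says the file has not changed since somebody validated it;
it says nothing about which definition of "valid" that somebody used,
which is the agreement problem above.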

>
> As a side note, while it is nice that we might make check_refname_format
> faster, I think if you _really_ want to make repos with a lot of refs
> faster, it would make more sense to introduce an on-disk format that
> does not need linear parsing (e.g., something we could mmap and binary
> search, or even something dbm-ish that could be updated without
> rewriting the whole file (deletions, for example, must rewrite the
> whole file, giving quadratic performance when deleting all refs one by
> one)).

Yeah, I bring up the idea because I think Mike's multiple ref backends
are the way to go (assuming that it won't take as long as pack v4
development). If we assume we'll go with that, then we can keep the
workaround to a minimum.
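
For what it's worth, the "mmap and binary search" lookup quoted above
could be sketched like this. It assumes a hypothetical record layout of
"<refname> <value>\n" sorted bytewise by refname, which is NOT the
current packed-refs layout; the function names are made up as well. The
buffer could come from mmap(), but the sketch only needs a pointer and
a length.

```c
#include <stddef.h>
#include <string.h>

/* Back up from pos to the first byte of the line that contains it. */
static const char *line_start(const char *buf, const char *pos)
{
	while (pos > buf && pos[-1] != '\n')
		pos--;
	return pos;
}

/*
 * Compare the refname at the start of a record with a NUL-terminated
 * key. Returns <0, 0 or >0 if the record sorts before, equal to or
 * after the key.
 */
static int cmp_record(const char *rec, const char *end, const char *key)
{
	int rec_done;

	while (rec < end && *rec != ' ' && *rec != '\n' && *key) {
		if (*rec != *key)
			return (unsigned char)*rec - (unsigned char)*key;
		rec++;
		key++;
	}
	rec_done = (rec == end || *rec == ' ' || *rec == '\n');
	if (rec_done && !*key)
		return 0;
	return rec_done ? -1 : 1;
}

/*
 * Binary search over variable-length records without parsing the whole
 * buffer. Returns a pointer to the start of the matching record, or
 * NULL if the key is not present.
 */
static const char *find_ref(const char *buf, size_t len, const char *key)
{
	const char *lo = buf, *hi = buf + len; /* both are record boundaries */

	while (lo < hi) {
		const char *mid = line_start(lo, lo + (hi - lo) / 2);
		int c = cmp_record(mid, hi, key);

		if (!c)
			return mid;
		if (c > 0) {
			hi = mid;
		} else {
			/* Continue after the record we just looked at. */
			const char *nl = memchr(mid, '\n', hi - mid);
			lo = nl ? nl + 1 : hi;
		}
	}
	return NULL;
}
```

This only covers lookups. Deleting a ref would still mean rewriting the
file, so it does not help with the quadratic deletion case.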
-- 
Duy