Re: Tag peeling peculiarities

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



[this email is from last week, and I think most of was responded to in
 other parts of the thread, but there were a few loose ends]

On Sat, Mar 16, 2013 at 02:38:12PM +0100, Michael Haggerty wrote:

> >> * Change pack-refs to use the peeled information from ref_entries if it
> >>   is available, rather than looking up the references again.
> > 
> > I don't know that it matters from a performance standpoint (we don't
> > really care how long pack-refs takes, as long as it is within reason,
> > because it is run as part of a gc).  But it would mean that any errors
> > in the file are propagated from one run of pack-refs to the next. I
> > don't know if we want to spend the extra time to be more robust.
> 
> I thought about this argument too.  Either way is OK with me.  BTW would
> it make sense to build a consistency check of peeled references into fsck?

Yeah, I do think an fsck check makes sense. It should not be expensive,
and if there is a realistic corruption/health check for a repo, it makes
sense to me for it to be part of fsck. I do not recall many incidents of
packed-refs corruption in the history of the list, but this fairly
recent one comes to mind:

  http://thread.gmane.org/gmane.comp.version-control.git/217065

On the other hand, if it is just as cheap to rewrite the file as it is
to do the health checks, I wonder if the advice should simply be "run
pack-refs again" (and doing so should always start from scratch, not
trusting the existing version).

> > Yuck. I really wonder if repack_without_ref should literally just be
> > writing out the exact same file, minus the lines for the deleted ref.
> > Is there a reason we need to do any processing at all? I guess the
> > answer is that you want to avoid re-parsing the current file, and just
> > write it out from memory. But don't we need to refresh the memory cache
> > from disk under the lock anyway? That was the pack-refs race that I
> > fixed recently.
> 
> It would certainly be thinkable to rewrite the file textually without
> parsing it again.  But I did it this way for two reasons:
> 
> 1. It would be better to have one packed-refs file parser and writer
> rather than two.  (I'm working towards unifying the two writers, and
> will continue once I understand their differences.)

I see your point, though I also feel that with the right refactoring,
they could share the parser. And the second writer would be mostly a
pass-through. But much more compelling is reason 2...

> 2. Given how peeled references were added, it seems dangerous to read,
> modify, and write data that might be in a future format that we don't
> entirely understand.  For example, suppose a new feature is added to
> mark Junio's favorite tags:
> 
>     # pack-refs with: peeled fully-peeled junios-favorites \n
>     ce432cac30f98b291be609a0fc974042a2156f55 refs/heads/master
>     83b3dd7151e7eb0e5ac62ee03aca7e6bccaa8d07 refs/tags/evil-dogs
>     ^e1d04f8aad59397cbd55e72bf8a1bd75606f5350
>     7ed863a85a6ce2c4ac4476848310b8f917ab41f9 refs/tags/lolcats
>     ^990f856a62a24bfd56bac1f5e4581381369e4ede
>     ^^^junios-favorite
>     b0517166ae2ad92f3b17638cbdee0f04b8170d99 refs/tags/nonsense
>     ^4baff50551545e2b6825973ec37bcaf03edb95fe

Hmm. Good point. It is tempting to make a rule like "extensions that
are specific to a ref must come after the ref but before the next ref".
And then the writer could simply drop any lines between the to-delete
ref and the one that follows it.

I think that would work OK in practice, but I am worried that it would
paint us into a corner later on (after all, we do not know what
extensions will be added in the future). The only thing I can think of
that would break is something where a ref or its extension depends on
previous entries.  E.g., we start prefix-compressing the ref names to
save space. Of course that would break backwards compatibility horribly
anyway, so it's a no-go. But maybe some extension would want to refer to
a prior ref. I dunno.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]