Re: [PATCH 01/17] Documentation/technical: add cruft-packs.txt

On Sat, Dec 04, 2021 at 02:20:23PM -0800, Elijah Newren wrote:
> > +== Cruft packs
> > +
> > +Cruft packs are designed to eliminate the need for storing unreachable objects
> > +in a loose state by including the per-object mtimes in a separate file alongside
> > +a single pack containing all loose objects.
>
> I had the same question as Stolee here: why not use the cruft-pack's
> mtime for all the objects in it?  Much later below, you make it clear
> that a repository will generally only have one cruft pack which kind
> of answers the question, but the repeated mention of "cruft packs"
> throughout the document subtly made me make the opposite assumption.
> It might be nice to address the almost-always-only-one-cruft-pack
> earlier on, which may also help answer the question about why you need
> to store individual mtimes in an additional file.

Responding to your suggestions out of order ;-). Throughout the
document, I wrote "cruft packs" in the sense of "the feature this series
implements", not "multiple cruft packs".

But my wording is unintentionally vague, especially because this
document does talk about why this series stores unreachable objects in a
single cruft pack. I updated my copy to make clear the difference
between the two, which should hopefully avoid any confusion here in the
future.

As far as why not use the cruft pack's timestamp as the mtime for all of
the unreachable objects contained within it, there are a few reasons:

It makes freshening objects more complicated. Not because we couldn't
freshen individual objects (we would likely do so in the same way this
series does, by rewriting each one loose and using the loose copy's mtime
instead), but because it makes it complicated to repack a repository
with many cruft packs. If I have a handful of cruft packs, and freshen a
handful of objects within them, I now need to update many cruft packs,
or pay the price of storing their objects twice (if I instead don't
rewrite them and keep the loose copies around).
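
To make that cost concrete, here is a toy model of the rejected design in
which each cruft pack carries one mtime for all of its objects. It is only a
sketch under that assumption, not how Git stores packs: `CruftPack`,
`packs_to_rewrite`, and `duplicated_objects` are hypothetical names invented
for illustration, and object IDs are shortened to single letters.

```python
from dataclasses import dataclass, field


@dataclass
class CruftPack:
    """Toy stand-in for a cruft pack whose single mtime covers every object."""
    mtime: int
    objects: set = field(default_factory=set)


def packs_to_rewrite(packs, freshened):
    """Indices of packs holding at least one freshened object.

    Under the per-pack-mtime design, each of these packs would have to be
    rewritten to drop the objects that were freshened.
    """
    return [i for i, pack in enumerate(packs) if pack.objects & freshened]


def duplicated_objects(packs, freshened):
    """Objects stored twice if we keep the loose copies and rewrite nothing."""
    return sum(len(pack.objects & freshened) for pack in packs)


packs = [
    CruftPack(mtime=100, objects={"a", "b", "c"}),
    CruftPack(mtime=200, objects={"d", "e"}),
    CruftPack(mtime=300, objects={"f", "g", "h"}),
]
freshened = {"a", "e", "h"}

print(packs_to_rewrite(packs, freshened))    # [0, 1, 2]
print(duplicated_objects(packs, freshened))  # 3
```

Freshening three objects spread across three packs touches every pack, or
leaves three objects stored both packed and loose.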

It also makes it impossible to share deltas between cruft objects that
don't have the same timestamp, unless the cruft packs are stored thin
(in which case it becomes much more complicated to figure out which
cruft packs can be safely pruned without storing information about which
other packs a thin pack has deltas against).
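
The delta problem can be sketched the same way. The snippet below is a toy
model, not Git's pack format: it assumes one cruft pack per distinct mtime
for the rejected design, and `cross_pack_deltas`, the `mtimes` table, and the
`delta_base` table are all hypothetical. It lists the objects whose delta
base would land in a different pack, which is exactly the set that would
need either a thin pack or a full, undeltified copy.

```python
def cross_pack_deltas(delta_base, pack_of):
    """Return objects whose delta base lives in a different pack.

    delta_base maps an object to the object it is stored as a delta against.
    pack_of maps an object to an identifier for the pack that contains it.
    """
    return sorted(
        obj for obj, base in delta_base.items()
        if pack_of(obj) != pack_of(base)
    )


# Hypothetical unreachable objects and their per-object mtimes.
mtimes = {"a": 100, "b": 100, "c": 200, "d": 300}
# Hypothetical delta chains: b and c are deltas against a, d against c.
delta_base = {"b": "a", "c": "a", "d": "c"}

# Rejected design: one pack per distinct mtime.
per_mtime = cross_pack_deltas(delta_base, lambda obj: mtimes[obj])
# Single cruft pack with per-object mtimes kept alongside it.
single = cross_pack_deltas(delta_base, lambda obj: "cruft")

print(per_mtime)  # ['c', 'd']
print(single)     # []
```

With a single pack, every delta base sits in the same pack as its delta, so
nothing has to be stored thin.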

I'm sure there were others, but these are the ones that I could recall
off the top of my head. This all felt like a little too much detail for
the "alternative designs" section, but if you think some or all of this
would be worth recording somewhere more permanent than the mailing list,
let me know.

Thanks,
Taylor


