Re: Keeping unreachable objects in a separate pack instead of loose?

On Mon, Jun 11, 2012 at 06:14:39PM -0400, Ted Ts'o wrote:

> > Speaking of which, what is the mtime of the newly created cruft pack? Is
> > it the current mtime? Then those unreachable objects will stick for
> > another 2 weeks, instead of being back-dated to their pack's date. You
> > could back-date to the mtime of the most recent deleted pack, but that
> > would still prolong the life of objects from the older packs. It may be
> > acceptable to just ignore the issue, though; they will expire
> > eventually.
> 
> Well, we have that problem today when "git pack-objects
> --unpack-unreachable" explodes unreferenced objects --- they are
> written with the current mtime.

No, we don't; they get the mtime of the pack they are coming from (and
if the pack is older than pruneExpire, they are not exploded at all,
since they would just be pruned immediately anyway).

So an exploded object might have only a day or an hour to live after the
explosion, but with your strategy they always get two weeks.
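
To make the difference concrete, here is a minimal Python sketch of the two
expiry rules described above. It assumes a two-week grace period (the "two
weeks" mentioned in this thread); the function names are made up for
illustration and are not part of git.

```python
from datetime import datetime, timedelta

# Assumption: the grace period is two weeks, as discussed in this thread.
PRUNE_EXPIRE = timedelta(weeks=2)


def life_after_explosion(pack_mtime, now, expire=PRUNE_EXPIRE):
    """Remaining lifetime of an unreachable object exploded from a pack.

    The loose object inherits the mtime of the pack it came from. If the
    pack is already older than the grace period, the object is not
    exploded at all, which is modeled here by returning None.
    """
    age = now - pack_mtime
    if age >= expire:
        return None
    return expire - age


def life_in_fresh_cruft_pack(expire=PRUNE_EXPIRE):
    """Remaining lifetime if the cruft pack is stamped with the current time."""
    return expire


now = datetime(2012, 6, 11, 18, 0)
print(life_after_explosion(now - timedelta(days=13), now))  # 1 day, 0:00:00
print(life_after_explosion(now - timedelta(days=15), now))  # None
print(life_in_fresh_cruft_pack())                           # 14 days, 0:00:00
```

An object from a 13-day-old pack has one day left once exploded, while the
same object in a freshly stamped cruft pack gets the full two weeks again.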

> I assume you're worried about pre-existing loose objects that get
> collected up into a new cruft pack, since they would get the extra two
> weeks of life.  Given how much more efficient storing the cruft
> objects in a pack is, I think ignoring the issue makes the most
> sense, since it's a one-time extension, and the extra
> objects really won't do any harm.

I'm more specifically worried about large objects which are no better in
packs than they are in loose form (e.g., video files). This strategy is
a regression, since we are not saving space by putting them in a pack,
but we are keeping them around much longer. It also makes it harder to
just run "git prune" to get rid of large objects (since prune will never
kill off a pack), or to manually delete files from the object database.
You have to run "git gc --prune=now" instead, so it can make a new pack
and throw away the old bits (or run "git repack -ad").

> One last thought: if a sysadmin is really hard up for space, (and if
> the cruft objects include some really big sound or video files) one
> advantage of labelling the cruft packs explicitly is that someone who
> really needs the space could potentially find the oldest cruft files
> and delete them, since they would be tagged for easy findability.

No! That's exactly what I was worried about with the name. It is _not_
safe to do so. It's only safe after you have done a full repack to
rescue any non-cruft objects.
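
A toy model of why, in Python. The names here (safe_to_delete, the object
IDs) are hypothetical and this is not how git stores packs; it only
illustrates the assumption that a pack labelled as cruft may hold the only
copy of an object that is reachable.

```python
def safe_to_delete(pack, other_storage, reachable):
    """True only if every reachable object in `pack` also exists elsewhere."""
    needed_here = (pack & reachable) - other_storage
    return not needed_here


cruft_pack = {"a1", "b2", "c3"}
main_pack = {"d4", "e5"}
reachable = {"d4", "e5", "b2"}   # hypothetical: "b2" is referenced again

print(safe_to_delete(cruft_pack, main_pack, reachable))  # False

# A full repack copies every reachable object into the new main pack.
main_pack = main_pack | (cruft_pack & reachable)
print(safe_to_delete(cruft_pack, main_pack, reachable))  # True
```

Only after the repack step does deleting the cruft pack lose nothing that
is still needed.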

-Peff