Re: [PATCH v3] Prevent megablobs from gunking up git packs

On Sat, 26 May 2007, Dana How wrote:

> I've been discussing these plans with IT here since they maintain
> everything else.  They would like any part of the database that is
> going to be reorganized and replaced to be backed up first.  If only
> (1) is available, and I repack every night, then I need to back up
> the entire repository every night as well.

Why so?  The initial repack would create a set of packs where the last 
packs to be produced contain the large blobs, which you then never have 
to repack.  Or maybe you produce large blobs every day and want to 
prevent those from entering the pack up front?
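
For what it's worth, the workflow I have in mind for (1) is simply to 
mark the large-blob packs produced by that initial repack as kept, so 
that later repacks leave them alone.  Roughly something like this (my 
own sketch, not part of any patch; it just drops an empty .keep file 
next to every existing pack):

#!/usr/bin/env python
# Rough sketch, not part of any patch: mark every existing pack as "kept"
# so that subsequent "git repack -a -d" runs leave it alone.
# Run from the top of the work tree.
import os

pack_dir = os.path.join(".git", "objects", "pack")

for name in os.listdir(pack_dir):
    if name.endswith(".pack"):
        keep = os.path.join(pack_dir, name[:-len(".pack")] + ".keep")
        if not os.path.exists(keep):
            # an empty .keep file is enough for repack to skip this pack
            open(keep, "w").close()
            print("kept %s" % name)

A nightly repack then only has to deal with whatever accumulated since.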

> If I use (2) or (3), then I back up just the repacked portion each night,
> back up the kept packs only when they are repacked (on a slower schedule),
> and/or back up the loose blobs on a similar schedule.
> 
> Besides this backup issue, I simply don't want to have to repack _all_
> of such a large repository each night.  With (1), nightly repacks get
> longer and longer, and harder to schedule.
> 
> I think the minimum features needed to support (2) and (3) are the same:
> (a) An easy way to prevent loose blobs exceeding some size limit
>     from migrating into "nice" packs;
> (b) A way to prevent packed objects from being copied when
>     (i) they no longer meet the (new or reduced) size limit AND
>     (ii) they exist in some other safe form in the repository.
> The behavior of --max-blob-size=N in this patch provides both of these
> while dropping the other behavior people didn't like.
> 
> You mentioned "incoherency" above;
> I'm not too sure how to proceed on that.
> If you have a more coherent way to provide (a) and (b) above,
> please let me know.
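
Just so we are reading (a) the same way: what I understand it to mean is 
roughly the toy driver below, where only loose blobs under some limit are 
fed to pack-objects and the big ones stay loose.  The 10 MB figure and 
the script itself are purely my own illustration, not what your patch 
does inside pack-objects.

#!/usr/bin/env python
# Illustration only (assumed 10 MB limit, not the patch's code): feed
# pack-objects just the loose objects whose blobs stay under the limit,
# so oversized blobs never migrate into the "nice" packs.
# Run from the top of the work tree.
import os
import subprocess

LIMIT = 10 * 1024 * 1024   # assumed size limit

def git(*args, **kw):
    return subprocess.run(("git",) + args, stdout=subprocess.PIPE,
                          check=True, text=True, **kw).stdout

def loose_objects():
    objdir = os.path.join(".git", "objects")
    for sub in os.listdir(objdir):
        if len(sub) == 2:                     # skip pack/ and info/
            for rest in os.listdir(os.path.join(objdir, sub)):
                yield sub + rest

small = [sha1 for sha1 in loose_objects()
         if git("cat-file", "-t", sha1).strip() != "blob"
         or int(git("cat-file", "-s", sha1)) <= LIMIT]

if small:
    # pack-objects reads object names on stdin, writes pack-<sha1>.{pack,idx}
    git("pack-objects", os.path.join(".git", "objects", "pack", "pack"),
        input="\n".join(small) + "\n")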

I think it boils down to a question of proper wording.  Describing this 
as max-blob-size is misleading if you can still end up with larger blobs 
in your pack.  I see two ways to resolve this incoherency: either the 
feature is renamed to reflect the fact that it concerns itself only with 
the migration of loose blobs into the packed space (I cannot come up 
with a good name though), or the whole pack-objects process is aborted 
with an error whenever the max-blob-size condition cannot be satisfied 
because large blobs exist in packed form only, indicating that a 
separate large-blob extraction pass is required.
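
The abort case would essentially be the situation this last sketch looks 
for: blobs above the limit that are still reachable but no longer exist 
in loose form, so pack-objects could not honor the limit without 
extracting them first.  Again the limit and the script are assumptions 
of mine for illustration, not code from the patch.

#!/usr/bin/env python
# Illustration only (assumed 10 MB limit, not the patch's code): list
# reachable blobs above the limit that exist solely in packed form,
# i.e. the case where pack-objects could not honor the limit without
# a separate extraction pass.  Run from the top of the work tree.
import os
import subprocess

LIMIT = 10 * 1024 * 1024   # assumed size limit

def git(*args):
    return subprocess.run(("git",) + args, stdout=subprocess.PIPE,
                          check=True, text=True).stdout

def is_loose(sha1):
    # only checks the local object directory, not alternates
    return os.path.exists(os.path.join(".git", "objects",
                                       sha1[:2], sha1[2:]))

for line in git("rev-list", "--objects", "--all").splitlines():
    sha1 = line.split()[0]
    if git("cat-file", "-t", sha1).strip() != "blob":
        continue
    if int(git("cat-file", "-s", sha1)) > LIMIT and not is_loose(sha1):
        print("%s is larger than the limit and packed only" % sha1)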


Nicolas
