Re: Disk waste with packs and .keep files

On Tue, Jun 10, 2014 at 10:21:03AM +0200, Matthieu Moy wrote:

> Since a few weeks however, Git started wasting my disk space: instead of
> creating small .pack files next to the big .keep-ed pack files, it seems
> to create redundant, big .pack files (i.e. I get N pack files of similar
> size). "git verify-pack" confirms that, for example, the object
> corresponding to the root commit is contained in each of the .pack file.
>
> I don't have a reproducible way to get the situation so I didn't bisect,
> but "git log --grep .keep" points me to this which seems related:
> 
>   commit ee34a2beadb94a9595f09af719e3c09b485ca797
>   Author: Jeff King <peff@xxxxxxxx>
>   Date:   Mon Mar 3 15:04:20 2014 -0500
> 
>     repack: add `repack.packKeptObjects` config var

Eek. Does anybody have a brown paper bag I can borrow?

-- >8 --
Subject: repack: do not accidentally pack kept objects by default

Commit ee34a2b (repack: add `repack.packKeptObjects` config
var, 2014-03-03) added a flag which could duplicate kept
objects, but did not mean to turn it on by default. Instead,
the option is tied by default to the decision to write
bitmaps, like:

  if (pack_kept_objects < 0)
      pack_kept_objects = write_bitmap;

after which we expect pack_kept_objects to be a boolean 0 or
1.  However, that assignment neglects that write_bitmap is
_also_ a tri-state with "-1" as the default. With neither
option given, pack_kept_objects is left at -1, which is
non-zero and therefore treated as true, so we accidentally
turn the option on.
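
To make the failure mode concrete, here is a minimal
standalone sketch of the two assignments. It is not code from
git: the helper names resolve_buggy() and resolve_fixed() are
made up for illustration, and only the tri-state convention
(-1 unset, 0 off, 1 on) is taken from the description above.

```c
/*
 * Illustration only; these helpers are hypothetical and do not
 * exist in git. Both take tri-state inputs: -1 means "not given",
 * 0 means off, 1 means on.
 */

/* The original assignment: copies the tri-state through unchanged. */
static int resolve_buggy(int pack_kept_objects, int write_bitmap)
{
	if (pack_kept_objects < 0)
		pack_kept_objects = write_bitmap;
	return pack_kept_objects;
}

/* The patched assignment: collapses write_bitmap to a boolean first. */
static int resolve_fixed(int pack_kept_objects, int write_bitmap)
{
	if (pack_kept_objects < 0)
		pack_kept_objects = write_bitmap > 0;
	return pack_kept_objects;
}
```

With neither option given, resolve_buggy(-1, -1) returns -1,
which any later "if (pack_kept_objects)" test treats as true,
while resolve_fixed(-1, -1) returns 0.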

This patch is the minimal fix to restore the desired
behavior for the default state. However, the real fix will
be more involved.

The decision to turn on bitmaps via config is actually made
in pack-objects itself (which is why we need write_bitmap as
a tri-state here; we only pass the override option if the
user gave us a command-line option). To tie the options
together correctly, we need to either pass the "don't know"
tri-state down to pack-objects (which would also read
repack.packKeptObjects), or pull the reading of
pack.writebitmaps up to the repack level.

Signed-off-by: Jeff King <peff@xxxxxxxx>
---
I think the latter makes the most sense, and it was a mistake to read
the option in pack-objects in the first place. We _never_ want to
write bitmaps when packing to stdout, or even when doing a non-complete
repack. We had to teach pack-objects special logic to turn bitmaps off
in that case, but the right solution instead is that pack-objects should
always respect the --write-bitmap-index flag on the command line, and
the callers should drive that decision (and really only "repack -[aA]"
would want to use it). And then the fix here will just come out
naturally from that.
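
As a rough sketch of what "pull the reading up to the repack level"
could look like, assuming repack has already read pack.writebitmaps
itself: everything below is hypothetical (the struct, the function
name, and the parameters config_write_bitmaps and all_into_one are
invented for illustration and are not the eventual series). The idea
is that repack resolves both tri-states to plain booleans before
pack-objects is ever invoked.

```c
/*
 * Hypothetical sketch, not git code. Tri-state convention as in the
 * patch: -1 means "not given", 0 means off, 1 means on.
 */
struct repack_opts {
	int write_bitmap;      /* command-line value, -1 if not given */
	int pack_kept_objects; /* command line or config, -1 if not given */
};

/*
 * config_write_bitmaps: assumed to be the pack.writebitmaps value as
 *                       read by repack itself (0 or 1).
 * all_into_one:         assumed to be non-zero for "repack -[aA]".
 */
static void resolve_repack_opts(struct repack_opts *o,
				int config_write_bitmaps,
				int all_into_one)
{
	/* Fall back to config only for a complete repack. */
	if (o->write_bitmap < 0)
		o->write_bitmap = config_write_bitmaps && all_into_one;

	/* write_bitmap is now a real boolean, so this is safe. */
	if (o->pack_kept_objects < 0)
		o->pack_kept_objects = o->write_bitmap;
}
```

Once write_bitmap has been resolved first, the plain assignment to
pack_kept_objects no longer needs the "> 0" guard, which is the sense
in which the fix would come out naturally.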

I'll work up a series, but we may want to fast-track this patch for
maint. It's a fairly big regression in v2.0. We didn't notice because
it's only an optimization issue, not a correctness one, and I guess not
that many people use .keep packs.

 builtin/repack.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/builtin/repack.c b/builtin/repack.c
index 6b0b62d..17bc8da 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -191,7 +191,7 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
 				git_repack_usage, 0);
 
 	if (pack_kept_objects < 0)
-		pack_kept_objects = write_bitmap;
+		pack_kept_objects = write_bitmap > 0;
 
 	packdir = mkpathdup("%s/pack", get_object_directory());
 	packtmp = mkpathdup("%s/.tmp-%d-pack", packdir, (int)getpid());
-- 
2.0.0.729.g520999f
