Hi.
I've been dealing with a Subversion repository that contains a lot of
large binaries. Git generally handles them reasonably well, although
'git gc' chokes on this git-svn repository. The packs total 2.7
gigabytes, and about 2.4 gigabytes of that is the 250 individual blob
revisions of large binaries.
Sometimes 'git gc' runs out of memory. When that happens, I have to
find out which file is causing the problem so I can add it to
.gitattributes with a '-delta' flag. Mostly, though, the repacking just
takes forever, and I dread running the operation.
As an experiment, I added a '-pack' flag to .gitattributes. Objects for
paths matching a .gitattributes entry with this flag are left loose in
the repository. During a 'git gc', instead of recopying gigabytes of
data each time, the loose objects are used as-is. 'git gc' runs very
quickly with this change.
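
The per-path decision can be modeled in a few lines. This is a minimal
standalone sketch, not code from git: the enum and the name
must_pack_model() are hypothetical, and the '*.psd' pattern in the
comment is only an example. It assumes nothing beyond the usual
tri-state attribute behaviour (set, unset with a leading '-', or
unspecified).

```c
/*
 * Hypothetical model of the '-pack' decision.
 *
 * Example .gitattributes entry (the pattern is made up):
 *
 *     *.psd -delta -pack
 */
enum pack_attr_state {
	PACK_ATTR_UNSPECIFIED,	/* no entry mentions "pack" for this path */
	PACK_ATTR_SET,		/* entry says "pack" */
	PACK_ATTR_UNSET		/* entry says "-pack" */
};

/*
 * Returns 1 if the object should go into a pack, 0 if it should be
 * left loose. Anything other than an explicit "-pack" keeps the
 * default behaviour, including a failed attribute lookup.
 */
static int must_pack_model(int lookup_failed, enum pack_attr_state state)
{
	if (lookup_failed)
		return 1;
	return state != PACK_ATTR_UNSET;
}
```

Only an explicit '-pack' changes anything, so paths without a matching
entry are packed exactly as before.
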
The only issue I've found is in too_many_loose_objects(). gitk is
always telling me the repository needs to be packed, obviously because
of all the loose objects.
I haven't yet come up with a good idea for handling this. I thought
about putting the forced loose objects in a separate directory. (This
idea goes along with another that I want to build on top of this
functionality, the ability to commit and have -pack binaries go to an
alternates location.) I have also thought about writing out a file with
the count of forced loose objects and using that to drive the
guesstimate made by too_many_loose_objects() down.
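
The second idea could look roughly like the sketch below. It is not a
patch against git: both function names are hypothetical, and the
recorded count of forced loose objects is assumed to come from some
file that does not exist yet. It also assumes that
too_many_loose_objects() samples a single fan-out directory
(objects/17) and compares the number of entries there against the
gc.auto threshold divided by 256, rounded up.

```c
/*
 * Hypothetical sketch: discount forced loose objects from the
 * single-directory estimate used to decide whether to repack.
 */

/* Per-directory threshold derived from the repository-wide gc.auto. */
static int loose_threshold_per_dir(int gc_auto_threshold)
{
	return (gc_auto_threshold + 255) / 256;
}

/*
 * sampled:            loose objects counted in the sampled directory
 * forced_loose_total: recorded number of '-pack' objects in the
 *                     whole repository (assumed to be stored somewhere)
 *
 * Returns 1 if the adjusted estimate still exceeds the threshold.
 */
static int too_many_loose_adjusted(int sampled, int forced_loose_total,
				   int gc_auto_threshold)
{
	/* Share of forced loose objects expected in one of 256 dirs. */
	int expected_forced = (forced_loose_total + 255) / 256;
	int adjusted = sampled - expected_forced;

	if (adjusted < 0)
		adjusted = 0;
	return adjusted > loose_threshold_per_dir(gc_auto_threshold);
}
```

Rounding the expected share up errs on the side of not nagging. Since
object names are spread evenly, the sampled directory should hold about
1/256 of the forced loose objects, but a small count makes that a
coarse guess.
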
Does anyone have any thoughts?
Thanks!
Josh
---
builtin/pack-objects.c | 25 +++++++++++++++++++++++++
1 files changed, 25 insertions(+), 0 deletions(-)
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 214d7ef..f33a7fb 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -644,6 +644,28 @@ static int no_try_delta(const char *path)
 	return 0;
 }
 
+static void setup_pack_attr_check(struct git_attr_check *check)
+{
+	static struct git_attr *attr_pack;
+
+	if (!attr_pack)
+		attr_pack = git_attr("pack");
+
+	check[0].attr = attr_pack;
+}
+
+static int must_pack(const char *path)
+{
+	struct git_attr_check check[1];
+
+	setup_pack_attr_check(check);
+	if (git_checkattr(path, ARRAY_SIZE(check), check))
+		return 1;
+	if (ATTR_FALSE(check->value))
+		return 0;
+	return 1;
+}
+
 static int add_object_entry(const unsigned char *sha1, enum object_type type,
 			    const char *name, int exclude)
 {
@@ -667,6 +689,9 @@ static int add_object_entry(const unsigned char *sha1, enum object_type type,
 	if (!exclude && local && has_loose_object_nonlocal(sha1))
 		return 0;
 
+	if (name && !must_pack(name))
+		return 0;
+
 	for (p = packed_git; p; p = p->next) {
 		off_t offset = find_pack_entry_one(sha1, p);
 		if (offset) {
--
1.7.1.msysgit.3.1.g108b5.dirty