Re: [PATCH] diff-delta: produce optimal pack data

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Feb 24, 2006 at 04:12:14PM -0500, Nicolas Pitre wrote:
> On Fri, 24 Feb 2006, Carl Baldwin wrote:
> > After the twelve large objects were packed into individual packs the
> > rest of the packing went very quickly and git v1.2.3's date reuse worked
> > very well.  This was sort of my attempt at simulating how things would
> > be if git avoided deltification of each of these large files. I'm sorry
> > to have been so harsh earlier I just didn't understand that
> > incrementally packing one-by-one was going to help this much.
> 
> Hmmmmmmm....
> 
> I don't think I understand what is going on here.
> 
> You say that, if you add those big files and incrementally repack after 
> each commit using git repack with no option, then it requires only about 
> one second each time.  Right?

Well, actually I was packing them individually by calling
git-pack-objects directly on each blob.

I'll try doing it exactly as you describe...

Ok, I tried it.  Basically I do the following.

% mkdir test
% cd test
% git init-db
% cp ../files/binfile.1 binfile
% time git add binfile

real    0m2.459s
user    0m2.443s
sys     0m0.019s
% git commit -a -m "Rev 1"
% time git repack
[snip]

real    0m1.111s
user    0m1.046s
sys     0m0.061s
% for i in $(seq 2 12); do
    cp ../files/binfile.$i binfile
    time git commit -a -m "Rev $i"
    time git repack
done

Each commit takes around 2.8-3.5 seconds and each repack takes about
1.2-1.5 seconds.  These are prettly reasonable.

Now, I try 'git repack -a -f' (or even without -f) and it goes out to
lunch.  I think it would take on the order of a day to actually finish
because it wasn't very far after an hour.

[snip]
> How many files besides those 12 big blobs do you have?

This repository has been completely stripped to the 12 revisions of the
one file.  So, there are 36 objects.

12 blobs.
12 trees.
12 commits.

That is all.

[snip]
> If you can add them into a single .tgz with instructions on how 
> to reproduce the issue and provide me with an URL where I can fetch it 
> that'd be perfect.

I will do this in an email off of the list because these files really
shouldn't be available on a public list.

Carl

-- 
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
 Carl Baldwin                        RADCAD (R&D CAD)
 Hewlett Packard Company
 MS 88                               work: 970 898-1523
 3404 E. Harmony Rd.                 work: Carl.N.Baldwin@xxxxxx
 Fort Collins, CO 80525              home: Carl@xxxxxxxxxxxxx
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
-
: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]