Re: sharing object packs

Thank you, Shawn.

On Thu, Jun 19, 2008 at 11:01 AM, Shawn O. Pearce <spearce@xxxxxxxxxxx> wrote:
> marc.zonzon+git@xxxxxxxxx wrote:
>> I have a big bare repository 'main.git' and many small git repositories sub1, sub2, ... subn.
>>
>> All repositories lie in the same file system, and each subx
>> repository tracks and fetches main.git in a remote branch.
>>
>> I would like to avoid duplicating main.git objects
> ...
>> - Using an objects/info/alternates with the path of main.git object repository.
>> It works well too, but I import objects from main.git inside subx,
>> and they don't have the same lifetime as those in main.git. So
>> they can disappear during a git-prune-packed or gc. (The same
>> problem we have with: git clone --shared)
>
> This is the approach you want to use.  The risk is that you do
> not allow objects to be added to main.git to later be deleted from
> main.git.  This means main.git cannot rewind/reset/delete a branch.
>
> If that is not acceptable perhaps you could instead create 3 tiers:
>
>        main.git ---
>                    \
>                    shared.git
>                    /
>        subx.git ---
>
> Have main.git and subx.git both use shared.git as an alternate
> (place path of shared.git/objects in their objects/info/alternates).
> You can still allow subx.git to fetch main.git.
>

Your three-tier solution seems to solve the problems I met when trying
to use main.git as an alternate.
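
For reference, a minimal sketch of that three-tier layout. It assumes
throwaway repositories created in a temporary directory; the name sub1
and the temporary path are only illustrative, and the final check relies
on `git count-objects -v` listing configured alternates.

```shell
#!/bin/sh
# Sketch only: build an empty three-tier layout in a scratch directory.
set -e
top=$(mktemp -d)
cd "$top"

git init -q --bare shared.git
git init -q --bare main.git
git init -q sub1            # hypothetical stand-in for one of the subx

# Each repository lists shared.git's object directory as an alternate.
echo "$top/shared.git/objects" >> main.git/objects/info/alternates
echo "$top/shared.git/objects" >> sub1/.git/objects/info/alternates

# count-objects -v prints an "alternate:" line per configured alternate.
alt_main=$(git -C main.git count-objects -v | grep '^alternate:')
alt_sub1=$(git -C sub1 count-objects -v | grep '^alternate:')
echo "$alt_main"
echo "$alt_sub1"
```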

But I feel we can make it even safer than what you describe:

> Push only stable commits to shared.git that will never be
> rewind/reset/deleted.  Once something enters shared.git it should
> never be deleted.  This way shared objects will not be removed by
> git-prune or git-gc.  Every so often push newer stable branches from
> main.git to shared.git, once they cannot be rewind/reset/deleted.
>

My option is to fetch into shared.git not only the branches of main.git,
but also all the branches of every subx.

So shared.git hosts all the objects of main.git and of all the subx combined.

Then there is no problem in resetting, deleting, or rewinding a
branch in main, even if you then fetch the reset branch into shared (a
non-fast-forward fetch). The objects of the deleted branch are either
not used in any subx, in which case nothing is lost when they are
pruned, or they have been imported into some branch and they will be
kept, since there is a reference to them.
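
A small sketch of this argument, under some assumptions: main is a
non-bare scratch repository here (only so that commits are easy to
create), sub1 is a hypothetical subx, and the remote names are chosen
for the example. After main is rewound and shared.git fetches again,
the rewound commit is still referenced through sub1's remote-tracking
branch.

```shell
#!/bin/sh
# Sketch only: shared.git tracks both main and sub1, then main is rewound.
set -e
top=$(mktemp -d)
cd "$top"
# Helper so commits work without any global git identity (demo values).
g() { git -c user.name=demo -c user.email=demo@example.invalid "$@"; }

git init -q main
g -C main commit -q --allow-empty -m base
g -C main commit -q --allow-empty -m extra
extra=$(git -C main rev-parse HEAD)

git clone -q main sub1      # sub1 now has "extra" in its own branch

git init -q --bare shared.git
git -C shared.git remote add main "$top/main"
git -C shared.git remote add sub1 "$top/sub1"
git -C shared.git fetch -q --all

# Rewind main, then fetch again: a forced, non-fast-forward update.
git -C main reset -q --hard HEAD~1
git -C shared.git fetch -q --all

# List the remote-tracking branches of shared.git that still contain it.
kept=$(git -C shared.git branch -r --contains "$extra")
echo "$kept"
```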


> Repack main.git and subx.git using `git gc` as that includes the
> -l flag to `git repack`.  Any objects which are now available from
> shared.git will not be included in main.git or subx.git, so their
> usage will shrink after shared.git is updated.
>

Yes, I tested that. With git gc I saw no immediate shrinking; I suppose
we have to wait for gc.pruneExpire to see the result.

But:
 * fetching the remotes of shared.git,
 * packing shared.git,
 * packing and pruning (with git prune) the directories subx and
 main.git

immediately reduces the object store of the subx to nearly nothing.
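
The three steps above can be sketched roughly as follows. This is an
illustration on scratch repositories, not a recipe: main is non-bare
here, sub1 is a hypothetical subx, and the exact repack flags are my
assumption of what "packing" means in each step. The loose-object
counts are printed rather than assumed.

```shell
#!/bin/sh
# Sketch only: fetch into shared.git, pack it, then repack and prune sub1.
set -e
top=$(mktemp -d)
cd "$top"
# Helper so commits work without any global git identity (demo values).
g() { git -c user.name=demo -c user.email=demo@example.invalid "$@"; }
# Number of loose objects in a repository, taken from count-objects -v.
loose() { git -C "$1" count-objects -v | awk '/^count:/ {print $2}'; }

git init -q main
echo hello > main/file.txt
g -C main add file.txt
g -C main commit -q -m base
git clone -q --no-hardlinks main sub1

git init -q --bare shared.git
git -C shared.git remote add main "$top/main"
git -C shared.git remote add sub1 "$top/sub1"
echo "$top/shared.git/objects" >> sub1/.git/objects/info/alternates
before=$(loose sub1)

# 1. Fetch the remotes of shared.git.
git -C shared.git fetch -q --all
# 2. Pack shared.git.
git -C shared.git repack -a -d -q
# 3. Pack sub1 (-l leaves out objects borrowed from alternates), then prune.
git -C sub1 repack -a -d -l -q
git -C sub1 prune

after=$(loose sub1)
echo "loose objects in sub1: before=$before after=$after"
```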

> If you also configure gc.packrefs to never in shared.git and
> symlink shared.git/refs into main.git/refs/shared and also into
> subx.git/refs/shared and do this configuration on both server and
> client systems you can have everyone transfer only the minimal
> objects necessary.

Thank you also for this setting. My knowledge of git's transfer
mechanism is still too limited to understand it without further
explanation or reading. If you can give some pointers, they are welcome.


This solution seems well suited to implementing some kind of submodules.

I suppose we could also use this three-tier solution to do a smarter
clone --shared with the following scheme:

# mkdir shared
# cd shared
# git init
# cd ..
# git clone --no-hardlinks --bare shared shared.git
# rm -rf shared
# cd shared.git
# git remote add -f repo ../repo
 * [new branch]      master     -> repo/master
# cd ..
# git clone repo repo_copy
# echo "$PWD/shared.git/objects" >> repo/.git/objects/info/alternates
# cd shared.git
# git remote add -f repo_copy ../repo_copy
 * [new branch]      master     -> repo_copy/master
# cd ..
# echo "$PWD/shared.git/objects" >> repo_copy/.git/objects/info/alternates

Then comes the sequence of repack, gc, and prune outlined above.

But I do not yet have enough experience with git to foresee the
consequences of these settings.


All criticism is welcome.

Marc
