Re: [PATCH] Introduce git-mirror, a tool for exactly mirroring another repository.

Shawn Pearce <spearce@xxxxxxxxxxx> writes:

> Sometimes its handy to be able to efficiently backup or mirror one
> Git repository to another Git repository by employing the native
> Git object transfer protocol.  But when mirroring or backing up a
> repository you really want:
>
>   1) Every object in the source to go to the mirror.
>   2) Every ref in the source to go to the mirror.
>   3) Any ref removed from the source to be removed from the mirror.
>   4) Automatically repack and prune the mirror when necessary.
>
> and since git-fetch doesn't do 2, 3, and 4 here's a tool that does.

Just a note.  I usually use git-push the other way for backups,
and I believe that is how Linus does it, too.
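
For reference, a push-based backup along those lines might look
roughly like the sketch below. The helper name backup_push, the
remote name "backup", and the GIT override (there only so the
commands can be dry-run) are all assumptions for illustration. It
covers branches and tags only, and it does not remove refs that
have disappeared from the source; git versions that support
"git push --mirror" handle that case as well.

```shell
# Hypothetical helper: push all branches and tags to a backup remote.
# GIT can be set to "echo" to print the commands instead of running them.
backup_push() {
    remote=${1:-backup}
    # --all pushes refs/heads/*, --tags pushes refs/tags/*; they are
    # issued as two separate commands.
    ${GIT:-git} push --force --all "$remote" &&
    ${GIT:-git} push --force --tags "$remote"
}
```

Run from the repository being backed up, e.g. "backup_push backup".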

> diff --git a/git-mirror.perl b/git-mirror.perl
> new file mode 100755
> index 0000000..bff2003
> --- /dev/null
> +++ b/git-mirror.perl
> @@ -0,0 +1,111 @@
> +#!/usr/bin/env perl

Please don't.  "#!/usr/bin/env perl" is a disease.

> +# This file is licensed under the GPL v2, or a later version
> +# at the discretion of Linus.

Heh ;-).

> +use warnings;
> +use strict;
> +use Git;
> +
> +sub ls_refs ($$);

I wonder why people like line-noise prototypes.  Do you ever
call ls_refs with parameters that benefit from this?  Otherwise
I prefer not to see them.

> +my $remote = shift || 'origin';
> +my $repo = Git->repository();
> +
> +# Verify its OK to execute in this repository.
> +#
> +my $mirror_ok = $repo->config('mirror.allowed') || 0;
> +unless ($mirror_ok =~ /^(?:true|t|yes|y|1)$/i) {

This _is_ ugly.  Doesn't $repo->config() know how to drive the
underlying "git-repo-config" with a specific type argument?
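
To illustrate what the type argument buys: with --bool the config
command itself normalizes the value, so the caller compares
against a single string instead of matching a regexp. A minimal
sketch, assuming current git where the command is spelled
"git config"; the helper name mirror_allowed and the GIT override
are hypothetical.

```shell
# Hypothetical helper: succeed only if mirror.allowed is a true boolean.
# GIT may be overridden with a stub command for testing.
mirror_allowed() {
    # --bool prints the literal string "true" or "false" for any
    # value git accepts as a boolean; an unset key prints nothing.
    value=$(${GIT:-git} config --bool --get mirror.allowed 2>/dev/null)
    [ "$value" = true ]
}
```

Usage would be along the lines of
"mirror_allowed || { echo 'mirroring not allowed here' >&2; exit 1; }".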

> +# Delete any local refs which the server no longer contains.
> +#
> +foreach my $ref (keys %$local_refs) {
> +	next if $remote_refs->{$ref};
> +	print "removing $ref\n";
> +	my $log = "logs/$ref";
> +	unlink($repo->repo_path() . '/' . $ref);
> +	unlink($repo->repo_path() . '/' . $log);
> +	rmdir($repo->repo_path() . '/' . $ref) while $ref =~ s,/[^/]*$,,;
> +	rmdir($repo->repo_path() . '/' . $log) while $log =~ s,/[^/]*$,,;
> +}

If you do this upfront and then lose the connection during the
real fetch, the next fetch may take a lot longer than necessary,
because it can no longer rely on the refs you are deleting here.
Ref removal is a rather rare event, so we may not care too much
about it, though.
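
One way to avoid that is to fetch first and delete stale refs only
after the fetch has succeeded. The sketch below shows the ordering
only and is not a replacement for the patch: the function names,
the temporary files, and the 'refs/*:refs/*' refspec are
assumptions made for illustration.

```shell
# Print refs listed in file $1 (local) that are absent from file $2 (remote).
stale_refs() {
    grep -F -x -v -f "$2" "$1"
}

# Hypothetical driver: fetch everything, then drop refs the remote lost.
mirror_sync() {
    remote=${1:-origin}
    tmp=$(mktemp -d) || return 1
    git for-each-ref --format='%(refname)' >"$tmp/local"
    # ls-remote prints "<sha1><TAB><ref>"; skip peeled tag entries.
    git ls-remote "$remote" | cut -f2 | grep -v '\^{}$' >"$tmp/remote"
    # Fetch while the old refs still exist, so an interrupted run
    # leaves them available to the next attempt.
    if ! git fetch --force --update-head-ok "$remote" 'refs/*:refs/*'; then
        rm -rf "$tmp"
        return 1
    fi
    # Only now remove what the remote no longer has.
    stale_refs "$tmp/local" "$tmp/remote" | while read -r ref; do
        echo "removing $ref"
        git update-ref -d "$ref"
    done
    rm -rf "$tmp"
}
```

Using "git update-ref -d" instead of unlink() is a deliberate
swap in this sketch; it leaves the on-disk ref layout to git.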

> +# Execute the fetch for any refs which differ from our own.
> +# We don't worry about trying to optimize for rewinds or
> +# exact branch copies as they are rather uncommon.

If we needed to support only git-native protocols, none of this
optimization would be needed at all.  It's kind of sad that we
need to support commit walkers...

> +if (@to_fetch) {
> +	git_cmd_try {
> +		$repo->command_noisy('fetch',
> +			'--force',
> +			'--update-head-ok',
> +			$remote, sort @to_fetch);
> +	} '%s failed w/ code %d';

Why sort (no objection, just curious)?

> +# Repack if we have a large number of loose objects.
> +#
> +if (@to_fetch) {
> +	my $count_output = $repo->command('count-objects');
> +	my ($cur_loose) = ($count_output =~ /^(\d+) objects/);
> +	my $max_loose = $repo->config('mirror.maxlooseobjects') || 100;
> +	if ($cur_loose >= $max_loose) {
> +		git_cmd_try {
> +			$repo->command_noisy('repack', '-a', '-d');
> +			$repo->command_noisy('prune');
> +		} '%s failed w/ code %d';
> +	}
> +}

If we truly have a large number of objects (in pack and loose),
you do not want to do "repack -a -d", do you?
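
A cheaper policy would be to pack only the loose objects
("repack -d" without -a) most of the time, and do the full
"repack -a -d" only when too many packs have piled up. A rough
sketch of that decision follows; the helper name choose_repack
and the pack threshold of 20 are made up for illustration, while
100 is the default the patch uses for mirror.maxlooseobjects.

```shell
# Hypothetical helper: print the repack command to run, or nothing.
#   $1 loose object count   $2 pack count
#   $3 loose threshold (default 100)   $4 pack threshold (default 20)
choose_repack() {
    loose=$1
    packs=$2
    max_loose=${3:-100}
    max_packs=${4:-20}
    if [ "$packs" -ge "$max_packs" ]; then
        # Many small packs: consolidate everything into one pack.
        echo "repack -a -d"
    elif [ "$loose" -ge "$max_loose" ]; then
        # Incremental: pack only the loose objects into a new pack.
        echo "repack -d"
    fi
}
```

The two counts could be read from the "count:" and "packs:" lines
of "git count-objects -v".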
