[RFC] Perl script git-squash-merge to combine merge commits

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



I have written a small Perl script, included below under the same terms as Perl itself, that combines a group of stepwise 2-input merge commits into a single N-input merge commit, with the long explanation in the included POD.

-- Jacob

--- 8< ---

#!/usr/bin/perl
# -*- CPerl -*-

use strict;
use warnings;

use File::Spec;
use Getopt::Long;
use Pod::Usage;

use IPC::Open2;

my $Help = 0;
my $Verbose = 0;

GetOptions('help'	=> \$Help,
	   'verbose|v'	=> \$Verbose) or pod2usage(2);
pod2usage(1) if $Help;

my $GIT_DIR = `git rev-parse --git-dir`; chomp $GIT_DIR;
die "could not find Git directory" unless $GIT_DIR;

# read a commit; return hashref with keys:
#  id		-- Git commit id
#  tree		-- tree object id for this commit
#  parents	-- array of parents of this commit
#  author	-- commit author line (name, timestamp)
#  committer	-- commit committer line (name, timestamp)
#  files	-- filenames mentioned in commit diff
sub read_commit ($) {
 my $commit_id = shift;

 my $commit = {};

 open my $git_show, '-|', qw/git show --format=raw/, $commit_id
   or die "git show $commit_id: $!";
 while (<$git_show>) {
   chomp;
   my @fields = split /[[:space:]]+/, $_;
   if (@fields) {
     if ($fields[0] eq 'parent') {
	push @{$commit->{parents}}, $fields[1];
     } elsif ($fields[0] eq 'diff' && $fields[1] eq '--cc') {
	push @{$commit->{files}}, $fields[2];
     } elsif ($fields[0] eq 'tree') {
	die "extra tree in commit $commit_id" if $commit->{tree};
	$commit->{tree} = $fields[1];
     } elsif ($fields[0] eq 'commit') {
	die "extra commit id in commit $commit_id" if $commit->{id};
	$commit->{id} = $fields[1];
     }
   }
 }
 close $git_show;

 return $commit
}

# commits indexed by id
my %Commits = ();

sub intern_commit ($) {
 my $id = shift;
 return $Commits{$id} if $Commits{$id};
 my $commit = read_commit $id;
 $Commits{$id} = $Commits{$commit->{id}} = $commit;
 return $commit;
}

# branch names indexed by id
my %Branches = ();
# [id, branch name] pairs
my @Branches = ();

#
# Read available branches
#
{
 open my $git_refs, '-|', qw/git show-ref --heads/;
 while (<$git_refs>) {
   chomp;
   if (m[^([[:xdigit:]]{40,})[[:space:]]+refs/heads/(.*)$])
     { push @Branches, [$1, $2] }
   else { die "could not parse ref line: $_" }
 }
 close $git_refs;

 foreach my $cell (@Branches)
   { $Branches{$cell->[0]} = $cell->[1]
	unless $cell->[1] =~ m/^local/ && $Branches{$cell->[0]} }
}

# tip of merge tree (set once at runtime)
my $STARTING_COMMIT = intern_commit('HEAD');

# tree object id of merged tree (set once at runtime)
my $FINAL_TREE = $STARTING_COMMIT->{tree};

# array of branches involved in merge tree
my @Merged_Branches = ();

# hash of conflicts resolved in merge tree; file name => count
my %Conflicts_Resolved = ();

#
# Read needed commits
#
{
 # array of commit ids to read
 my @Queue = ($STARTING_COMMIT->{id});

 while (@Queue) {
   my $id = shift @Queue;
   my $commit = intern_commit $id;

   if ($Verbose) {
     print "read commit $commit->{id}";
   }

   if (scalar @{$commit->{parents}} == 2) {
     # commit is a simple 2-input merge
     if ($Verbose) {
	print "; merge; resolving conflicts:\n";
	print "    $_\n" for @{$commit->{files}};
     }
     unshift @Merged_Branches, grep $Branches{$_}, @{$commit->{parents}};
     push @Queue, grep !$Branches{$_}, @{$commit->{parents}};
   } else {
     # commit was an input to the merge tree, but not a branch tip
     if ($Verbose) {
	print "; surprise merge input\n";
     }
     unshift @Merged_Branches, $commit->{id};
   }

   $Conflicts_Resolved{$_}++ for @{$commit->{files}};
 }
}

#
# Prepare new commit message
#
my $Commit_Message_File = File::Spec->catfile($GIT_DIR, 'COMMIT_EDITMSG');
open EDITMSG, '>', $Commit_Message_File
 or die "cannot open $Commit_Message_File for write: $!";

print EDITMSG 'Merge ',(scalar @Merged_Branches), " branches\n\n";
for my $tip_id (@Merged_Branches) {
 print EDITMSG '* ', $Branches{$tip_id}, ":\n",
   '   merged at commit ', $tip_id, "\n";
}

print EDITMSG <<'END';

# Please edit the commit message for the combined merge commit.  Lines
# starting with '#' will be ignored, along with initial or trailing blank
# lines.  Lines containing only whitespace will become empty.  An empty
# message aborts the commit.
END

close EDITMSG;

#
# Give the user a change to edit the message or abort
#
{
 my $EDITOR = `git var GIT_EDITOR`; chomp $EDITOR;

 system "$EDITOR $Commit_Message_File";

 open EDITMSG, '<', $Commit_Message_File
   or die "cannot open $Commit_Message_File for read: $!";
 my $linecnt = 0;
 while (<EDITMSG>)
   { next if m/^#/ || m/^[[:space:]]*$/; $linecnt++ }
 close EDITMSG;

 if ($linecnt == 0) {
   print "commit aborted due to empty message\n";
   exit 0;
 }
}

#
# Write the new commit
#

# new commit id
my $Merge_Commit_ID = undef;

{
 my $get_commit_id; my $put_commit_msg;

 open my $commit_msg, '<', $Commit_Message_File
   or die "cannot open $Commit_Message_File for read: $!";
 my $collapse_blanks = 1; my $prev_blank = '';
 my $pid =
   open2($get_commit_id, $put_commit_msg,
	  qw/git commit-tree/, $FINAL_TREE, map {+'-p' => $_} @Merged_Branches);
 while (<$commit_msg>) {
   next if m/^#/;
   next if $collapse_blanks && m/^[[:space:]]*$/;
   if ($prev_blank) { print $put_commit_msg $prev_blank; $prev_blank = '' }
   $prev_blank = "\n" if s/^[[:space:]]*$//;
   print $put_commit_msg $_;
   $collapse_blanks = m/^[[:space:]]*$/;
 }
 close $commit_msg;
 close $put_commit_msg;

 $Merge_Commit_ID = <$get_commit_id>; chomp $Merge_Commit_ID;
 die "could not get merge commit ID" unless $Merge_Commit_ID;
}

#
# Fail if Git reports conflicts in the combined merge that had not been
# resolved in the previous merges.
#
{
 my $merge_commit = read_commit $Merge_Commit_ID;
 my @new_conflicts = ();

 foreach my $file (@{$merge_commit->{files}}) {
   push @new_conflicts, $file unless $Conflicts_Resolved{$file};
 }

 if (@new_conflicts) {
   print <<'END';
ERROR:  False conflict introduced in combined merge!

The following files were not listed as conflicts in any input merge commit,
but Git reports a conflict resolved in the combined merge:
END

   print '    ', $_, "\n" for sort @new_conflicts;

   print "\ncombined merge failed with commit $Merge_Commit_ID\n";
 } else {
   # The combined merge looks good.  Save it.
   system {'git'} qw/git update-ref HEAD/, $Merge_Commit_ID;
 }
}

__END__

=head1 NAME

   git-squash-merge - combine merges into one merge commit

=head1 SYNOPSIS

   git-squash-merge [options]

=head1 DESCRIPTION

Rewrites history to collect a combined merge after a series of manual
merges.  This command does B<not> produce a new tree or change the contents
of the tree in any way; it only creates an octopus merge commmit to carry
the results of a group of 2-input merges.

Assume that development has diverged in several directions, all of which
merge readily, but administrivia such as ChangeLog entries or Emacs
LocalWords lists in documentation present merge conflicts and prevent a
direct octopus merge.

    B---D---F topic1
   /
  +---C---E topic2
 /
A master
 \
  +---H---K topic3
   \
    G---I---J topic4

All of these branches should be merged into master, but a direct octopus
merge fails because they all have different ChangeLog updates.  So they
must be merged one at a time, resolving the conflicts manually along the
way.

    B---D----F topic1
   /	       \
  +---C---E    \ topic2
 /	    \    \
A           L---MN master
 \	    /   /
  +---H---K   / topic3
   \	      /
    G---I---J topic4

Now master has a correct merged tree (and testing has validated that the
merged result is good) but history is a bit of a mess.  This tool replaces
the merge commits LMN with single octopus merge O using the tree object
from commit N.

    B---D---F topic1
   /	      \
  +---C---E---+ topic2
 /		\
A               O master
 \		/
  +---H---K---+ topic3
   \	      /
    G---I---J topic4

Because the tree from commit N is reused for commit O, commit O contains
the end product of the manual stepwise merge reconciliation, but is a
single commit with all four branches as its parents.

=head1 OPTIONS

=over

=item B<-v>, B<--verbose>

Emit messages as the merge tree is traced.

=back

=head1 SEE ALSO

git-merge(1)

git-commit-tree(1), git-rev-parse(1), git-show(1),
git-show-ref(1), git-update-ref(1), git-var(1)

=head1 AUTHOR

Written by Jacob Bachmeyer <jcb62281+dev@xxxxxxxxx>

=head1 COPYRIGHT AND LICENSE

Copyright (C) 2020 by Jacob Bachmeyer

This tool is free software; you can redistribute it and/or modify
it under the same terms as Perl itself.





[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux