[git subtree] documentation and new --squash mode

Hi all,

I've finally had some time to write an asciidoc man page for git
subtree, which is included below for your convenience.

I added a new "--squash" option to the subtree merge command, which
lets you do cool stuff like switch your subtree branch back and forth
between upstream versions without actually losing any of your local
customizations.  It also avoids the problem of polluting your
project's history: it pulls in only individual subproject trees (by
creating a synthetic commit for the desired tree, with its parent
being the *previous* synthetic commit).  This saves repository size
(you don't slurp in all the intermediate trees), plus you don't get
all the commit messages from the subproject, which are probably
irrelevant to you anyway.

As always, comments welcome.  Quite a few people have emailed me about
git subtree lately; I think the --squash mode will probably resolve
the problems for almost everyone who had them.  Assuming it works
correctly, which is of course 100% guaranteed or your money back.

You can always find the latest git-subtree at:
git://github.com/apenwarr/git-subtree

Docs below.  Check out the EXAMPLES section for a quick start.

Have fun,

Avery


git-subtree(1)
==============

NAME
----
git-subtree - add, merge, and split subprojects stored in subtrees


SYNOPSIS
--------
[verse]
'git subtree' add   --prefix=<prefix> <commit>
'git subtree' merge --prefix=<prefix> <commit>
'git subtree' pull  --prefix=<prefix> <repository> <refspec...>
'git subtree' split --prefix=<prefix> <commit...>


DESCRIPTION
-----------
git subtree allows you to include a subproject in your
own repository as a subdirectory, optionally including the
subproject's entire history. For example, you could
include the source code for a library as a subdirectory of your
application.

You can also extract the entire history of a subdirectory from
your project and make it into a standalone project.  For
example, if a library you made for one application ends up being
useful elsewhere, you can extract its entire history and publish
that as its own git repository, without accidentally
intermingling the history of your application project.

Most importantly, you can alternate back and forth between these
two operations.  If the standalone library gets updated, you can
automatically merge the changes into your project; if you
update the library inside your project, you can "split" the
changes back out again and merge them back into the library
project.

Unlike the 'git submodule' command, git subtree doesn't produce
any special constructions (like .gitmodules files or gitlinks) in
your repository, and doesn't require end-users of your
repository to do anything special or to understand how subtrees
work.  A subtree is just another subdirectory and can be
committed to, branched, and merged along with your project in
any way you want.

In order to keep your commit messages clean, we recommend that
people split their commits between the subtrees and the main
project as much as possible.  That is, if you make a change that
affects both the library and the main application, commit it in
two pieces.  That way, when you split the library commits out
later, their descriptions will still make sense.  But if this
isn't important to you, it's not *necessary*.  git subtree will
simply leave out the non-library-related parts of the commit
when it splits it out into the subproject later.
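
As a rough illustration of the "commit it in two pieces" advice above,
here is a minimal sketch using only plain git in a throwaway
repository.  The file names (lib/util.c, app.c) and the commit
messages are made up for the example.

```shell
# Hypothetical layout: lib/ holds the library, app.c is the application.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

demo=$(mktemp -d)
git init -q "$demo"
cd "$demo"
mkdir lib
echo 'v1' >lib/util.c
echo 'v1' >app.c
git add .
git commit -q -m 'initial import'

# One logical change that touches both the library and the application...
echo 'v2' >lib/util.c
echo 'v2' >app.c

# ...committed in two pieces, so the library commit message still makes
# sense once it is split out on its own.
git add lib
git commit -q -m 'lib: add helper'
git add app.c
git commit -q -m 'app: use the new lib helper'

# Count the commits that touched lib/ and the commits overall.
lib_commits=$(git rev-list --count HEAD -- lib)
all_commits=$(git rev-list --count HEAD)
echo "commits touching lib/: $lib_commits of $all_commits"
```

Only the commits counted in lib_commits would show up in a later
split of lib/, each with a message that describes a library change.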


COMMANDS
--------
add::
	Create the <prefix> subtree by importing its contents
	from the given commit.  A new commit is created
	automatically, joining the imported project's history
	with your own.  With '--squash', imports only a single
	commit from the subproject, rather than its entire
	history.

merge::
	Merge recent changes up to <commit> into the <prefix>
	subtree.  As with normal 'git merge', this doesn't
	remove your own local changes; it just merges those
	changes into the latest <commit>.  With '--squash',
	creates only one commit that contains all the changes,
	rather than merging in the entire history.

	If you use '--squash', the merge direction doesn't
	always have to be forward; you can use this command to
	go back in time from v2.5 to v2.4, for example.  If your
	merge introduces a conflict, you can resolve it in the
	usual ways.
	
pull::
	Exactly like 'merge', but parallels 'git pull' in that
	it fetches the given commit from the specified remote
	repository.

split::
	Extract a new, synthetic project history from the
	history of the <prefix> subtree.  The new history
	includes only the commits (including merges) that
	affected <prefix>, and each of those commits now has the
	contents of <prefix> at the root of the project instead
	of in a subdirectory.  Thus, the newly created history
	is suitable for export as a separate git repository.
	
	After splitting successfully, a single commit id is
	printed to stdout.  This corresponds to the HEAD of the
	newly created tree, which you can manipulate however you
	want.
	
	Repeated splits of exactly the same history are
	guaranteed to be identical (i.e. to produce the same
	commit ids).  Because of this, if you add new commits
	and then re-split, the new commits will be attached as
	commits on top of the history you generated last time,
	so 'git merge' and friends will work as expected.
	
	Note that if you use '--squash' when you merge, you
	should usually not use '--rejoin' when you split.
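
The following sketch exercises 'add' and 'split' in a sandbox and
checks the repeatability described above.  It assumes git-subtree is
installed; the repository names, the lib prefix, and the file contents
are all hypothetical.

```shell
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

# A tiny stand-in for the upstream library.
git init -q "$work/lib"
cd "$work/lib"
echo 'hello' >hello.txt
git add hello.txt
git commit -q -m 'lib: initial commit'

# The application that will carry the library under lib/.
git init -q "$work/app"
cd "$work/app"
echo 'app' >main.txt
git add main.txt
git commit -q -m 'app: initial commit'

# 'add' takes a commit, so fetch the library history first.
git fetch -q "$work/lib" HEAD
git subtree add -q --prefix=lib FETCH_HEAD

# Change the library from inside the application.
echo 'local tweak' >>lib/hello.txt
git commit -q -a -m 'lib: local tweak'

# Splitting the same history twice should print the same commit id.
first=$(git subtree split -q --prefix=lib)
second=$(git subtree split -q --prefix=lib)

# After another commit, the new split should sit on top of the old one.
echo 'second tweak' >>lib/hello.txt
git commit -q -a -m 'lib: second tweak'
third=$(git subtree split -q --prefix=lib)
third_parent=$(git rev-parse "$third^")

echo "first=$first second=$second third_parent=$third_parent"
```

If the three ids printed at the end match, a branch created from an
earlier split can be updated from a later split like any other branch.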


OPTIONS
-------
-q::
--quiet::
	Suppress unnecessary output messages on stderr.

-d::
--debug::
	Produce even more unnecessary output messages on stderr.

--prefix=<prefix>::
	Specify the path in the repository to the subtree you
	want to manipulate.  This option is currently mandatory
	for all commands.


OPTIONS FOR add, merge, AND pull
--------------------------------
--squash::
	Instead of merging the entire history from the subtree
	project, produce only a single commit that contains all
	the differences you want to merge, and then merge that
	new commit into your project.

	Using this option helps to reduce log clutter. People
	rarely want to see every change that happened between
	v1.0 and v1.1 of the library they're using, since none of the
	interim versions were ever included in their application.
	
	Using '--squash' also helps avoid problems when the same
	subproject is included multiple times in the same
	project, or is removed and then re-added.  In such a
	case, it doesn't make sense to combine the histories
	anyway, since it's unclear which part of the history
	belongs to which subtree.
	
	Furthermore, with '--squash', you can switch back and
	forth between different versions of a subtree, rather
	than strictly forward.  'git subtree merge --squash'
	always adjusts the subtree to exactly match the
	specified commit, even if getting to that commit would
	require undoing some changes that were added earlier.
	
	Whether or not you use '--squash', changes made in your
	local repository remain intact and can later be split
	and sent upstream to the subproject.
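
To try the version switching described above without touching a real
project, something like the following sandbox may help.  It assumes
git-subtree is installed; the v1 and v2 tags, the lib prefix, and the
file names are invented for the example.

```shell
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
export GIT_MERGE_AUTOEDIT=no	# never open an editor for merge commits

work=$(mktemp -d)

# Stand-in upstream library with two tagged versions.
git init -q "$work/lib"
cd "$work/lib"
echo one >version.txt
git add version.txt
git commit -q -m 'lib: version one'
git tag v1
echo two >version.txt
git commit -q -a -m 'lib: version two'
git tag v2

# Application that carries the library under lib/.
git init -q "$work/app"
cd "$work/app"
echo app >main.txt
git add main.txt
git commit -q -m 'app: initial commit'
git fetch -q "$work/lib" 'refs/tags/*:refs/tags/*'

# Import v1 as a single synthetic commit, then move forward to v2.
git subtree add -q --prefix=lib --squash v1
git subtree merge -q --prefix=lib --squash v2 >/dev/null
at_v2=$(cat lib/version.txt)

# With --squash, the same command can also go back to v1.
git subtree merge -q --prefix=lib --squash v1 >/dev/null
back_at_v1=$(cat lib/version.txt)

# The upstream commits themselves never become part of our history.
if git merge-base --is-ancestor v2 HEAD; then
	upstream_in_history=yes
else
	upstream_in_history=no
fi
echo "$at_v2 $back_at_v1 $upstream_in_history"
```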


OPTIONS FOR split
-----------------
--annotate=<annotation>::
	When generating synthetic history, add <annotation> as a
	prefix to each commit message.  Since we're creating new
	commits with the same commit message, but possibly
	different content, from the original commits, this can help
	to differentiate them and avoid confusion.
	
	Whenever you split, you need to use the same
	<annotation>, or else you don't have a guarantee that
	the new re-created history will be identical to the old
	one.  That will prevent merging from working correctly.
	git subtree tries to make it work anyway, particularly
	if you use --rejoin, but it may not always be effective.

-b <branch>::
--branch=<branch>::
	After generating the synthetic history, create a new
	branch called <branch> that contains the new history.
	This is suitable for immediate pushing upstream.
	<branch> must not already exist.

--ignore-joins::
	If you use '--rejoin', git subtree attempts to optimize
	its history reconstruction to generate only the new
	commits since the last '--rejoin'.  '--ignore-joins'
	disables this behaviour, forcing it to regenerate the
	entire history.  In a large project, this can take a
	long time.

--onto=<onto>::
	If your subtree was originally imported using something
	other than git subtree, its history may not match what
	git subtree is expecting.  In that case, you can specify
	the commit id <onto> that corresponds to the first
	revision of the subproject's history that was imported
	into your project, and git subtree will attempt to build
	its history from there.
	
	If you used 'git subtree add', you should never need
	this option.

--rejoin::
	After splitting, merge the newly created synthetic
	history back into your main project.  That way, future
	splits can search only the part of history that has
	been added since the most recent --rejoin.
	
	If your split commits end up merged into the upstream
	subproject, and then you want to get the latest upstream
	version, this will allow git's merge algorithm to more
	intelligently avoid conflicts (since it knows these
	synthetic commits are already part of the upstream
	repository).
	
	Unfortunately, using this option results in 'git log'
	showing an extra copy of every new commit that was
	created (the original, and the synthetic one).
	
	If you do all your merges with '--squash', don't use
	'--rejoin' when you split, because you don't want the
	subproject's history to be part of your project anyway.
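
As a small sketch of how '--annotate' and '--branch' combine, the
sandbox below splits a directory that was created directly inside the
application.  It assumes git-subtree is installed; the branch name
lib-only, the lib prefix, and the commit messages are hypothetical.

```shell
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
git init -q "$work/app"
cd "$work/app"

echo app >main.txt
git add main.txt
git commit -q -m 'app: initial commit'

mkdir lib
echo util >lib/util.txt
git add lib
git commit -q -m 'lib: add util'

echo more >>main.txt
git commit -q -a -m 'app: unrelated change'

# Generate the synthetic history, prefix each message, and store the
# result in a new branch.  The commit id is still printed on stdout.
split_head=$(git subtree split -q --prefix=lib \
	--annotate='(split) ' --branch=lib-only)

branch_head=$(git rev-parse lib-only)
subject=$(git log -1 --format=%s lib-only)
files=$(git ls-tree --name-only lib-only)
echo "$subject / $files"
```

The new branch has the contents of lib/ at its root and contains only
the commits that touched lib/, so the unrelated application commits do
not appear in it.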


EXAMPLES
--------
Let's use the repository for the git source code as an example.
First, get your own copy of the git.git repository:

	$ git clone git://git.kernel.org/pub/scm/git/git.git test-git
	$ cd test-git

gitweb (commit 1130ef3) was merged into git as of commit
0a8f4f0, after which it was no longer maintained separately.
But imagine it had been maintained separately, and we wanted to
extract git's changes to gitweb since that time, to share with
the upstream.  You could do this:

	$ git subtree split --prefix=gitweb --annotate='(split) ' \
		0a8f4f0^.. --onto=1130ef3 --rejoin \
		--branch gitweb-latest
	$ gitk gitweb-latest
	$ git push git@xxxxxxxxxx:whatever/gitweb gitweb-latest:master

(We use '0a8f4f0^..' because that means "all the changes from
0a8f4f0 to the current version, including 0a8f4f0 itself.")
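
If the 'X^..' notation is unfamiliar, this plain-git sketch shows the
difference between 'X^..' and 'X..' on a throwaway repository.  The
tags c1 to c4 are made up for the example and stand in for real commit
ids.

```shell
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

demo=$(mktemp -d)
git init -q "$demo"
cd "$demo"

# Four linear commits, tagged c1..c4.
for n in 1 2 3 4; do
	echo "$n" >file
	git add file
	git commit -q -m "commit $n"
	git tag "c$n"
done

# 'c2^..' is short for 'c2^..HEAD': everything after c2's parent,
# which includes c2 itself.
with_c2=$(git rev-list --count 'c2^..')

# 'c2..' is short for 'c2..HEAD': everything after c2, excluding c2.
without_c2=$(git rev-list --count 'c2..')

echo "c2^.. selects $with_c2 commits, c2.. selects $without_c2"
```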

If gitweb had originally been merged using 'git subtree add' (or
a previous split had already been done with --rejoin specified)
then you can do all your splits without having to remember any
weird commit ids:

	$ git subtree split --prefix=gitweb --annotate='(split) ' --rejoin \
		--branch gitweb-latest2

And you can merge changes back in from the upstream project just
as easily:

	$ git subtree pull --prefix=gitweb \
		git@xxxxxxxxxx:whatever/gitweb master

Or, using '--squash', you can actually rewind to an earlier
version of gitweb:

	$ git subtree merge --prefix=gitweb --squash gitweb-latest~10

Then make some changes:

	$ date >gitweb/myfile
	$ git add gitweb/myfile
	$ git commit -m 'created myfile'

And fast forward again:

	$ git subtree merge --prefix=gitweb --squash gitweb-latest

And notice that your change is still intact:
	
	$ ls -l gitweb/myfile

And you can split it out and look at your changes versus
the standard gitweb:

	$ git log gitweb-latest..$(git subtree split --prefix=gitweb)


AUTHOR
------
Written by Avery Pennarun <apenwarr@xxxxxxxxx>


GIT
---
Part of the linkgit:git[1] suite