On Thu, Sep 11, 2008 at 05:39:45PM +0200, Thomas Rast wrote: > This attempts to make a manpage about workflows that is both handy to > point people at it and as a beginner's introduction. Thank you for your attempt. It is clearly a missing part of the Git documentation. I have a few comments to it below. > +SEPARATE CHANGES > +---------------- > + > +As a general rule, you should try to split your changes into small > +logical steps, and commit each of them. They should be consistent, > +working independently of any later commits, pass the test suite, etc. I would rather add some explanation why it is a good idea. Something like this: "This makes the review process much easier, as well as, makes git bisect much more useful in finding the cause of regressions." > + > +To achieve this, try to commit your new work at least every couple > +hours. You can always go back and edit the commits with `git rebase > +--interactive` to further improve the history before you publish it. I like the idea of this paragraph but not its wording. Maybe this will be better (just a variant): "To achieve this, try to split your work in small steps from the very beginning. It is always easier to squash a few commits together than splitting one big commit into a few. Don't be afraid making steps too small or that they are not perfect yet. You can always go back later and edit the commits with `git rebase --interactive` before you publish it." > + > +MANAGING BRANCHES > +----------------- > + > +In the following, we will assume there are 'developers', 'testers' and > +'users'. Even if the "Testers" are actually an automated test suite > +and all "Users" are developers themselves, try to think in these terms > +as you follow a software change through its life cycle. > + > +Usually a change evolves in a few steps: > + > +* The developers implement a few iterations until it "seems to work". > + > +* The testers play with it, report bugs, test the fixes, eventually > + clearing the change for stable releases. Perhaps, the above two points are the most controversial in my opinion. First, I would expect developers to implement a few iterations until it (their work) passes the automated test suite and peer review. Only then their work is merged into 'next' (or into a similar branch, which constitute that this series is published now). Second, I am not sure what you meant by testers clears changes for stable releases, especially after you stated "Testers" may be an automated test suite. Whether some change is included is always a conscious decision of the project maintainer. The fact that some change has passed all tests successfully only clears it for including into 'next'. > + > +* As the users work with the new feature, they report bugs which will > + have to be fixed. > + > +In the following sections we discuss some problems that arise from > +such a "change flow", and how to solve them with Git. > + > +We consider a fictional project with (supported) stable branch > +'maint', main testing/development branch 'master' and "bleeding edge" > +branch 'next'. We collectively call these three branches 'main > +branches'. The idea of 'next' is not obvious from your above explanation. When I started to learn how Git workflow works, I read something like above and was very puzzled what is the purpose of having two development branches: 'master' and 'next'. Only later I realized that it is necessary to give flexibility in making decisions of what should be included in the next stable release and what may need more "cooking" to prove their reliability and usefulness. > + > + > +Merging upwards > +~~~~~~~~~~~~~~~ > + > +Since Git is quite good at merges, one should try to use them to > +propagate changes. For example, if a bug is fixed, you would want to > +apply the corresponding fix to all main branches. The first and second sentences are a bit disconnected here. I would rather write the second one like this: "An example of such a change can be a bug fix, which should be applied to all main branches." Another thing is that I am not sure that the provided reason for doing so ("Git is quite good at merges") is good enough. It can be said that Git is quite good at cherry-picking too. Yet, we use merge, because it allows to deal with large number of patches easier. Merge can be easily visualized and understood as every merge point means that all changes before it are included. However, to being able use merge, the developer has to start from the oldest branch that will include this change. This is a clear restriction over the anarchic nature of cherry-picking (where you can introduce a change to an arbitrary branch and then cherry-pick to others), but it pays off in the long run by better maintainability of the project. Thus the recommended practice is a strong preference to use merge over cherry-picking. It does not mean that cherry-picking should be completely excluded. Occasionally, it may be useful. > + > +A quick moment of thought reveals that you cannot do this by merging > +"downwards" to older releases, since that would merge 'all' changes. IMHO, expressions such as "a quick moment of thought reveals..." is more suitable for blogs than for serious documentation. > +Hence the following: > + > +.Merge upwards > +[caption="Rule: "] > +===================================== > +Always commit your fixes to the oldest supported branch that require > +them. Then (periodically) merge the main branches upwards into each > +other. > +===================================== Perhaps, it is worth to note here that a non-trivial fixes can be implemented as topic branches, which starts from the oldest branch that needs them. > + > +This gives a very controlled flow of fixes. If you notice that you > +have applied a fix to e.g. 'master' that is also required in 'maint', > +you will need to cherry-pick it (using linkgit:git-cherry-pick[1]) > +downwards. This will happen a few times and is nothing to worry about > +unless you do it all the time. > + > + > +Topic branches > +~~~~~~~~~~~~~~ > + > +Any nontrivial feature will require several patches to implement, and > +may get extra bugfixes or improvements during its lifetime. If all > +such commits were in one long linear history chain (e.g. if they were > +all committed directly to, 'master'), it becomes very hard to see how > +they belong together. There is a far more important reason to use topic branches than ecstatic pleasure from being able to see related changes grouped together in the history. The main reason to use topic branches is to facilitate parallel development. Though the idea that anyone commit to the main development branch ('master') is very appealing due to its simplicity, it leads to problems down the road. Namely, not all good sounding ideas turns out good in reality. In the workflow where everyone commits to 'master' there are only two ways to deal with that. The first approach is not let developers to commit their changes until they have completely finished their work and passed all tests and code-review, and their work deemed important enough to be included in the next feature release. The second approach is to commit their work in progress in the hope that it will succeed, and if not then to rollback changes. Neither of these two approaches is satisfactory, especially for large projects. The first approach means that developers are under a great stress due to inability to save their work in progress, they accumulate a huge patch, which is very difficult to review, often include some other changes unrelated to the stated goal, and makes the history of the project nearly useless for bisecting (linkgit:git-bisect[1]) when it comes to finding a regression. The second approach means that the project history gets contaminating with a great number of changes that eventually didn't work out. Moreover, reverting changes that are belong to some failed work may extremely difficult as other changes intervene with them. So, this reverting is hardly ever done completely in practice if it is done at all, which leads to a lot of garbage in the source code. Obviously, the history of this project is completely useless for bisecting as many commits do not really work if they are compiled at all. Also, this approach leads to an extremely long stabilization period as it is determined by the time when slowest going work will be in good shape for release. Using topic branches immune to that problem as feature are included into 'master' when they are ready. Moreover, feature branches unless they are "publish" can go through cycles of testing, review, and interactive rebasing to edit and improve individual commits. Thus the finally published history is clean and easy to bisect. > + > +Roughly speaking, there are two important workflows. I think it would make sense to name them here. > Their > +distinguishing mark is whether they can be used to propagate merges. Perhaps, it would be better to say: "They are distinguished by the ability to propagate merges." However, this is not the only distinguish between them. Besides, I am not sure how this one is connected with the rest of the paragraph: > +Medium to large projects will typically employ some mixture of the > +two: > + > +* "Upstream" in the most general sense 'pushes' changes to the > + repositor(ies) holding the main history. IMHO, it would be better: s/the main history/the official history of the project/ > Everyone can 'pull' from there to stay up to date. Would that entrench the wrong idea that one needs to do 'pull' habitually? And the habitual 'pull' results in habitual 'merge'. > + > +* Frequent contributors, subsystem maintainers, etc. may use push/pull > + to send their changes upstream. This is nitpicking, but you cannot use 'pull' to send changes. However, I suppose you meant to make your repository available for other people to pull from it. > + > +* The rest -- typically anyone more than one or two levels away from the > + main maintainer -- send patches by mail. After reading "mixture of the two:" above, I expected these two being named, but instead I can see three points. So, it is confusing. > +If the maintainer tells you that your patch no longer applies to the > +current upstream, you will have to rebase your topic (you cannot use a > +merge because you cannot format-patch merges): > + > +.format-patch/am: Keeping topics up to date > +[caption="Recipe: "] > +===================================== > +`git rebase upstream` > +===================================== Maybe, git pull --rebase is better advice here as it will also fetch the latest changes from the upstream. > + > +You can then fix the conflicts during the rebase. Presumably you have > +not published your topic other than by mail, so rebasing it is not a > +problem. > + > +If you receive such a patch (as maintainer, or perhaps reader of the > +mailing list it was sent to), save the mail to a file and use > +'git-am': > + > +.format-patch/am: Publishing branches/topics > +[caption="Recipe: "] > +===================================== > +`git am < patch` > +===================================== > + > +One feature worth pointing out is the three-way merge, which can help > +if you get conflicts because of renames: Could it not be any other reason besides renames? Maybe it is better to drop "because of renames" here. > `git am -3` will use index > +information contained in patches Because the word "index" is often used in Git in the different meaning (a.k.a cache), I would re-write this sentence to avoid confusion as: "`git am -3` will use information contained in index lines of patches" > to reconstruct a merge base. See If I did not know how git am -3 works, reading this would make me think that git am somehow manage to figure out a common ancestor (commit), while it uses index lines of the patch to learn the identity of the blob that was used as the starting point to create the patch, and if this blob is available locally, git am -3 performs 3-way merge. So, "reconstruct a merge base" is hardly appropriate here. Dmitry -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html