Thanks for the suggestions so far. I've updated the notes.

@Peff: Thanks especially for pointing me towards Junio's presentation. That's an excellent source.

Here's the patch for your suggestions:

diff --git a/scmOutline.txt b/scmOutline.txt
index 1791fa0..d25198c 100644
--- a/scmOutline.txt
+++ b/scmOutline.txt
@@ -1,4 +1,4 @@
-SCM: Distributed, Centralized, and Everything in Between.
+SCM: Centralized, Distributed, and Everything in Between.
 
 * What is SCM and Why is it Useful?
 
@@ -20,7 +20,11 @@ Not only is it unlimited, but it's random access. If you changed a function a w
 
 Many people can edit the same code base at the same time and know, without a doubt, that when they pull all those changes together, the system will merge the content intelligently or inform you of the conflict and let you merge it. You don't need to lock files. Obviously, if there is bad coordination then the possibilities of conflicts rise, but this should not happen regularly.
 
-*** Diff Debugging
+*** Software Archeology
+
+With a proper SCMS, it becomes a somewhat trivial operation to discover the author and reasons for a given change. This is because of the rich metadata associated with commits (author, date, complete change set, diffs, and commentary). So rather than wandering asking if anyone remembers doing something and why, you simply commit that information into the system and then refer to it when you need to.
+
+**** Diff Debugging
 
 You can find where a bug was introduced by learning how to reproduce the bug and then doing a binary chop search back through the History to come to the exact commit that introduced the bug.
 
@@ -30,11 +34,11 @@ You can find where a bug was introduced by learning how to reproduce the bug and
 
 The more you commit, the more fine grained control you have over the undo feature of SCM. Most documents that I have read suggested a TDD approach wherein you commit whenever you have written just enough code for your test to pass. But...
 
-** Don't Commit Broken Code (To the Public Tree)
+** Don't _Publish_ Broken Code
 
 Of primary concern is the fact that your central HEAD should _always_ build. This is why practices like Continuous Integration and TDD are so important. TDD gives you the freedom to be sure that a change you made hasn't broken anything you weren't expecting it to break. Continuous Integration allows you to be sure that your whole system will build every time. Thus, you should _never_ commit broken code to the (public) tree.
 
-Of course, in a centralized system, committing is intrinsically public. Even on branches, every time you commit any sort of change, everyone is able to see it and so you could be breaking the build for someone (even if it's just yourself and the build system). One of the nice features of a distributed system is that your public/private ontology is much richer and thus allows you to have broken code in your SCMS.
+Of course, in a centralized system, committing is intrinsically public. Even on branches, every time you commit any sort of change, everyone is able to see it and so you could be breaking the build for someone (even if it's just yourself and the build system). One of the nice features of a distributed system is that your public/private ontology is much richer and thus allows you to have broken code in your SCMS, so long as you haven't published it, at no penalty to anyone but yourself.
 
 ** Whole Hog
 
@@ -130,7 +134,9 @@ Once you've published, however, not much changes. Almost everything except upda
 
 *** Natural Backup
 
-Because every developer has a copy of the repository, every developer you add adds an extra failure point. The more developers you have, the more backups you have of the repository.
+Because every developer has a copy of the repository, every developer you add adds an extra layer of redundancy. The more developers you have, the more backups you have of the repository.
+
+An important point to make clear here is that you only are backing up what everyone is duplicating. If you have 10 unpublished branches that no one else has cloned, then those are obviously not backed up. However, the idea here would be that anything that is being developed actively by multiple people is backed up by as many developers. Other than that, your private data must be backed up by you (which is what you do anyway, right? ;).
 
 *** Must Learn New Work Flows.
 
@@ -148,6 +154,8 @@ This bears some explanation. Within a distributed system, you can have a single
 
 Git's implementation just happens to be wickedly fast. It's faster than mercurial, it's faster than bazaar, etc. Everything, committing, merging, viewing history, branching, and even updating and and pushing are all faster.
 
+This is much more important than just shaving a few seconds off the operations. Because Git is so much faster, you begin to do things differently because of how fast it is. Git's blazing fast branching and merging wouldn't matter at all if you never branched and merged (which is possible), but because their blazing fast you _should_ begin to branch and merge much more often, which __does__ fundamentally change the way you develop your code (hopefully for the better).
+
 ** Tracks Content, not Files
 
 Git tracks content, not files, and it's the only SCMS at the moment that does this. This has many effects internally, but the most apparent effect I know of is that for the first time Git can easily tell you the history of even a function in a file because Git can tell you which files that function existed (or does exist) in over the course of development.
@@ -171,9 +179,9 @@ This is very powerful yet somewhat awkward to grasp. Basically, the upshot of t
 
 I've found this to be particularly useful when working with an existing code base that was not properly formatted. Often, I'll come to a file that has a bunch of wonky white space choices and improperly indented logical constructs and I'll just quickly run through it correcting that stuff before continuing with the feature I was working on. Afterwords, I'll stage the formatting and commit it, and then stage the feature I was working on and commit that. You may not want that kind of control (and if you don't, you don't need to use it), but I like it.
 
-** Excellent Merge algorithms
+** Stupid but _Fast_ Merge Algorithms
 
-Git has excellent merge algorithms. This is widely attributed and doesn't require much explanation. It was one of Git's original design goals, and it has been proven by Git's implementation. Merging in Git is _much_ less painful than in other systems.
+Merging in Git is _much_ less painful than in other systems. This is mainly because of how fast it is and how much data it remembers when it does a merge. As opposed to CVS which can't merge a branch twice because it doesn't remember where the last merge happened, Git keeps track of that information so you can merge between branches as much as you want. Git's philosophy is to make merging as fast and painless as possible so that you merge early and often enough to not develop really bad conflicts that are nearly impossible to resolve.
 
 ** Has powerful 'maintainer tools'
 
@@ -196,3 +204,4 @@ Git guarantees absolutely that if corruption happens, you will know about it. I
 - <http://svnbook.red-bean.com/> - Rolling publish book on Subversion. Chapter 1 is a good introduction to general centralized SCM concepts and principles.
 - <http://www.perforce.com/perforce/bestpractices.html> - An excellent set of best practices from the Perforce team. Some of it (especially the branches) has a distinct centralized lean, but most of it is quite good.
 - <http://www.bobev.com/PresentationsAndPapers/Common%20SCM%20Patterns.pdf> - Interesting presentation by Pretzel Logic from 2001 attempting to outline some common SCM best practices as Patterns.
+- <http://members.cox.net/junkio/200607-ols.pdf> - A presentation by Junio Hamano (the Git maintainer) at a Linux symposium on what Git is with some tutorials.

I've also attached it as a file. It was generated by `git diff -p`.

I'm also looking for any place where I'm technically inaccurate. Unfortunately, I've written a lot of this from things that I've either read or heard. I'm mainly experienced with VSS and Subversion (and both of those only to a very small degree), and I'm making a lot of progress with Git. I've kind of been swept away by all the energy surrounding Git right now, though, so I'm sure my judgement is somewhat clouded.

Thanks again for your help!

-- 
In Christ,

Timmy V.

http://burningones.com/
http://five.sentenc.es/ - Spend less time on e-mail