On 03/10/2014 01:10 PM, Johan Herland wrote:
> It should be possible to teach Git to do similar things, and IINM
> there are (and have previously been) several attempts to do similar
> things in Git, e.g.:
>
>  - http://thread.gmane.org/gmane.comp.version-control.git/240339
>
>  - http://thread.gmane.org/gmane.comp.version-control.git/217817
>
> I haven't looked closely at these attempts (it is not my scratch to
> itch), and I don't know if/how they would work on top of Watchman, but
> in principle I don't see why Git shouldn't be able to leverage
> Watchman the same way Mercurial does.

This touches on the most important thing that we should take to heart
from this episode:

Of course Facebook could have modified either Git or Mercurial to do
what they want. Why did they pick Mercurial?

The article seems to claim that they were initially biased towards Git,
but they chose Mercurial because its code base is easier to modify.
This is a claim that I can easily believe.

The two projects are almost exactly the same age. The number of commits
in the two projects is similar. Mercurial has had fewer contributors
active at any given time over its project lifetime. But let's see how
much code is in the main part of Mercurial vs. Git:

    $ find mercurial hgext \( -name '*.c' -o -name '*.py' \) -print | xargs cat | wc -l
    46164
    $ cat *.c *.h *.sh *.perl builtin/*.c | wc -l
    188530

These are just crude estimates and I hope I got the right directories
for Mercurial. But, by these numbers, Git has 4 times as much code as
Mercurial. That alone will go a long way to making Git harder to
modify.

I don't think that Git has anywhere near 4 times the features of
Mercurial. Probably most of the difference can be explained by the
choice of implementation languages; 94% of the code in these hg
directories is Python, whereas 88% of Git's core code is C.

How can we make Git easier to hack (short of switching languages)?
Here are my suggestions:

* Better function docstrings -- don't make developers have to read the
  whole call stack to find out what a function does, or who owns the
  memory that is passed around.

* More modularity -- more coherent and abstract APIs between different
  parts of the system, and less pawing around in your neighbor's data
  structures.

* Higher-level abstractions -- make more use of APIs like strbuf and
  string_list as opposed to handling every malloc() and realloc() by
  hand.

I personally wish that we as a project would be more willing to spend a
few extra CPU microseconds to make our code easier to read and modify
and more robust.

Michael

-- 
Michael Haggerty
mhagger@xxxxxxxxxxxx
http://softwareswirl.blogspot.com/
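
P.S. To make the first and third suggestions above a little more concrete, here is a rough sketch of the kind of small, documented abstraction that keeps realloc() out of the callers. Everything in it (struct sbuf, sbuf_addstr(), join_path(), and so on) is a hypothetical name made up for illustration; it is NOT Git's actual strbuf API, only a minimal stand-in that assumes aborting on allocation failure is acceptable.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical growable string buffer (illustration only; this is
 * not Git's strbuf). The sbuf owns `buf`; callers must not free()
 * it themselves and should call sbuf_release() when done.
 */
struct sbuf {
	char *buf;    /* NUL-terminated once anything has been added */
	size_t len;   /* bytes in use, excluding the trailing NUL */
	size_t alloc; /* bytes allocated */
};

#define SBUF_INIT { NULL, 0, 0 }

/* Ensure room for `extra` more bytes plus the trailing NUL. */
static void sbuf_grow(struct sbuf *sb, size_t extra)
{
	size_t need = sb->len + extra + 1;
	size_t new_alloc;
	char *p;

	if (need <= sb->alloc)
		return;
	new_alloc = sb->alloc ? sb->alloc * 2 : 16;
	if (new_alloc < need)
		new_alloc = need;
	p = realloc(sb->buf, new_alloc);
	if (!p) {
		fputs("out of memory\n", stderr);
		exit(1);
	}
	sb->buf = p;
	sb->alloc = new_alloc;
}

/* Append the NUL-terminated string `s`; `s` stays owned by the caller. */
static void sbuf_addstr(struct sbuf *sb, const char *s)
{
	size_t n = strlen(s);

	sbuf_grow(sb, n);
	assert(sb->len + n < sb->alloc);
	memcpy(sb->buf + sb->len, s, n + 1); /* copies the NUL as well */
	sb->len += n;
}

/* Free the memory and return the buffer to its initial state. */
static void sbuf_release(struct sbuf *sb)
{
	free(sb->buf);
	sb->buf = NULL;
	sb->len = sb->alloc = 0;
}

/* Example caller: join `n` path components with '/' into `out`. */
static void join_path(struct sbuf *out, const char **parts, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (i)
			sbuf_addstr(out, "/");
		sbuf_addstr(out, parts[i]);
	}
}
```

Note that join_path() contains no size arithmetic and no allocation calls at all, and the comments say who owns what. The price is a function call and a bounds check per append, which is the sort of "few extra CPU microseconds" trade-off mentioned above.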