Hello everyone.

I sent out an email here last week asking for a list of resources so I
could better understand the workings and design of Git. I really
appreciate everyone who shared links and advice. I have been reading
about Git for some time now and have looked at almost all of those
resources, plus some others. I think I can say I now have a decent
conceptual understanding of how Git works internally. (I also finally
understood the chapter on Git in the book I am reading, The Architecture
of Open Source Applications: Volume 2 - the chapter I could not make
sense of at all, and the reason I started this thread.) There are surely
still plenty of details and subtle things I do not understand yet. For
example, I just learned that branches are nothing but pointers to
commits - wow! (See the little experiment after my questions below.)

Now, continuing this discussion and moving to the implementation and
engineering side of things, I want to ask another question and get some
advice. Though I may understand Git's internal design and high-level
implementation, I really want to know how it is actually implemented and
how it was made, which means reading the source code.

1. I don't know how absurd a quest this is - please enlighten me.

2. How do I do it, and where do I start? It is such a BIG repository,
   and I am not expecting this to be easy.

3. Would you advise looking at an older version of the source code
   rather than the latest one, and if so, why? (See the sketch after
   this list for the kind of thing I mean.)
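For context, this is the little experiment that convinced me about
branches (a minimal sketch; I am assuming a fresh repository whose
default branch is named "master", and the hashes shown are made up):

    $ git init demo && cd demo
    $ git commit --allow-empty -m "root commit"
    $ cat .git/refs/heads/master      # a branch is a file holding one id
    2f9a0d6c...                       # just the commit's object id
    $ git branch topic                # "creating a branch"...
    $ cat .git/refs/heads/topic       # ...writes the same pointer
    2f9a0d6c...

So creating, moving, or deleting a branch never copies any content; it
only rewrites that one little pointer.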
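And to make question 3 concrete, this is roughly what I had in mind (a
sketch; I am assuming the mirror at
https://git.kernel.org/pub/scm/git/git.git, and rev-list with
--max-parents=0 should print the initial commit):

    $ git clone https://git.kernel.org/pub/scm/git/git.git
    $ cd git
    $ git rev-list --max-parents=0 HEAD   # find the root commit
    e83c5163316f89bfbde7d9ab23ca2e25604af290
    $ git checkout e83c5163               # detached HEAD at version one
    $ ls                                  # only a handful of C files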
Again, I would really appreciate it if someone could give their thoughts
on this.

Thank you.

Regards,
Aman

On Mon, May 30, 2022 at 7:40 PM Ævar Arnfjörð Bjarmason
<avarab@xxxxxxxxx> wrote:
>
> On Mon, May 30 2022, Konstantin Khomoutov wrote:
>
> > On Mon, May 30, 2022 at 09:49:57AM +0000, Kerry, Richard wrote:
> >
> > [...]
> >> > > 1. I haven't had the experience of working with other (perhaps
> >> > > even older) version control systems, like Subversion. So when
> >> > > referring to the "control" aspect,
> >> >
> >> > The "control" aspect was from whoever was the 'manager' that
> >> > limited access to the version system (i.e. acting like a museum
> >> > curator) and decided whether your masterpiece was worthy of
> >> > inclusion as a significant example of your craft, whether that was
> >> > an engineering drawing or some software code.
> >>
> >> I'm not sure I get that idea. I worked with server-based version
> >> control systems from the mid 80s until about 5 years ago, when the
> >> team moved from Subversion to Git. There was never a "curator" who
> >> controlled what went into VC. You did your work, developed files,
> >> and committed when you thought it necessary. When a build was to be
> >> done, there would then be some consideration of what from VC would
> >> go into the build. That is all still there nowadays using a
> >> distributed system (i.e. Git). Those doing open source work might
> >> operate a bit differently, as there is of necessity distribution of
> >> control over what gets into a release. But those of us developing
> >> proprietary software are still going through the same sort of
> >> release process, even if there isn't actually a separate person
> >> actively manipulating the contents of a release and it is just up to
> >> you to do what's necessary (actually, there are others involved in
> >> deciding what will be in, but in our case they don't actively
> >> manipulate a repository).
> >
> > I think the "inversion of control" brought in by DVCSes is about a
> > somewhat different set of things.
>
> Re the "I'm not sure I get that idea" from Richard: I think his point
> stands that some of the stories we carry around about VCS vs. DVCS in
> free/open source software were more particular to how things were done
> in those online communities, and not really about the implicit
> constraints of centralized VCS per se.
>
> Partly those two mix: it was quite common for free software projects
> not to have any public VCS (usually CVS) access at all. Some did, but
> it was enough of a hassle to set up, and so far outside your "normal"
> workflow (as opposed to setting up a hosted git repository, which
> everyone uses), that many just didn't do it.
>
> > I would say it is connected to F/OSS and the way most projects were
> > hosted before the DVCSes took over: usually each project had a single
> > repository (say, on SourceForge or elsewhere), and it was "truly
> > central" in the sense that anyone who decided to work on that project
> > would need to contact whoever was in charge and ask them to set up
> > permissions allowing commits - maybe not to "the trunk", but commit
> > access was required either way, because in a centralized VCS commits
> > are made on the server side.
>
> We may have tried this in different eras, but from what I recall it
> was a crapshoot whether there was any public VCS access at all. Some
> projects were quite good about it, and SourceForge managed to push
> that to more of them early on by making anonymous CVS access something
> you could get by default.
>
> But a lot of projects simply didn't have it at all. You'll still find
> some of them today, i.e. various bits of "infrastructure" code whose
> maintainers are (presumably) still manually managing releases with zip
> snapshots and manually applied patches.
>
> > (Of course, there were projects where you could mail your patchset
> > to a maintainer, but maintaining such a patchset was not convenient:
> > you would either need to host your own fully private VCS or use a
> > tool like Quilt [1]. Also note that certain high-profile projects
> > such as Linux and Git use mailing lists for submission and review of
> > patch series; this workflow coexists with the concept of a DVCS just
> > fine.)
>
> I'd add, though, that this isn't really "co-existing" with DVCS so
> much as using patches on a mailing list as an indirect transport
> protocol for "git push".
>
> I.e. if you contributed to some similar projects "back in the day",
> you could expect to effectively send your patches into a black hole
> until the next release: the maintainer would apply them locally, and
> you wouldn't be able to pull them back down via the DVCS.
>
> Perhaps there would be development releases, but those could be weeks
> or even months apart, and a "real" release might come once every 1-2
> years.
>
> Whereas both Junio and Linus (and other Linux maintainers) publish
> their versions of the patches they do integrate fairly quickly.
>
> > [...] it also has possible downsides; one of the more visible ones
> > is that when the original project goes dormant for some reason, its
> > users might have a hard time deciding which of the competing forks to
> > switch to, and there are cases where multiple competing forks
> > implement different features and bugfixes in parallel. One of the
> > guys behind Subversion expressed his concerns about this back when
> > Git was in its relative infancy [2].
> >
> > 1. https://en.wikipedia.org/wiki/Quilt_(software)
> > 2. http://blog.red-bean.com/sussman/?p=20
>
> It's interesting that this particular fear that proponents of
> centralized VCS had about DVCS turned out to be realized in exactly
> the opposite way:
>
>     Notice what this user is now able to do: he wants to crawl off
>     into a cave, work for weeks on a complex feature by himself, then
>     present it as a polished result to the main codebase. And this is
>     exactly the sort of behavior that I think is bad for open source
>     communities.
>
> I.e. lowering the cost of publishing early and often has had the
> effect that people are less likely to "crawl off into a cave" and work
> on something for a long time without syncing up with other parallel
> development.
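(As an aside, for my own understanding: if I read the "indirect
transport protocol" remark above correctly, the round trip is roughly
the sketch below. The list address and the "-2" are made up for
illustration.)

    # contributor: turn the last two commits into emails and send them
    $ git format-patch -2 --cover-letter -o outgoing/
    $ git send-email --to=project-list@xxxxxxxxx outgoing/*.patch

    # maintainer: apply the mailed series as real commits
    $ git am path/to/mbox-with-the-series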