Hi, On Sat, Apr 3, 2021 at 4:08 PM Atharva Raykar <raykar.ath@xxxxxxxxx> wrote: > > Hi all, > > Below is my draft of my GSoC proposal. I have noticed that Chinmoy has already > submitted a proposal for the same idea before me, so would that be considered > "taken"? (I don't think I can submit another proposal for the other idea either, > because someone has already sent one for that as well) Unfortunately, it looks like we will mentor only 2 students on the 2 projects listed on https://git.github.io/SoC-2021-Ideas/, so we might have to make tough choices. > Since I have already put my effort into this for a while, I thought I might as > well send it, but I'll accept whatever the mentors say about the eligibility of > this proposal. Thanks for sending it anyway! > Here is a prettier markdown version: > https://gist.github.com/tfidfwastaken/0c6ca9ef2a452f110a416351541e0f19 > > > --8<-----8<-----8<-----8<-----8<-----8<-----8<-----8<-----8<-----8<-----8<-- > > ___________________ > > GSOC GIT PROPOSAL > > Atharva Raykar > ___________________ > > > Table of Contents > _________________ > > 1. Personal Details > 2. Background > 3. Me and Git > .. 1. Current knowledge of git > 4. The Project: Finish converting `git submodule' to builtin > 5. Prior work > 6. General implementation strategy > 7. Timeline (using the format dd/mm) > 8. Beyond GSoC > 9. Blogging > 10. Final Remarks: A little more about me > > > 1 Personal Details > ================== > > Name : Atharva Raykar > Major : Computer Science and Engineering > Email : raykar.ath@xxxxxxxxx > IRC nick : atharvaraykar on #git and #git-devel > Address : RB 103, Purva Riviera, Marathahalli, Bangalore > Postal Code : 560037 > Time Zone : IST (UTC+5:30) > GitHub : http://github.com/tfidfwastaken > > > 2 Background > ============ > > I am Atharva Raykar, currently in my third year of studying Computer > Science and Engineering at PES University, Bangalore. I have always > enjoyed programming since a young age, but my deep appreciation for > good program design and creating the right abstractions came during my > exploration of the various rabbitholes of knowledge originating from > communities around the internet. I have personally enjoyed learning > about Functional Programming, Database Architecture and Operating > Systems, and my interests keep expanding as I explore more in this > field. > > I owe my appreciation of this rich field to these communities, and I > always wanted to give back. With that goal, I restarted the [PES Open > Source] community in our campus, with the goal of creating spaces > where members could share knowledge, much in the same spirit as the > communities that kickstarted my journey in Computer Science. I learnt > a lot about collaborating in the open, maintainership, and reviewing > code. While I have made many small contributions to projects in the > past, I am hoping GSoC will help me make the leap to a larger and more > substantial contribution to one of my favourite projects that made it > all possible in my journey with Open Source. > > > [PES Open Source] <https://pesos.github.io> > > > 3 Me and Git > ============ > > Here are the various forms of contributions that I have made to Git: > > - [Microproject] userdiff: userdiff: add support for Scheme Status: In > progress, patch v2 pending List: > <https://public-inbox.org/git/20210327173938.59391-1-raykar.ath@xxxxxxxxx/> > > - [Git Education] Conducted a workshop with attendance of hundreds of > students new to git, and increased the prevalence of of git's usage > in my campus. > Photos: <https://photos.app.goo.gl/T7CPk1zkHdK7mx6v7> and > <https://photos.app.goo.gl/bzTgdHMttxDen6z9A> > > I intend to continue helping people out on the mailing list and IRC > and tending to patches wherever possible in the meantime. Nice! > 3.1 Current knowledge of git s/git/Git/ > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > I use git almost daily in some form, and I am fairly comfortable with > it. I have already read and understood the chapters from the Git > Book about submodules along with the one on objects, references, > packfiles and the refspec. > > > 4 The Project: Finish converting `git submodule' to builtin > =========================================================== > > Git has historically had many components implemented in the form of > shell scripts. This was less than ideal for several reasons: > - Portability: Non-POSIX systems like Windows don't play nice with > shell script commands like grep, cd and printf, to name a few, and > these commands have to be reimplemented for the system. There are > also POSIX to Windows path conversion issues. > - No direct access to plumbing: Shell commands do not have direct > access to the low level git API, and a separate shell is spawned to > just to carry out their operations. > - Performance: Shell scripts tend to create a lot of child processes > which slows down the functioning of these commands, especially with > large repositories. > Over the years, many GSoC students have converted the shell versions > of these commands to C. Git `submodule' is the last of these to be > converted. > > > 5 Prior work > ============ > > I will be taking advantage of the knowledge that was gained in the > process of the converting the previous scripts and avoiding all the > gotchas that may be present in the process. There may be a bunch of > useful helper functions in the previous patches that can be reused as > well (more investigation needed to determine what exactly is > reusable). > > Currently the only other commands left to be completed for `submodule' > are `add' and `update'. Work for `add' has already been started by a > previous GSoCer, Shourya Shukla, and needs to picked up from there. Yeah, 'update' uses ̀git submodule--helper update-clone`, `git submodule--helper update-module-mode` and other `git submodule--helper` sub-commands, but is not fully ported. > Reference: > <https://github.com/gitgitgadget/git/issues/541#issuecomment-769245064> > > I'll have these as my references when I am working on the project: > His blog about his progress: > <https://shouryashukla.blogspot.com/2020/08/the-final-report.html> > (more has been implemented since) > Shourya's latest patch for `submodule add': > <https://lore.kernel.org/git/20201007074538.25891-1-shouryashukla.oo@xxxxxxxxx/> > > For the most part, the implementation looks fairly complete, but there > seems to be a segfault occurring, along with a few changes suggested > by the reviewers. It will be helpful to contact Shourya to fully > understand what needs to be done. > > Prathamesh's previous conversion work: > <https://lore.kernel.org/git/20170724203454.13947-1-pc44800@xxxxxxxxx/#t> It would be nice if, after finishing 'add' and 'update', you could also completely get rid of git-submodule.sh and instead use `git submodule-helper` as `git submodule`. > 6 General implementation strategy > ================================= > > The way to port the shell to C code for `submodule' will largely > remain the same. There already exists the builtin > `submodule--helper.c' which contains most of the previous commands' > ports. All that the shell script for `git-submodule.sh' is doing for > the previously completed ports is parsing the flags and then calling > the helper, which does all the business logic. > > So I will be moving out all the business logic that the shell script > is performing to `submodule--helper.c'. Any reusable functionality > that is introduced during the port will be added to `submodule.c' in > the top level. Ok. > For example: The general strategy for converting `cmd_update()' would > be to have a call to `submodule--helper' in the shell script to a > function which would resemble something like `module_update()' which > would perform the work being done by the shell script past the flags > being parsed and make the necessary calls to `update_clone()', and the > git interface in C for performing the merging, checkout and rebase > where necessary. It would be nice if you could go into more details about what `module_update()' would look like. Do you see steps that you could take to not have to do everything related to `module_update()' in only one patch? > After this process, the builtin is added to the commands array in > `submodule--helper.c'. And since these two functions are the last bit It's not very clear here that by "these two functions" you reference the 'add' and 'update' sub-commands. > of functionality left to convert in submodules, an extended goal can > be to get rid of the shell script altogether, and make the helper into > the actual builtin [1]. Nice that you are talking about this! > [1] > <https://lore.kernel.org/git/nycvar.QRO.7.76.6.2011191327320.56@xxxxxxxxxxxxxxxxx/> > > > 7 Timeline (using the format dd/mm) > =================================== > > Periods of limited availability (read: hectic chaos): > - From 13/04 to 20/04 I will be having project evaluations and lab > assessments for five of my courses. > - From 20/04 to 01/05 I have my in-semester exams. > - For a period of two weeks in the range of 08/05 to 29/05 I will be > having my end-semester exams. > My commitment: I will still have time during my finals to help people > out on the mailing list, get acquainted with the community and its > processes, and even review patches if I can. This is because we get > holidays between each exam, and my grades are good enough to that I > can prioritise git over my studies ;-) s/git/Git/ > And on the safe side, I will still engage with the community from now > till 07/06 so that the community bonding period is not compromised in > any way. > > Periods of abundant availability: After 29/05 all the way to the first > week of August, I will be having my summer break, so I can dedicate > myself to git full-time :-) > > I would have also finished all my core courses, so even after that, I > will have enough of time to give back to git past my GSoC period. Ok. Also: s/git/Git/ > Phase 1: 07/06 to 14/06 -- Investigate and devise a strategy to port > the submodule functions > - This phase will be more diagrams in my notebook than code in my > editor -- I will go through all the methods used to port the other > submodule functions and see how to do the same for what is left. > - I will find the C equivalents of all the shell invocations in > `git-submodule.sh', and see what invocations have /no/ equivalent > and need to be created as helpers in C (Eg: What is the equivalent > to the `ensure-core-worktree' invocation in C?). For all the helpers > and new functionality that I do introduce, I will need to create the > testing strategy for the same. > - I will go through all the work done by Shourya in his patch, and try > to understand it properly. I will also see the mistakes that were > caught in all the reviews for previous submodule conversion patches > and try to learn from them before I jump to the code. > - Deliverable: I will create a checklist for all the work that needs > to be done with as much detail as I can with the help of inputs from > my mentor and all the knowledge I have gained in the process. > > Phase 2: 14/06 to 28/06 -- Convert `add' to builtin in C > - I will work on completing `git submodule add'. One strategy would be > to either reimplement the whole thing using what was learnt in > Shourya's attempt, but it is probably wiser to just take his patch > and modify it. I would know what to do by the time I reach this > phase. > - I will also add tests for this functionality. I will also document > my changes when required. These would be unit tests for the helpers > introduced, and integration of `add' with the other commands. > - Deliverable: Completely port `add' to C! > > Bonus Phase: If I am ahead of time -- Remove the need for a > `submodule--helper', and make it a proper C builtin. > - Once all the submodule functionality is ported, the shell script is > not really doing much more than parsing the arguments and passing it > to the helper. We won't need this anymore if it is implemented. Ok, great!