On Wed, Apr 20, 2022 at 10:55 PM Jack McGuinness <jmcguinness2@xxxxxxxxxxxx> wrote:
>
> Unify ref-filter formats with other pretty formats, for the Fourth Time

Thanks for your proposal!

> For my microproject, I chose to Modernize a test script, specifically t4202-log.sh
>
> I will admit, I had a bumpy run completing my Microproject. Going into this, I had no experience with open-source, and due to this, my first attempt was done incorrectly. In response to this, I went back and redid it, however as of right now it has not been reviewed by anyone. I believe that I corrected all of the improper syntaxes, but it’s entirely possible I missed something.
>
> Note: I used gitgitgadget to submit my patches.

I just took a look at version 2 and sent a reply with a few comments. Note that you don't need for all the patches to be fully reviewed before sending another version.

> History of Problem:
>
> As most open source projects get developed, git has been built up over time by many different people,

s/git/Git/

> working towards a common goal of improvement, but all with different ideas of how to get there. While this is a great thing that is the reason open source software is such a brilliant idea, it can cause confusion within the project.
> A prime example of this is formatting command output, where different commands that overlap in what data they would output have different logic for getting said output, which causes people to need to know separate systems for each command.
> Git has had a history of having mentored contributors work to amend this, starting in Outreach Round 15,

s/Outreach/Outreachy/

Also it might be nice to say when was that Outreachy round.

> where Olga Telezhania mainly worked to migrate the logic in cat-file.c to the logic in ref-filter, but unfortunately near the end, she was forced to scrap her solution and restart, and her main work was never merged to master.

You might want to explain a bit more the reason for that.

> From what I could find, the next person to work on this was Hariom Verma during GSoC 2020. He started his project by looking over the work of his predecessor Olga, and deciding to take a different approach to the problem. Over Hariom’s summer, he implemented a plethora of formatting options in pretty formats, implementing all formatting options in pretty-lib.

Hariom's project was not quite the same as Olga's. Olga's was focused on `git cat-file` while Hariom's wasn't.

> After Hariom, ZheNing Hu carried the torch and worked on the project for the 2021 GSoC. His main contributions revolved around refactoring cat-file, similar to Olga.

Yeah, ZheNing's project was focused on `git cat-file` like Olga's.

> However, he also made the notable decision to spend time optimizing the performance of ref-filter.

It would be nice if you could say why this decision was made.

> Proposal: Unify ref-filter formats with other pretty formats
>
> My proposal is one of the ideas provided by Git-SCM, unifying the logic of ref-filter with pretty formats. What this means, is that I would be rewriting the formatting logic currently used in ref-filter, to be used in pretty. However, alongside doing this, I also have the goal of adding some small new functionality to the formatting, and possibly optimizing the logic as ZheNing did.

Your project is more like Hariom's than Olga's and ZheNing's as it's not focused on `git cat-file`. If you can also finish what Olga and ZheNing started, that would be a really nice bonus outcome though.

> Benefits of Proposal:
>
> Completing this would improve the quality of life of people contributing to the formatting. The erasure of duplicate logic would make it simpler to understand the logic being used to format,

It's likely that the old formatting logic will have to be kept for a long time for backward compatibility and not breaking existing users though.

> and therefore simpler for a prospective contributor to implement a new feature, or alter a current one.
>
> My Plan:
>
> I looked over the proposals and blogs of the previous undertakers of this proposal, and with their work and struggles in mind, I have compiled the following plan.
> Before I start working on the formatting logic, I want to learn to a usable degree how to use the following tools:
> Valgrind
> GDB
> Tmux
> Possibly Gprof and perf
> My reasoning for this is that reading Olga's blog, she commented how if she had started using the debugging assets earlier, she would have been far more on track. I want to go into this already knowing them so that I can apply them when needed, and not needlessly waste time.
> After this, I would focus on understanding the logic behind the formatting, by studying the relevant files and working on small contributions and patches, to better understand the system in place. From what I can see, most previous contributors took around a month before they started coding their main project. If possible, I would like to figure this out in under a month, but I know that that’s easier said than done.
> After I understand, to a well enough degree, the formatting logic, I want to start implementing the formatting options from other files.

I am not sure what "the formatting options from other files" refers to.

> As I understand it, major progress has been made in unifying the formatting logic, however, there are still implementations that work separately from what we want the standard to be. Ideally, I would like to spend the majority of my time doing this, along with the debugging that goes along with it.
> If I have misjudged the amount of logic left to be rectified, then my plan for my time would be to work to erase the current problems with the formatting logic.
> Hariom mentioned that the following problems persisted after he finished GSoC:
> 30% of log tests are failing
> Pretty-lin.{c,b} does not have apt handling for incorrect formatting
> Olgas work needs attention
> A Lot of what ZheNing worked on covers the third problem, so I would like to tackle the first two.

Yes, please.

> The reason for the first problem was due to the branch it was tested on not having mailmap logic, and also the second problem influenced it. Because of this, I think If I go this route, my first step would be to implement incorrect formatting handling. A simple form of this is already implemented, however it currently causes a segmentation fault, which would need to be debugged.

Ok.

> Prior Commitments:
> To be completely transparent, over the summer I already have a job, however, it is a part time job at my local fair grounds where I help out in the mornings. It doesn’t take too much time out of my day, I just want to be transparent about it.

Thanks for being transparent about it...

> I also will be spending a few days attending my cousin’s wedding in June, but I will be able to work on the project during this time, except for the day of the actual wedding.

...and about this.

> Projected Timeline:
>
> Week
> Goal
> Prior to work start
> In the time I have before the official start of work, I want to get to know the community better, and gain a good understanding of the workflow. Alongside this, I want to look into the aforementioned software tools, and also read Pro Git Book, as Olga said the later chapters were invaluable to her understanding.
> 1 - 3
> During the first three weeks, I plan to spend a majority of my team looking into the formatting logic. It is an important step, and If I start working without knowing exactly what I’m doing then I could make a mistake and end up costing more time then I intend to.
> During this time I want to make small patches and contributions, to keep me in practice and help me develop my relationship with the community.
> 4-9
> At this point, I want to start my actual work of unifying logic. At this point, I’m unsure of what file I would start with, but I believe that during the time allotted previously, I would be able to figure this out.
> 10-12
> I am leaving this time period for debugging and optimization. It is inevitable that what I make won’t work in some unexpected way. In order to best improve my chances of having a master ready project by the end, I want to make sure I have the time to take the work I've done over the summer and turn it into a polished final product.

The issue with that kind of timeline is that reviewers are not likely to accept your patches if they look buggy or not polished or optimized enough. So if you plan to only polish, fully debug and optimize towards the end of your GSoC time, it is likely that nothing will get merged before that time.

Then if you are late (for example because early steps took more time than planned) and decide for some reason (which might be a very valid one) to stop working on the project at the end of the GSoC time, it might mean that nothing will have been merged.

So I think it would have been better to split the time in a way similar to:

- weeks 1 - 4: improve incorrect formatting handling
- weeks 5 - 8: add mailmap support
- weeks 9 - 12: fix all remaining issues

where at the end of each of these steps hopefully something can be merged.

> After GSoC
> After GSoC ends I will go back to school, which will limit the amount of time I have available. However even so, I plan to stay connected to the git community in some way. At the very least, I plan to watch the mailing list, and provide commentary to other peoples patches, and at the most I would want to keep working on a major part of git, and finish what I started.

Nice!

> Blogging:
> I am currently a stranger to blogging, as I never thought of any reason somebody would want to read my thoughts. However, I do keep a private journal that I use to remember what I do each month, and plan out the next month, which I think can be translated to a blog.

We don't absolutely require blogging (especially public blogging), but we think it can help you both during and after your GSoC.

> If I am accepted, then I plan to have 14 total blog posts over the course of the project. 12 of them to summarize what I did each week, 1 before I do any work to provide a reference point for me at the end, and to help me collect what I am going in knowing, and 1 at the end, to be the summation of my experience, and describe my experience and work in full.

Nice!

> Motivation:
> My motivation for participating in GSoC:
> I have always wanted to participate in an open source project, but I never knew how to take the first step. At times I considered contributing to some projects, but I was worried that my commits would get ignored, I would do it wrong and waste the time of people. When I say Google had a program

s/say/saw/

> to connect people to mentors and help get new developers into open source, I thought “Wow, that sounds like exactly what I’m looking for.” I didn’t even find out about the stipend until later, which is an obvious plus.
>
> My “Why Git?”:
> When the list of organizations participating in GSoC 2022 came out, I went to the list, and compiled my own list of organizations I knew and would want to contribute to. I ended up creating a list of around 10 organizations, but when I looked at it, I just knew that Git stood out among them. It’s something that I use almost daily, and that I have always wanted to know more about the internals of. At that point I started looking into how to apply. To me Git is a backbone of all other programming projects, and exists as a testament to what open source can be.
>
> Closing remarks:
>
> I would be more than overjoyed if I can be accepted, but even aside from that, I think that I have already learned a lot from GSoC. The materials provided by Google and Git have given me a lot of advice and ideas for how I can personally contribute to something open source. In the case that I don’t get accepted, then I will still spend my summer contributing to open source. I may branch out a bit and focus on more then just git, bit

s/then just git, bit/than just Git, but/

> the idea of contributing to a public project just excites me, and I know I have to follow through with it.

Great, thanks!