Hi, On Tue, Mar 19, 2024 at 11:55 PM Ghanshyam Thakkar <shyamthakkar001@xxxxxxxxx> wrote: > > Hello everyone, > > This is my proposal for the project "Move existing tests to a unit testing > framework" in Google Summer of Code 2024. The web version can be viewed at: > https://docs.google.com/document/d/1wmosedy-UKd_mhAAjccv1qETO1q00s-npYj2m6g-Hd0/edit?usp=sharing > > I appreciate any feedback on this proposal. > > Thanks. > > -- >8 -- > Move existing tests to a unit testing framework > > Personal Information > -------------------- > > Name: Ghanshyam Thakkar > E-mail: shyamthakkar001@xxxxxxxxx > Ph. No.: +91 7990974044 > > Education: L.D. College of Engineering, Gujarat, India > Year: IV Is it year 4 out of 4 or 5 or more? Sorry, but it's interesting to know when you are going to graduate and if you might need to look for a full time job soon. > Degree: Bachelors in Electronics and > Communications Engineering > > GitHub: https://github.com/spectre10 > Time-zone: UTC +5:30 (IST) > > > Overview > -------- > > Initially Git used to only rely on shell scripts to test the functionality > usually deemed to be internal (e.g. libraries, functions etc.) This was achieved > through t/helper binaries which would take input via stdin and output through > stdout and the test scripts would consume those outputs. I think some test-tool helpers also take arguments on the command line. > This translation is > costly due to the following reasons: > > - Each test script creates a new git repository which is many times not needed. > See here[1] for example. > - These scripts spawn processes every time they need a certain output. And > spawning processes are expensive (especially on windows). > - Difficult setups: sometimes the functions or the libraries we are testing need > some special setup such as defining flags for custom behavior etc. which > cannot be done through the shell scripts, hence we might forego testing those > conditions. > > Moving to the unit testing framework would give us many benefits such as > increased runtime performance, and therefore lower test duration and solves all > of the above points. The choice of framework and the rationale behind it is > thoroughly described in Documentation/technical/unit-tests.txt. Therefore, the > goal of this project is to move existing unit tests from t/helper/* and its > corresponding scripts to the new unit testing framework. And there has been work > going on to do this by Outreachy intern Achu Luma. > > The expected project difficulty is medium and the expected project size would be > 350 hours. > > > Pre-GSoC > -------- > I initially got into Git’s codebase in December, 2023 and started my > contribution journey in January, 2024. I worked on mostly what I found > interesting and within my reach. Following is the list of contributions that I > have made, in chronological order: > > - [PATCH v6 0/2] t7501: add tests for --include, --only, --signoff > Status: merged into master > Merge Commit: 2be9ccf23e0592b83b78e719574e580e79db0375 > Description: > This patch series added missing tests for --include, --only and --signoff > options of git-commit. While working on this patch I accidentally reproduced > a bug where in an untracked path provided to the commit command with -i > would not error out. This was noticed by Junio. Therefore, this patch also > included a TODO comment describing this bug. > Mailing list thread: > https://lore.kernel.org/git/20240109060417.1144647-2-shyamthakkar001@xxxxxxxxx > > - [PATCH v3 1/2] t0024: avoid losing exit status to pipes > Status: merged into master > Merge Commit: 40225ba8b494080d7f90dd5e129daa7b11c613d1 > Description: > Although this was not my first contribution, it was done to fulfill the > microproject criteria from the list of micro-projects on Git Developer > Pages. The first patch replaces the pipe to the redirection operator to > store the output to a temporary file to avoid losing exit status of the git > commands. And the second patch of the series fixes the style violations to > adhere to the current standard described in CodingGuidelines. > Mailing list thread: > https://lore.kernel.org/git/20240118215407.8609-1-shyamthakkar001@xxxxxxxxx/ > > - [PATCH v6 0/2] add-patch: classify '@' as a synonym for 'HEAD' > Status: merged into master > Merge Commit: 65462776c2963b377968e580ace1590238bf79c1 > Description: > The first patch of this series removed the disparity of command line output > of using ‘HEAD’ vs ‘@’ in ‘git reset/restore/checkout -p’. And the second > patch removed the perl prerequisites in the test scripts which use the patch > mode functionality. > Mailing list thread: > https://lore.kernel.org/git/20240128181202.986753-2-shyamthakkar001@xxxxxxxxx/ > > - [PATCH] restore: allow --staged on unborn branch > Status: Discontinued > Description: > This patch added functionality to assume an empty tree when on an unborn > branch to be able to use ‘restore –staged’. However, this was discontinued > due to lack of community interest. In hindsight, I think that this would > have been an acceptable patch, had I described it well. > Mailing list thread: > https://lore.kernel.org/git/20240206230357.1097505-2-shyamthakkar001@xxxxxxxxx/ I think it's still Ok to send a V2 with improved explanations. Perhaps you can do it after improving this proposal though. > - [PATCH] unit-tests: convert t/helper/test-oid-array.c to unit-tests > Status: on hold until GSoC > Description: > This was used as a practice patch to the current project proposal. This will > be used as a reference in the proposal ahead. > Mailing list thread: > https://lore.kernel.org/git/20240223193257.9222-1-shyamthakkar001@xxxxxxxxx/ > > - [PATCH v2] setup: remove unnecessary variable > Status: Merged to master > Merge Commit: 272fd9125aae215ece1c465a4b7d336cf81c3e62 > Description: > I initially proposed the TODO comment removed through this patch as a > micro-project for my first patch. However, the TODO comment was a result of > misunderstanding, therefore this patch did some minor code simplifications > along with the removal of the TODO comment. > Mailing list thread: > https://lore.kernel.org/git/20240304151811.511780-1-shyamthakkar001@xxxxxxxxx/ > > - [PATCH 0/2] commit, add: error out when passing untracked path > Status: WIP > Description: > This patch series would fix the potential bug discovered in the first patch > that I sent (--include tests). > Mailing list thread: > https://lore.kernel.org/git/20240318155219.494206-2-shyamthakkar001@xxxxxxxxx/ > > - Helped a fellow contributor by reviewing their patches > Thread links: > - https://lore.kernel.org/git/CZG3HO25QLAG.3Q9V03SCO99HL@xxxxxxxxx/ > - https://lore.kernel.org/git/CZHJPF604DV9.X0A0VX1AB7P8@xxxxxxxxx/ > > Proposed Project > ---------------- > > ----What’s unit testing---- > > Unit testing is a method to test a specific unit of source code (i.e. a > function, a module etc.). This is very helpful in catching all of the corner > cases and possible scenarios. > > ----Steps to convert the existing unit tests to the new framework---- > > It can be thought of as a series of steps which are outlined below: > - Identify a suitable candidate from t/helper directory. > - Each t/helper unit test has a corresponding shell script which actually tests > the output from the binary. Each test-case in that script would ideally become > a separate function (and therefore a single test-case) in the migrated unit > test. > - Convert the test logic from the shell script to C. This would involve > understanding the underlying library/functionality which is being > tested. Subsequently, make functions which would be ideally be a single > testcase. > - These functions can be called via the TEST() macro in the cmd_main > function which is the entry point to the test. > - Each of these functions contains a set of {check(), check_int() etc.} macro > functions which check the condition provided to them and if it fails, prints > a message to the stdout. If even a single check fails the whole test-case is > considered a failure. > - Modify Makefile to add the reference of the new test to the UNIT_TEST_PROGRAMS > variable. > - The skeleton of the new unit test would look something like this: > > #include "test-lib.h" > > static void testcase_1() > { > ... > check(actual == expected); > ... > } > > int cmd_main(int argc, const char **argv) > { > TEST(testcase_1(), "test description"); > return test_done(); > } > > - After the migration remove the references of t/helper/{old-test-helper}.{c, o} > from the Makefile and from the t/helper/test-tool.{c, h}. And subsequently > remove the shell script which used to consume the old helper binary. > > ----Additional Notes along with a Reference Implementation---- > > These tests usually contain duplicate logic with variations in input and output. > In such cases, we can define macro functions to not duplicate such logic. I.e. > in t-ctype, we can see how a custom macro is defined to avoid duplication of > test logic, with variation only in the input being used. Also, some these s/some these/some of these/ > existing tests are not worth migrating. For example, test-tool binaries which > are also used outside of the corresponding shell script (i.e. used in other > tests for setup purposes) would be low on the priority list of possible > migrations. > > I have made a reference implementation of t-oid-array.c which migrates > t/helper/test-oid-array.c and t/0064-oid-array.sh to the new framework, which > can be viewed here[2]. Note that the failing tests are not related to my > implementation and rather a known issue[3]. It may or may not to be good enough to > be merged into master, however it sufficiently showcases my abilities to > understand framework, the use of different macros and the ability to understand > different internal libraries and data-structures such oidarray itself. > > This implementation was also reviewed by Christian Couder on the mailing list > and I have made some adjustments according to the feedback. > > This implementation uses a macro to test the ordered enumeration of oidarray > with different inputs. And also tests the lookup capabilities of oidarray with > different queries and inputs with the help of a macro. There are also various > helper functions defined to help with printing debug output, generating > hexadecimal representations, etc. > > ----Previous Work---- > > TAP implementation along with t-basic and t-strbuf: > https://github.com/git/git/commit/e137fe3b > > t-mempool: https://github.com/git/git/commit/6cbae640 > t-ctype: https://github.com/git/git/commit/e875d451 > t-prio-queue: https://github.com/git/git/commit/808b77e5 > > Achu Luma has been doing this migration as part of Outreachy internship. Their > branches can be seen here[4]. > > ----Goal---- > > The end goal of this project would be to migrate as many unit tests as possible > according to the steps outlined above. In doing so, also enhance some of the > tests as needed to be extensible such as the ability to test different hash > algorithms on a single oidarray. This enhancement is not yet implemented in the > reference implementation due to the series which enables the use of different > hash algorithms on oidarray not yet merged into master (in next though). > > Currently, there are 78 test-*.c files (not including test-tool.c) in t/helper/ > directory according to the master branch. Of course, migrating all of them would > be a difficult task in 350 hours, therefore I plan to migrate around 1 unit test > per week during the Community Bonding and Coding Phase. This may fluctuate > depending on the type of test. Some of those tests are already being worked on > by Achu Luma. I think it would be interesting to know which of these 78 test-*.c files you would like to migrate first, especially as some of them might not be worth migrating as you said earlier. > ----Timeline---- > > Pre-GSoC > (Until May 1) > - Continue to work on different things like solving the bug discovered in the > first patch series that I sent. Be engaged in the community. > > Community Bonding > (May 1 - May 26) > - Talk with mentors and identify the tests which are suitable for migration. > Learn the libraries which are used for those tests and read up on the API. > Look at Achu Luma’s branches to further familiarize myself with the migration > process. Start the migration early with the mentors’ permission. > > Phase I > (May 27 - July 11) > - Continue implementing the tests in the unit testing framework with the > guidance of mentors and send out patches to the mailing list for community > approval. Approximately 6-7 tests would be done/cooking for migration at > the end of this Phase. > > Phase II > (July 12 - Aug 18) > - Continue the migration and talk with the mentors about the pace, quality of > patches and seek feedback. And enjoy the process. Another 5-6 tests would be > done/cooking at the end of this Phase. > > Final Week > (Aug 19 - Aug 26) > - Finish up any remaining tests. > - Fix any bugs and some final touches. > - Make a final report about the work accomplished and outline future work. > > ----Blogging---- > > I believe that writing about one’s work helps them track their progress from a > bird’s eye view. And also helps them realize if they are making mistakes from a > long term perspective. I plan to blog about the progress bi-weekly. Although I > have not setup the blog yet, it will be at: > https://spectre10.github.io > > ----Availability---- > > My current semester ends on 30th April and my exam/viva is tentatively on 7th > May. After that I have freed my summer for GSoC, therefore I will be able to > give a minimum of 35-40 hours per week. > > Post-GSoC > --------- > > The Git Community has helped me very much and tolerated my beginner mistakes. > I have grown as a developer since I started contributing to Git. I learned the > importance of writing good commit messages, benefits of splitting up patches, > effective programming style and good communication etiquettes. Therefore, I plan > to be involved even after my GSoC journey ends. I am very much involved in the > Git ecosystem and also contribute to another project in the Git ecosystem called > gittuf[5]. Also post-gsoc, I am planning to help out in aggregating for Git Rev > News, which I personally enjoy reading. > > My Appreciation > --------------- > > I would like extend my thanks to all of the folks on the mailing list and > Discord who have helped me in contributing to the Git Project including Junio, > Christian Couder, Phillip Wood, Elijah Newren and many others. And I have also > taken some references of Achu Luma’s work for this proposal, so thanks to > her, too. > -- After GSoC -- > --------------------- > > After GSoC I intend to be a part of the community and keep > contributing to the git’s codebase. I eagerly want to learn a lot from > the git community and enhance my skills. I also see myself making > important contributions to git in the future. > > When I first joined the community 2 months ago, the ancient way of > collaborating through a mailing list by sending diff patches was > really puzzling (GitHub was the only means that I knew about for > open-source collaboration). After sending a patch I got a little > comfortable with it and started loving it. > > -- Closing remarks -- > ---------------------------- > > I am really enthusiastic to contribute to git. I really want to learn > a lot from my mentors and the whole git community. Everyone > contributes using git, why not contribute to git :). > > In the end I really want to thank the whole community, especially > Christian and Junio, for their valuable time in reviewing my patches > and guiding me with the problems I faced. Thanks for your proposal. Please make sure you submit it soon as a pdf file to the GSoC website. It can then be updated by uploading a new pdf until the April 2 1800 UTC deadline.