Hello Christian, Thank you for the review. I have made the changes as per your suggestions. For your convenience, I have also attached the diff of the changes below. The web version can be viewed at: https://docs.google.com/document/d/1wmosedy-UKd_mhAAjccv1qETO1q00s-npYj2m6g-Hd0/edit?usp=sharing I appreciate any feedback on this proposal. Thanks. --- diff --git a/proposal b/proposal index de751da..6ed42e0 100644 --- a/proposal +++ b/proposal @@ -21,7 +21,7 @@ E-mail: shyamthakkar001@xxxxxxxxx Ph. No.: +91 7990974044 Education: L.D. College of Engineering, Gujarat, India -Year: IV +Year: IV (Final Year) Degree: Bachelors in Electronics and Communications Engineering @@ -34,9 +34,10 @@ Overview Initially Git used to only rely on shell scripts to test the functionality usually deemed to be internal (e.g. libraries, functions etc.) This was achieved -through t/helper binaries which would take input via stdin and output through -stdout and the test scripts would consume those outputs. This translation is -costly due to the following reasons: +through t/helper binaries which would take input via stdin or we can also invoke +it from the command line via arguments and output through stdout and the test +scripts would consume those outputs. This translation is costly due to the +following reasons: - Each test script creates a new git repository which is many times not needed. See here[1] for example. @@ -108,7 +109,9 @@ have made, in chronological order: This patch added functionality to assume an empty tree when on an unborn branch to be able to use ‘restore –staged’. However, this was discontinued due to lack of community interest. In hindsight, I think that this would - have been an acceptable patch, had I described it well. + have been an acceptable patch, had I described it well. However, I might + pick this up again during the empty period of present till the start of + GSoC. Mailing list thread: https://lore.kernel.org/git/20240206230357.1097505-2-shyamthakkar001@xxxxxxxxx/ @@ -250,6 +253,34 @@ per week during the Community Bonding and Coding Phase. This may fluctuate depending on the type of test. Some of those tests are already being worked on by Achu Luma. +- Tests which might not be suitable for migration: + As already mentioned in the Additional Notes section, some tests are not worth + migrating. This is because some of these helper binaries provide general + purpose utilities which are used in not just unit tests, but used elsewhere + as well. An example of that would be test-ref-store.c. Just by grepping for + ‘test-tool ref-store’, we can see that it is used in many test scripts which + also include end-to-end tests. More examples of these would be test-genzeros + and test-genrandom. + +- Tests which might be suitable for migration: + On the other hand, a suitable candidate for the migration would generally just + provide a certain output and that output is compared against an expected + value. These would not include utility helpers as mentioned above. In these + cases, the test script is usually just a layer to compare the output with the + expected value, and has little test related logic in itself. + test-example-decorate.c seems like a good straight-forward choice as their + “BUG” calls can be replaced by test_msg calls in the new framework and their + corresponding if conditions can become checks in the new framework. Also, this + helper is not used anywhere besides a single call in t9004. Some more examples + would be test-revision-walking.c (t0062), and while + test-{oidmap, oidtree, hashmap}.c can be considered a bit large, those are + very similar in nature to what I have already implemented for test-oidarray.c. + And the majority of these outlined tests have been stable for a long time and + have not had any meaningful change. Therefore, I would like to start with + these tests with mentors’ consensus. Some other tests can also be identified + with the help of mentors and those can also be done instead of the ones I + outlined. + ----Timeline---- Pre-GSoC @@ -294,8 +325,28 @@ have not setup the blog yet, it will be at: ----Availability---- My current semester ends on 30th April and my exam/viva is tentatively on 7th -May. After that I have freed my summer for GSoC, therefore I will be able to -give a minimum of 35-40 hours per week. +Mayand I will graduate after the results are out in June. Therefore I will be +able to give a minimum of 35-40 hours per week after 7th May as I have not taken +up any responsibilities in the summer for the purpose of GSoC. + +----Stretch Goals---- + +These goals may not be achieved during the GSoC period, however, these may come +in handy if the migration is done ahead of the schedule or it is determined that +the number of tests that need to be migrated are not sufficient to cover the +allocated GSoC period further down the line. These can also be done after the +GSoC period. (Note that these are just personal observations/ideas and may not +be accepted by mentors, hence '_Stretch_ Goals'.) +- Enhancement of the unit testing library. An example of the need of doing this + would be t-ctype, which bypasses top-level macros such as TEST(), in favor of + custom logic with the use of internal helper functions. +- Add more helper functions in the unit testing library such as handling and use + of parameters defined by environment variables (which I personally faced in + implementing t-oidarray.) I am sure there are other such utilities which might + be worthwhile. +- Adding support for performance related testing through unit tests. +- Working in other areas of the codebase not related to unit testing or doing + other tasks as specified by mentors. Post-GSoC --------- Here is the full proposal: -- >8 -- Move existing tests to a unit testing framework Personal Information -------------------- Name: Ghanshyam Thakkar E-mail: shyamthakkar001@xxxxxxxxx Ph. No.: +91 7990974044 Education: L.D. College of Engineering, Gujarat, India Year: IV (Final Year) Degree: Bachelors in Electronics and Communications Engineering GitHub: https://github.com/spectre10 Time-zone: UTC +5:30 (IST) Overview -------- Initially Git used to only rely on shell scripts to test the functionality usually deemed to be internal (e.g. libraries, functions etc.) This was achieved through t/helper binaries which would take input via stdin or we can also invoke it from the command line via arguments and output through stdout and the test scripts would consume those outputs. This translation is costly due to the following reasons: - Each test script creates a new git repository which is many times not needed. See here[1] for example. - These scripts spawn processes every time they need a certain output. And spawning processes are expensive (especially on windows). - Difficult setups: sometimes the functions or the libraries we are testing need some special setup such as defining flags for custom behavior etc. which cannot be done through the shell scripts, hence we might forego testing those conditions. Moving to the unit testing framework would give us many benefits such as increased runtime performance, and therefore lower test duration and solves all of the above points. The choice of framework and the rationale behind it is thoroughly described in Documentation/technical/unit-tests.txt. Therefore, the goal of this project is to move existing unit tests from t/helper/* and its corresponding scripts to the new unit testing framework. And there has been work going on to do this by Outreachy intern Achu Luma. The expected project difficulty is medium and the expected project size would be 350 hours. Pre-GSoC -------- I initially got into Git’s codebase in December, 2023 and started my contribution journey in January, 2024. I worked on mostly what I found interesting and within my reach. Following is the list of contributions that I have made, in chronological order: - [PATCH v6 0/2] t7501: add tests for --include, --only, --signoff Status: merged into master Merge Commit: 2be9ccf23e0592b83b78e719574e580e79db0375 Description: This patch series added missing tests for --include, --only and --signoff options of git-commit. While working on this patch I accidentally reproduced a bug where in an untracked path provided to the commit command with -i would not error out. This was noticed by Junio. Therefore, this patch also included a TODO comment describing this bug. Mailing list thread: https://lore.kernel.org/git/20240109060417.1144647-2-shyamthakkar001@xxxxxxxxx - [PATCH v3 1/2] t0024: avoid losing exit status to pipes Status: merged into master Merge Commit: 40225ba8b494080d7f90dd5e129daa7b11c613d1 Description: Although this was not my first contribution, it was done to fulfill the microproject criteria from the list of micro-projects on Git Developer Pages. The first patch replaces the pipe to the redirection operator to store the output to a temporary file to avoid losing exit status of the git commands. And the second patch of the series fixes the style violations to adhere to the current standard described in CodingGuidelines. Mailing list thread: https://lore.kernel.org/git/20240118215407.8609-1-shyamthakkar001@xxxxxxxxx/ - [PATCH v6 0/2] add-patch: classify '@' as a synonym for 'HEAD' Status: merged into master Merge Commit: 65462776c2963b377968e580ace1590238bf79c1 Description: The first patch of this series removed the disparity of command line output of using ‘HEAD’ vs ‘@’ in ‘git reset/restore/checkout -p’. And the second patch removed the perl prerequisites in the test scripts which use the patch mode functionality. Mailing list thread: https://lore.kernel.org/git/20240128181202.986753-2-shyamthakkar001@xxxxxxxxx/ - [PATCH] restore: allow --staged on unborn branch Status: Discontinued Description: This patch added functionality to assume an empty tree when on an unborn branch to be able to use ‘restore –staged’. However, this was discontinued due to lack of community interest. In hindsight, I think that this would have been an acceptable patch, had I described it well. However, I might pick this up again during the empty period of present till the start of GSoC. Mailing list thread: https://lore.kernel.org/git/20240206230357.1097505-2-shyamthakkar001@xxxxxxxxx/ - [PATCH] unit-tests: convert t/helper/test-oid-array.c to unit-tests Status: on hold until GSoC Description: This was used as a practice patch to the current project proposal. This will be used as a reference in the proposal ahead. Mailing list thread: https://lore.kernel.org/git/20240223193257.9222-1-shyamthakkar001@xxxxxxxxx/ - [PATCH v2] setup: remove unnecessary variable Status: Merged to master Merge Commit: 272fd9125aae215ece1c465a4b7d336cf81c3e62 Description: I initially proposed the TODO comment removed through this patch as a micro-project for my first patch. However, the TODO comment was a result of misunderstanding, therefore this patch did some minor code simplifications along with the removal of the TODO comment. Mailing list thread: https://lore.kernel.org/git/20240304151811.511780-1-shyamthakkar001@xxxxxxxxx/ - [PATCH 0/2] commit, add: error out when passing untracked path Status: WIP Description: This patch series would fix the potential bug discovered in the first patch that I sent (--include tests). Mailing list thread: https://lore.kernel.org/git/20240318155219.494206-2-shyamthakkar001@xxxxxxxxx/ - Helped a fellow contributor by reviewing their patches Thread links: - https://lore.kernel.org/git/CZG3HO25QLAG.3Q9V03SCO99HL@xxxxxxxxx/ - https://lore.kernel.org/git/CZHJPF604DV9.X0A0VX1AB7P8@xxxxxxxxx/ Proposed Project ---------------- ----What’s unit testing---- Unit testing is a method to test a specific unit of source code (i.e. a function, a module etc.). This is very helpful in catching all of the corner cases and possible scenarios. ----Steps to convert the existing unit tests to the new framework---- It can be thought of as a series of steps which are outlined below: - Identify a suitable candidate from t/helper directory. - Each t/helper unit test has a corresponding shell script which actually tests the output from the binary. Each test-case in that script would ideally become a separate function (and therefore a single test-case) in the migrated unit test. - Convert the test logic from the shell script to C. This would involve understanding the underlying library/functionality which is being tested. Subsequently, make functions which would be ideally be a single testcase. - These functions can be called via the TEST() macro in the cmd_main function which is the entry point to the test. - Each of these functions contains a set of {check(), check_int() etc.} macro functions which check the condition provided to them and if it fails, prints a message to the stdout. If even a single check fails the whole test-case is considered a failure. - Modify Makefile to add the reference of the new test to the UNIT_TEST_PROGRAMS variable. - The skeleton of the new unit test would look something like this: #include "test-lib.h" static void testcase_1() { ... check(actual == expected); ... } int cmd_main(int argc, const char **argv) { TEST(testcase_1(), "test description"); return test_done(); } - After the migration remove the references of t/helper/{old-test-helper}.{c, o} from the Makefile and from the t/helper/test-tool.{c, h}. And subsequently remove the shell script which used to consume the old helper binary. ----Additional Notes along with a Reference Implementation---- These tests usually contain duplicate logic with variations in input and output. In such cases, we can define macro functions to not duplicate such logic. I.e. in t-ctype, we can see how a custom macro is defined to avoid duplication of test logic, with variation only in the input being used. Also, some these existing tests are not worth migrating. For example, test-tool binaries which are also used outside of the corresponding shell script (i.e. used in other tests for setup purposes) would be low on the priority list of possible migrations. I have made a reference implementation of t-oid-array.c which migrates t/helper/test-oid-array.c and t/0064-oid-array.sh to the new framework, which can be viewed here[2]. Note that the failing tests are not related to my implementation and rather a known issue[3]. It may or may not to be good enough to be merged into master, however it sufficiently showcases my abilities to understand framework, the use of different macros and the ability to understand different internal libraries and data-structures such oidarray itself. This implementation was also reviewed by Christian Couder on the mailing list and I have made some adjustments according to the feedback. This implementation uses a macro to test the ordered enumeration of oidarray with different inputs. And also tests the lookup capabilities of oidarray with different queries and inputs with the help of a macro. There are also various helper functions defined to help with printing debug output, generating hexadecimal representations, etc. ----Previous Work---- TAP implementation along with t-basic and t-strbuf: https://github.com/git/git/commit/e137fe3b t-mempool: https://github.com/git/git/commit/6cbae640 t-ctype: https://github.com/git/git/commit/e875d451 t-prio-queue: https://github.com/git/git/commit/808b77e5 Achu Luma has been doing this migration as part of Outreachy internship. Their branches can be seen here[4]. ----Goal---- The end goal of this project would be to migrate as many unit tests as possible according to the steps outlined above. In doing so, also enhance some of the tests as needed to be extensible such as the ability to test different hash algorithms on a single oidarray. This enhancement is not yet implemented in the reference implementation due to the series which enables the use of different hash algorithms on oidarray not yet merged into master (in next though). Currently, there are 78 test-*.c files (not including test-tool.c) in t/helper/ directory according to the master branch. Of course, migrating all of them would be a difficult task in 350 hours, therefore I plan to migrate around 1 unit test per week during the Community Bonding and Coding Phase. This may fluctuate depending on the type of test. Some of those tests are already being worked on by Achu Luma. - Tests which might not be suitable for migration: As already mentioned in the Additional Notes section, some tests are not worth migrating. This is because some of these helper binaries provide general purpose utilities which are used in not just unit tests, but used elsewhere as well. An example of that would be test-ref-store.c. Just by grepping for ‘test-tool ref-store’, we can see that it is used in many test scripts which also include end-to-end tests. More examples of these would be test-genzeros and test-genrandom. - Tests which might be suitable for migration: On the other hand, a suitable candidate for the migration would generally just provide a certain output and that output is compared against an expected value. These would not include utility helpers as mentioned above. In these cases, the test script is usually just a layer to compare the output with the expected value, and has little test related logic in itself. test-example-decorate.c seems like a good straight-forward choice as their “BUG” calls can be replaced by test_msg calls in the new framework and their corresponding if conditions can become checks in the new framework. Also, this helper is not used anywhere besides a single call in t9004. Some more examples would be test-revision-walking.c (t0062), and while test-{oidmap, oidtree, hashmap}.c can be considered a bit large, those are very similar in nature to what I have already implemented for test-oidarray.c. And the majority of these outlined tests have been stable for a long time and have not had any meaningful change. Therefore, I would like to start with these tests with mentors’ consensus. Some other tests can also be identified with the help of mentors and those can also be done instead of the ones I outlined. ----Timeline---- Pre-GSoC (Until May 1) - Continue to work on different things like solving the bug discovered in the first patch series that I sent. Be engaged in the community. Community Bonding (May 1 - May 26) - Talk with mentors and identify the tests which are suitable for migration. Learn the libraries which are used for those tests and read up on the API. Look at Achu Luma’s branches to further familiarize myself with the migration process. Start the migration early with the mentors’ permission. Phase I (May 27 - July 11) - Continue implementing the tests in the unit testing framework with the guidance of mentors and send out patches to the mailing list for community approval. Approximately 6-7 tests would be done/cooking for migration at the end of this Phase. Phase II (July 12 - Aug 18) - Continue the migration and talk with the mentors about the pace, quality of patches and seek feedback. And enjoy the process. Another 5-6 tests would be done/cooking at the end of this Phase. Final Week (Aug 19 - Aug 26) - Finish up any remaining tests. - Fix any bugs and some final touches. - Make a final report about the work accomplished and outline future work. ----Blogging---- I believe that writing about one’s work helps them track their progress from a bird’s eye view. And also helps them realize if they are making mistakes from a long term perspective. I plan to blog about the progress bi-weekly. Although I have not setup the blog yet, it will be at: https://spectre10.github.io ----Availability---- My current semester ends on 30th April and my exam/viva is tentatively on 7th Mayand I will graduate after the results are out in June. Therefore I will be able to give a minimum of 35-40 hours per week after 7th May as I have not taken up any responsibilities in the summer for the purpose of GSoC. ----Stretch Goals---- These goals may not be achieved during the GSoC period, however, these may come in handy if the migration is done ahead of the schedule or it is determined that the number of tests that need to be migrated are not sufficient to cover the allocated GSoC period further down the line. These can also be done after the GSoC period. (Note that these are just personal observations/ideas and may not be accepted by mentors, hence '_Stretch_ Goals'.) - Enhancement of the unit testing library. An example of the need of doing this would be t-ctype, which bypasses top-level macros such as TEST(), in favor of custom logic with the use of internal helper functions. - Add more helper functions in the unit testing library such as handling and use of parameters defined by environment variables (which I personally faced in implementing t-oidarray.) I am sure there are other such utilities which might be worthwhile. - Adding support for performance related testing through unit tests. - Working in other areas of the codebase not related to unit testing or doing other tasks as specified by mentors. Post-GSoC --------- The Git Community has helped me very much and tolerated my beginner mistakes. I have grown as a developer since I started contributing to Git. I learned the importance of writing good commit messages, benefits of splitting up patches, effective programming style and good communication etiquettes. Therefore, I plan to be involved even after my GSoC journey ends. I am very much involved in the Git ecosystem and also contribute to another project in the Git ecosystem called gittuf[5]. Also post-gsoc, I am planning to help out in aggregating for Git Rev News, which I personally enjoy reading. My Appreciation --------------- I would like extend my thanks to all of the folks on the mailing list and Discord who have helped me in contributing to the Git Project including Junio, Christian Couder, Phillip Wood, Elijah Newren and many others. And I have also taken some references of Achu Luma’s work for this proposal, so thanks to her, too. Thanks & Regards, Ghanshyam References: [1], [2]: https://github.com/spectre10/git/commit/9bfbb22ced3c4d15a215a84aa40043a30dc1bac0 [3]: https://lore.kernel.org/git/20240318090848.GC602575@xxxxxxxxxxxxxxxxxxxxxxx/T/#mc32405849ac4265ad36670480709d744f1d9e495 [4]: https://github.com/achluma/git/branches [5]: https://github.com/gittuf/gittuf