Re: [GSoC] Draft Proposal (Convert submodule to builtin)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sat, Apr 3, 2021 at 4:13 PM Chinmoy Chakraborty <chinmoy12c@xxxxxxxxx> wrote:
>
> Here is a textual format of the draft proposal.
>
>
> Convert submodule to builtin
> March 2021
>
> ==================================================
> ##Personal Information##
>
>
> Name - Chinmoy Chakraborty
> E-mail - chinmoy12c@xxxxxxxxx
> Github - https://github.com/chinmoy12c
> Linkedin - https://www.linkedin.com/in/chinmoy12c/
> Major - Information Technology
> Time Zone - IST (UTC+05:30)
> =================================================
>
> ##Work Environment##
>
> I am fluent in C, Java, Python, and Shell script. I use Git as my VCS,
> Visual Studio Code
> as my primary code editor, and Kali Linux as my primary OS.
> =================================================
>
> ##Git Contributions##
>
> [Microproject] Replace instances of `the_repository` with ‘r’. (Learning
> the ropes)
> Pull request: https://github.com/gitgitgadget/git/pull/915
> Mailing List:
> https://lore.kernel.org/git/pull.915.git.1616701733901.gitgitgadget@xxxxxxxxx/
>
>
> [column, range-diff] downcase option description
> Pull request: https://github.com/gitgitgadget/git/pull/920
> Mailing List:
> https://lore.kernel.org/git/pull.920.git.1616913298624.gitgitgadget@xxxxxxxxx/
>
>
> [Documentation] updated documentation for git commit --date
> Pull request: https://github.com/gitgitgadget/git/pull/918
> Mailing List:
> https://lore.kernel.org/git/pull.918.git.1616926790227.gitgitgadget@xxxxxxxxx/
> =================================================

Ok.

> ##Project Outline##
>
> A few components of git, like `git-submodule.sh`
> are in the form of shell scripts. This causes
> problems in production code in multiple platforms
> like windows. The goal of this project is to
> convert the shell script version of `git-submodule.sh`
> to portable c code. The end goal would be

s/c/C/

> to completely remove `git-submodule.sh` and rename
> `builtin/submodule--helper.c` to `builtin/submodule.c`.

You could add something like "so that the `git submodule` is fully
implemented using C".

> =================================================
>
> ##Why is the project required?##
>
> "Issues with the portability of code"
>
> The submodule shell script uses shell commands like
> `echo`, `grep`, `test`, `printf` etc. When switching
> to non-POSIX compliant systems, one will have
> to re-implement these commands specifically for the
> system. There are also POSIX-to-Windows path conversion
> issues. To fix these issues, it was decided to convert
> these scripts into portable C code.
>
> "Large overhead in calling the command"
>
> The commands implemented in shell scripts are not builtins, so
> they call `fork()` and `exec()` multiple times, hence creating
> additional shells. This adds to the overhead in using the
> commands in terms of time and memory.
>
> "No access to low-level API"
>
> The shell commands don’t have access to low level commands
> like `git hash-object`, `git cat-file` etc. As these commands
> are internally required for submodule commands to work, the shell
> script needs to spawn a separate shell to execute these commands.
> =================================================

Ok.

> ##How have I prepared?##
>
> I have gone through all the previous works and read through their
> code to make myself accustomed to the intricacies of the code.
> I have also structured my workflow based on the observation of
> the previous discussions on those patches, and taken into
> consideration the issues faced previously.
>
> =================================================
>
> ##Previous Work##
>
> A large part of the `git submodule--helper.c` has already been
> converted by Stefan Beller, Prathamesh Chavan in his GSoC project
> in 2017, and Shourya Shukla in his GSoC project in 2020. This is
> the list of already ported commands.
>
> set-branch
> set-url
> summary
> status
> init
> deinit
> update
> foreach
> sync
> absorbgitdirs
> =================================================
>
> ##Work to be done##
>
> The only command that is left to be ported is `git submodule add`.
> The previous work on this by Shourya Shukla in GSoC 2020, did
> not reach a successful merge due to some issues in design and
> was kicked out because it has been stale for so long.

Maybe you could explain a bit what "kicked out" means, or replace it
with something more explicit, for people who don't know well how Git
development works.

> The first
> and foremost aim of the project will be to finish porting this
> command.

You mean the "add" sub-command, or the full "submodule" command?

> Thereafter, the end goal would be to completely replace
> the shell script (git-submodule.sh) with an efficient c code.

s/c/C/

> Before porting the `git submodule add` command the initial work
> would be dedicated to the implementation of small helper functions
> in the form of small patches, which would be directly used by the
> `add` command. This workflow is based on the suggestion by
> Junio C Hamano on the thread:
> https://lore.kernel.org/git/xmqqd01sugrg.fsf@xxxxxxxxxxxxxxxxxxxxxx/.
>
> This workflow would help in the following ways:
>
> - It would help in sending patches in a small digestible format.
> - It would help the reviewers easily review those small units
>    of patches in a single sitting.
> - It would help keep small logical units of code in different clean commits.

Yeah, nice!

How does this compare with Shourya's work? Will this avoid the design
issues in Shourya's work?

> An additional test tweak would also be required in
> `t7400-submodule-basic.sh`,
> to prepend the keyword ‘fatal’ since the command dies out in case
> of absence of commits.
>
>
> The following helper functions would be required to be implemented -
>
> - A function to guess the directory name from the repository string.
> - A function for normalizing path, that is, removing multiple
>    //; leading ./; /./; /../; trailing / .
>
> - A function to check for tracked directories properly as pointed
>    out by Kaartic Sivaraam on the thread:
> https://lore.kernel.org/git/ce151a1408291bb0991ce89459e36ee13ccdfa52.camel@xxxxxxxxx/.
>
> - A function to check if the path exists and is already a git
>    repo, else clone it.
>
> - A function to set the submodule config properly.

Nice! Maybe you could give an example and tell:
- how you would name one of the above function,
- what would be its arguments,
- perhaps how you would test it
...

> - After implementation of all these helper methods, the main
>    `module_add()` function would be implemented using the helper
>    functions listed above as well as those helper functions which
>    are predefined.
> =================================================

Ok, I will review the other parts later.

Thanks!




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux