Hi Shourya,

just adding a little to what Abhishek said (which was pretty sound advice!)
below.

On Sun, 9 Feb 2020, Shourya Shukla wrote:

> I am facing some problems and would love some insight on them:
>
> 1. What exactly are we aiming in [3]? To replace the function completely
>    or to just add some 'repo_submodule_init' functionality?

If you follow the "Git blame" link in the breadcrumb menu, you will get to
the commit that added the TODO:
https://github.com/periperidip/git/commit/18cfc0886617e28fb6d29d579bec0ffcdb439196

Unfortunately, it does not necessarily help me understand what that TODO
is about.

So let's analyze the code:

	int add_submodule_odb(const char *path)
	{
		struct strbuf objects_directory = STRBUF_INIT;
		int ret = 0;

		ret = strbuf_git_path_submodule(&objects_directory, path, "objects/");
		if (ret)
			goto done;
		if (!is_directory(objects_directory.buf)) {
			ret = -1;
			goto done;
		}
		add_to_alternates_memory(objects_directory.buf);
	done:
		strbuf_release(&objects_directory);
		return ret;
	}

Okay, so this just adds the object database of the submodule (if it
exists, if it does not exist, the submodule is probably _already_ using
the superproject's database). To understand what I am talking about, have
a look at this document:
https://git-scm.com/docs/gitrepository-layout#Documentation/gitrepository-layout.txt-objects

So what does the function do that was suggested as a better alternative?

	int repo_submodule_init(struct repository *subrepo,
				struct repository *superproject,
				const struct submodule *sub)
	{
		struct strbuf gitdir = STRBUF_INIT;
		struct strbuf worktree = STRBUF_INIT;
		int ret = 0;

		if (!sub) {
			ret = -1;
			goto out;
		}

		strbuf_repo_worktree_path(&gitdir, superproject, "%s/.git", sub->path);
		strbuf_repo_worktree_path(&worktree, superproject, "%s", sub->path);

		if (repo_init(subrepo, gitdir.buf, worktree.buf)) {
			/*
			 * If initialization fails then it may be due to the submodule
			 * not being populated in the superproject's worktree.  Instead
			 * we can try to initialize the submodule by finding it's gitdir
			 * in the superproject's 'modules' directory.  In this case the
			 * submodule would not have a worktree.
			 */
			strbuf_reset(&gitdir);
			strbuf_repo_git_path(&gitdir, superproject,
					     "modules/%s", sub->name);

			if (repo_init(subrepo, gitdir.buf, NULL)) {
				ret = -1;
				goto out;
			}
		}

		subrepo->submodule_prefix = xstrfmt("%s%s/",
						    superproject->submodule_prefix ?
						    superproject->submodule_prefix :
						    "", sub->path);

	out:
		strbuf_release(&gitdir);
		strbuf_release(&worktree);
		return ret;
	}

Ah, that populates a complete `struct repository`!

I fear, however, that our object lookup is currently not tied to such a
`struct repository` instance. So I think that this TODO can only be
addressed once a ton more patch series like
https://lore.kernel.org/git/f1e4da02-9411-8a93-ca62-6d7ae7bf4ae8@xxxxxxxxx/
made it not only to the Git mailing list, but into `master`.

> 2. Something I inferred was that functions with names of the pattern 'strbuf_git_*'
>    are trying to 'create a path'(are they physically creating the path or just
>    instructing git about them?) while functions of the pattern 'git_*' are trying
>    to check some conditions denoted by their function names(for instance
>    'git_config_rename_section_in_file')? Is this inference correct to some extent?

All `strbuf_*()` functions work on our "string class" (I forgot who said
it, but it is true that any sufficiently advanced C project sooner or
later develops their own string data type).

To know whether the functions in question create a path or not, you will
have to find their documentation in the appropriate header file (usually
`strbuf.h`), or absent that, find and understand their implementation
(usually in `strbuf.c`).

> 3. How does one check which all parts of a command have been completed? Is it checked
>    by looking at the file history or by comparing with the shell script of the command
>    or are there any other means?

You mean whether a scripted command has been completely converted to C?
There is no universal way to do that.

In `git submodule`'s instance, I would say that a subcommand is converted
successfully when all parts except for the command-line option parsing
have been moved into the `submodule--helper`.

Eventually, `git-submodule.sh` will only have functions that parse
command-line options and then pass the result on to the helper. At that
point, the command-line option parsing can _also_ be moved into the
helper. Or maybe even the entire script in one go, I am not sure how big
of a patch that would be.

> 4. Is it fine if I am not able to understand the purpose of certain functions right now(such as
>    'add_submodule_odb')? I am able to get a rough idea of what the functions are doing but I am
>    not able to decode certain functions line-by-line.

It is okay not to understand all the details, but if you want to work on
the code, you will need to understand at least the purpose, and if you
want to come up with a project plan (e.g. for GSoC), it will be _really_
helpful to form an understanding of the implementation details, too.

> Currently, I am studying in depth about 'git objects' and the submodule command on the git Documentation.
> What else would you advise me to strengthen my understanding of the code and git in general?

I don't know what in particular you want to strengthen. Typically, a good
way to learn enough about the code base in preparation for Google Summer
of Code or Outreachy is to read the code, and whenever anything is
unclear, try to learn about the data structures and/or the underlying
design by studying the files in `Documentation/` (in particular in the
`technical/` subdirectory) whose names seem relevant.

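Coming back to question 2 for a moment: as a rough illustration, here is a
self-contained sketch of the idea behind a "string class" in C. This is
_not_ Git's actual `strbuf` API; `mini_buf`, `mini_buf_addf()`,
`mini_buf_release()` and `is_dir()` are made-up names for this example
only. The point it tries to show is that formatting a path into such a
buffer only touches memory, so the caller still has to ask the file system
whether the path exists (much like `add_submodule_odb()` calls
`is_directory()` after building the path).

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>

/* Hypothetical stand-in for a growable string type; NOT Git's strbuf. */
struct mini_buf {
	size_t len;   /* bytes in use, not counting the trailing NUL */
	size_t alloc; /* bytes allocated for buf */
	char *buf;    /* NUL-terminated once something was appended */
};

#define MINI_BUF_INIT { 0, 0, NULL }

/* Make sure there is room for `extra` more bytes plus the trailing NUL. */
void mini_buf_grow(struct mini_buf *sb, size_t extra)
{
	size_t needed = sb->len + extra + 1;

	if (needed > sb->alloc) {
		size_t alloc = sb->alloc ? sb->alloc * 2 : 64;
		char *p;

		if (alloc < needed)
			alloc = needed;
		p = realloc(sb->buf, alloc);
		if (!p)
			abort();
		sb->buf = p;
		sb->alloc = alloc;
	}
}

/* Append printf-style formatted text; this only modifies memory. */
void mini_buf_addf(struct mini_buf *sb, const char *fmt, ...)
{
	va_list ap;
	int n;

	/* First pass: measure how many bytes the formatted text needs. */
	va_start(ap, fmt);
	n = vsnprintf(NULL, 0, fmt, ap);
	va_end(ap);
	assert(n >= 0);

	/* Second pass: write the text right after the existing content. */
	mini_buf_grow(sb, (size_t)n);
	va_start(ap, fmt);
	vsnprintf(sb->buf + sb->len, (size_t)n + 1, fmt, ap);
	va_end(ap);
	sb->len += (size_t)n;
}

/* Free the memory and return the buffer to its initial state. */
void mini_buf_release(struct mini_buf *sb)
{
	free(sb->buf);
	sb->buf = NULL;
	sb->len = sb->alloc = 0;
}

/* Return 1 if `path` exists on disk and is a directory, 0 otherwise. */
int is_dir(const char *path)
{
	struct stat st;

	return !stat(path, &st) && S_ISDIR(st.st_mode);
}
```

With this, appending "%s/%s" with some submodule git directory and
"objects/" to an empty `mini_buf` gives you the joined string in `buf`,
and `is_dir()` on it still reports 0 unless that directory really exists.
Whether a given `strbuf_*()` function in Git behaves the same way is
something you still need to verify in `strbuf.h` or the respective
implementation.
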
Ciao,
Johannes

>
> Regards,
> Shourya Shukla
>
> [1]: https://github.com/periperidip/git/blob/v2.25.0/submodule.c
> [2]: https://lore.kernel.org/git/20200201173841.13760-1-shouryashukla.oo@xxxxxxxxx/
> [3]: https://github.com/periperidip/git/blob/v2.25.0/submodule.c#L168