Re: [RFC PATCH 0/5] recursively grep across submodules

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Stefan Beller <sbeller@xxxxxxxxxx> writes:

>> Just a few brief comments, before reading the patches carefully.
>>
>>  * It is somewhat surprising that [1/5] is even needed (in other
>>    words, I would have expected something like this to be already
>>    there, and my knee-jerk reaction was "Heh, how does 'git status'
>>    know how to show submodules that are and are not initialized
>>    differently without this?"
>
> The issue with much of the existing code is that it is submodule centric,
> i.e. it is written to not care about the rest.
>
> git status for example just calls "git submodule summary" to
> parse and display the submodule information additionally.
> It doesn't integrate submodules and treats them "just like files".

Oh, I know all that after/while writing the above "it is somewhat
surprising" and reading what wt-status.c does.  It was just that it
was somewhat surprising ;-)

> My reaction to 1/5 was that the implementation is sound,
> but the design may need rethinking.
>
> Instead of asking all these question, "Is a submodule
> * initialized
> * checked out (== have a working dir)
> * have a .git dir (think of deleted submodules that keep the
>   historical git dir around)
> (* have commit X)
> we would want to either extend the submodule-config API
> to also carry these informations just like
> name/path/sha1/url/shallow clone recommendation.

I think you are going in a wrong direction with all the above.

Unless you are imagining "git grep" to initialize and checkout a
submodule that is not checked out on-demand, I do not think you have
any reason to even look at ".gitmodules" for the purpose of "I want
to grep both in superproject and submodules that are checked out."

You only need to detect .gitlink that exists in the index of the
superproject, and then there would be only two cases:

 * $path has an empty directory (not even .git in there).  The user
   is not interested in that submodule.

 * $path has ".git", either a directory (old layout or we are
   dealing with the repository that originated the submodule) or a
   "gitdir:" file that points into .git/modules of the repository of
   the superproject.  The user is interested in the submodule.

If $path has ".git" and nothing else, the only explanation is that
the user removed the working tree files in the submodule.  If your
grep is looking at working tree files, it is correct not to find
anything in there.  If it is working with "--cached", go look at the
index of the submodule repository (either the ".git" directory, or
the stashed-away repository in .git/modules/ in the superproject).
If it is working with a tree-ish, again, go look at the object store
in that submodule repository.

>>  * It is somewhat surprising that [4/5] does not even use the
>>    previous ls-files to find out the paths.  Also it is a bit
>>    disappointing to see that the way processes are spawned and
>>    managed does not share much with Stefan's earlier work, i.e.
>>    run_processes_parallel().  I was somehow hoping that it can be
>>    extended to support this use case, but apparently there aren't
>>    much to be shared.
>
> I think there are 2 issues here:

There is no issue here.  I was just giving my impressions (i.e.
"somewhat surprising").

> * git-grep already has its own thread pool

I know.  I was expecting that the previous "ls-files" that recurses
will be used to feed into that thread pool, but I didn't find that
in my cursory look at the patch, hence "somewhat surprising".

I hate it when people become overly defensive and start making
excuses when given harmless observations.






[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]