Re: Bug Report: fetch.recurseSubmodules doesn't affect git pull as otherwise stated in the git config docs

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Glen!

Le 2022-05-06 à 17:50, Glen Choo a écrit :
> Philippe Blain <levraiphilippeblain@xxxxxxxxx> writes:
> 
>> Hi Huang,
>>
>> Le 2022-05-02 à 10:42, Huang Zou a écrit :
>>> Thank you for filling out a Git bug report!
>>> Please answer the following questions to help us understand your issue.
>>>
>>> What did you do before the bug happened? (Steps to reproduce your issue)
>>> 1) Set the following configs:
>>>  git config submodule.recurse true
>>>  git config fetch.recurseSubmodules false
>>> 2) On a repo with submodules, run:
>>> git pull
>>>
>>> What did you expect to happen? (Expected behavior)
>>> git pull doesn't recursively fetch submodules
>>>
>>> What happened instead? (Actual behavior)
>>> Submodules are fetched recursively
>>>
>>> What's different between what you expected and what actually happened?
>>> Submodules are fetched recursively
>>>
>>> Anything else you want to add:
>>> git fetch works as intended. The documentation for fetch.recurseSubmodules
>>> states that "This option controls whether git fetch (and the underlying
>>> fetch in git pull)" so I would naturally expect git pull to behave the same
>>> as git fetch
>>
>> I did not try to reproduce, but I took a look at the code and I think I understand
>> what happens. 
>>
>> When 'git pull' invokes 'git fetch', it does so by specifically using the '--recurse-submodules'
>> flag, see [1]. It sends either 'yes', 'no' or 'on-demand' as value, depending on the value
>> of the local variable 'recurse_submodules'. This variable is initialized to the config value
>> of 'submodule.recurse' in 'git_pull_config' [2], called at [3], and then overwritten by the value given
>> explicitely on the command line [4], parsed at [5].
>>
>> So when 'git fetch' runs when called by 'git pull', it always receive the 
>> '--recurse-submodules' flag, and thus any config for fetch.recurseSubmodules is ignored
>> (since explicit command line flags always have precedence over config values).
> 
> Thanks for looking into this! This seems to agree with my reading of the
> code. I haven't tried to reproduce it either, but the code looks
> obviously incorrect.
> 
>> So one way to fix this would be to also parse 'fetch.recurseSubmodules' in 'git_pull_config',
>> and send the correct value to the 'git fetch' invocation... Or simpler, call 'git fetch' with
>> '--recurse-submodules-default' [9] instead...
> 
> Despite having touched this code fairly recently, I had to do quite a
> rereading to refresh myself on how this works. If I _am_ reading this
> correctly, then I think we actually want to set `--recurse-submodules`
> and not `--recurse-submodules-default`.
> 
> The short story is that the two are not equivalent - when figuring out
> _which_ submodules to fetch (we determine on a submodule-by-submodule
> basis; we don't just say "should we fetch all submodules?"),
> `--recurse-submodules-default` gets overrided by config values, but
> `--recurse-submodules` does not.
> 
> The longer story (which I think is quite difficult to explain, I am also
> a little confused) is that in a recursive fetch,
> `--recurse-submodules-default` is the value of the parent's (we'll call
> it P) `--recurse-submodules`. This only matters when a process, C1, has
> to pass a value for `--recurse-submodules` to its child, C2. The
> precedence order is:
> 
> - C1's --recurse-submodules | fetch.recurseSubmodules |
>   submodule.recurse
> - C2's submodule entry in C1's .git/config
> - C2's entry in C1's .gitmodules
> - C1's --recurse-submodules-default (aka P's --recurse-submodules)
> 
> Specifically, in code:
> 
>   static int get_fetch_recurse_config(const struct submodule *submodule,
>               struct submodule_parallel_fetch *spf)
>   {
>     // Glen: passed in from builtin/fetch, which parses
>     //  --recurse-submodules, fetch.recurseSubmodules, submodule.recurse
>     if (spf->command_line_option != RECURSE_SUBMODULES_DEFAULT)
>       return spf->command_line_option;
> 
>     if (submodule) {
>       // ...
>       // Glen: fetch configuration from .gitmodules
>       int fetch_recurse = submodule->fetch_recurse;
> 
>       key = xstrfmt("submodule.%s.fetchRecurseSubmodules", submodule->name);
>       if (!repo_config_get_string_tmp(spf->r, key, &value)) {
>         // Glen: fetch configuration from .git/config
>         fetch_recurse = parse_fetch_recurse_submodules_arg(key, value);
>       }
>       // ...
>     }
> 
>     // Glen: --recurse-submodules-default
>     return spf->default_option;
>   }
> 
> So `--recurse-submodules-default` really wasn't meant for anything other
> than "fetch" invoking itself in a superproject-submodule setting.
> 
> Of course, I could be entirely wrong and I should just write up a test
> case :). I hope to send one soon.

OK so it's more complicated than I thought (I'm not surprised :P). Thanks for 
the details.

The other thing I thought about, is that even if '--recurse-submodules-default'
worked as I thought it did when I wrote this, it would still not be the right
thing to change builtin/pull to invoke 'git fetch' with it instead, since then
a user invoking e.g. 'git pull --recurse-submodules=false', who has 'fetch.recurseSubmodules=true',
would not get a recursive fetch, since the internal fetch would prioritise 'fetch.recurseSubmodules=true',
but that's contrary to user expectation that command-line flags override config, always...
So whatever is done to fix this, we might need to keep track of where 'recurse_submodules'
was set from, either the command-line or the config.

> 
>> [sidenote]
>> I'm thought for a while that it was maybe not a good idea to change the behaviour
>> in your specific situation. If you have 'submodule.recurse'
>> set to true and 'fetch.recurseSubmodules' set to false, and if the code is changed so that indeed
>> 'git pull' does not fetch recursively, then the code will still try to update the submodule working
>> trees after the end of the operation (merge or rebase), see the end of 'cmd_pull' [6], [7]. This  is
>> OK, because if there are new submodule commits referenced by the superproject and they were not fetched because the 
>> fetch was not recursive, then the call to 'git submodule update' in update_submodules/rebase_submodule
>> should fetch them, so no problem there.
>> [/sidenote]
> 
> I think the bigger question to ask is "what is the intended effect of
> 'submodule.recurse = true' and 'fetch.recurseSubmodules = false'?".

Yes, I agree that it would be nice if Huang clarified that as I'm not sure
either of the use case.

> I
> think it is not surprising to think that recursive worktree operations
> might fail if the submodule commits weren't fetched.

Yes, if ever 'merge' and 'rebase' are taught to obey 'submodule.recurse'
(if only I had time to work on that!), then all hell would break loose
if submodule commits were not fetched when these operations start.

> 
> Perhaps this is just a performance optimization? i.e. after fetching in
> the superproject, the user wants to skip the rev walk that discovers new
> submodule commits.
> 


Cheers,
Philippe.



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux