Re: Problem with --shallow-submodules option

Hi,

Thanks for the clarification, it makes sense now.

Thanks,
    Istvan


On 30 June 2016 at 22:57, Stefan Beller <sbeller@xxxxxxxxxx> wrote:
> On Thu, Jun 30, 2016 at 6:27 AM, Istvan Zakar <istvan.zakar@xxxxxxxxx> wrote:
>> Hello,
>>
>> Thanks for your answers. I tested it after the changes were made on
>> the git server, and it seems to be working. But some other issue came
>> up.
>>
>> We have quite a lot of submodules in our project, so I did a comparison:
>>
>> If I do a clone with these parameters:
>> --jobs 20 --recurse-submodules
>>
>> The clone takes ~53 seconds, and the total size of the folder is around 2 GB.
>>
>> If I add the --shallow-submodules option, the size of the folder is
>> a bit below 1 GB, so the size decreased as I expected, but the time of
>> the clone itself increased to 90 seconds. It seems the last step of
>> the command, checking out the submodules, is executed one by one and
>> not in parallel, so the --jobs parameter seems to have no effect at
>> this step.
>>
>> Is this intentional, or is there some option I missed?
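
The comparison described above could be reproduced with a small timing helper. This is only a sketch: the function names `clone_command` and `timed_clone` are made up for illustration, and the repository URL and destination directory are whatever you pass in. The git options are the ones quoted in the message.

```python
import subprocess
import time


def clone_command(url, dest, jobs=20, shallow_submodules=False):
    """Build the argv for a recursive clone, optionally with shallow submodules."""
    cmd = ["git", "clone", "--jobs", str(jobs), "--recurse-submodules"]
    if shallow_submodules:
        cmd.append("--shallow-submodules")
    return cmd + [url, dest]


def timed_clone(url, dest, **options):
    """Run the clone and return the elapsed wall-clock time in seconds."""
    start = time.monotonic()
    subprocess.run(clone_command(url, dest, **options), check=True)
    return time.monotonic() - start
```

Calling `timed_clone(url, "full")` and then `timed_clone(url, "shallow", shallow_submodules=True)` would give the two durations to compare; the folder sizes have to be measured separately.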
>
> It was intentional at the time of submitting the patches.
> The checkout phase is a bit complicated as it combines the
> newly cloned submodules as well as the submodules to incrementally
> fetch into one bucket and treats them the same.
>
> And for submodules that were fetched incrementally you may run into problems
> when combining that with the local state (e.g. rebase or merge configured in
> `submodule.<name>.update` or passed on the command line), which requires
> human interaction (resolving the merge conflict), which we want to present one
> at a time to the user.
>
> It is not quite clear how to handle this for the user, i.e. when to stop; see:
> 15ffb7cde48b73b3d5ce259443db7d2e0ba13750 (submodule update: continue
> when a checkout fails)
> 877449c136539cf8b9b4ed9cfe33a796b7b93f93 (git-submodule.sh: clarify
> the "should we die now" logic)
>
> So we want to die as soon as we see a merge conflict or other
> error that is likely to require some human interaction.
> To do that properly we need to have complicated logic or just update
> one submodule at a time.
>
> For initial checkouts we know that there will be no merge conflicts, i.e.
> it will be a "checkout -f" (with an implicit must_die_on_failure=no)
> So we could run all checkouts of submodules in parallel, too. We'd
> just need to write the patch for that.
>
> As the cloning is already done in parallel, we can hook into the initial
> checkout there easily. I'd build that on top of [1], creating a similar commit.
> In the successful case of `update_clone_task_finished` (the case with
> `!result`  -> return 0;) we would need to add the checkout command to
> the queue instead of just finishing.
>
> [1] https://github.com/gitster/git/commit/665b35eccd39fefd714cb5c332277a6b94fd9386
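
The idea in the last paragraph (queue a checkout when a clone finishes successfully, instead of just finishing) can be modelled outside of git. The following is a toy model only, not git's C code: `update_in_parallel`, `clone` and `checkout` are hypothetical names, and the two callables stand in for child processes that return an exit status.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def update_in_parallel(submodules, clone, checkout, jobs=4):
    """Clone all submodules in parallel; queue a checkout when a clone succeeds.

    `clone` and `checkout` take a submodule name and return 0 on success.
    Returns a dict mapping each name to its final status.
    """
    status = {}
    checkouts = []
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        clones = {pool.submit(clone, name): name for name in submodules}
        # This loop plays the role of the "task finished" handler.
        for future in as_completed(clones):
            name = clones[future]
            if future.result() != 0:
                status[name] = "clone failed"
            else:
                # Success case: add the checkout to the same queue
                # instead of just finishing.
                checkouts.append((name, pool.submit(checkout, name)))
        for name, future in checkouts:
            status[name] = "checked out" if future.result() == 0 else "checkout failed"
    return status
```

Because the checkout is submitted as soon as its clone completes, checkouts of early submodules overlap with clones of later ones. This only models the initial-checkout case, where no merge conflict can require human interaction.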
>
>
>>
>> I'm using git 2.9.0 on client side.
>>
>> Thanks,
>>    Istvan
>>
>> ps: if I update the submodules in parallel with the --depth 1 parameter
>> using xargs, it takes about 18 seconds, so that's a workaround for this
>> issue, but it would be nice to do it with a single command.
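
The exact xargs command line is not given in the message. A rough equivalent of the workaround might look like the sketch below, which assumes it is run from the top level of a superproject that was cloned without its submodules. The helper names are made up for illustration, and running `git submodule init` serially before the parallel updates is an assumption about a sensible ordering, not something stated in the thread.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def parse_submodule_paths(config_output):
    """Parse `git config -z --get-regexp` output into a list of paths.

    With -z, each entry ends with NUL and the key is separated from the
    value by a newline.
    """
    paths = []
    for entry in config_output.split("\0"):
        if entry:
            _key, _, value = entry.partition("\n")
            paths.append(value)
    return paths


def update_command(path, depth=1):
    """Build the argv for a shallow update of a single submodule."""
    return ["git", "submodule", "update", "--depth", str(depth), "--", path]


def shallow_update_all(workers=20):
    """Shallow-update every submodule listed in .gitmodules, in parallel."""
    listing = subprocess.run(
        ["git", "config", "-z", "--file", ".gitmodules",
         "--get-regexp", r"^submodule\..*\.path$"],
        check=True, capture_output=True, text=True,
    ).stdout
    # Register all submodules once, before the parallel phase.
    subprocess.run(["git", "submodule", "init"], check=True)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() forces evaluation so a failed update raises here.
        list(pool.map(
            lambda path: subprocess.run(update_command(path), check=True),
            parse_submodule_paths(listing),
        ))
```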
>>
>>
>>
>>
>> On 22 June 2016 at 17:31, Fredrik Gustafsson <iveqy@xxxxxxxxx> wrote:
>>> On Mon, Jun 20, 2016 at 01:06:39PM +0000, Istvan Zakar wrote:
>>>> I'm working on a relatively big project with many submodules. During
>>>> cloning for testing I tried to decrease the amount of data that needs to be
>>>> fetched from the server by using --shallow-submodules option in the clone
>>>> command. It seems to check out the tip of the remote repo, and if it's not
>>>> the commit registered in the superproject the submodule update fails
>>>> (obviously). Can I somehow tell git to fetch the exact commit I need for my
>>>> superproject?
>>>
>>> Maybe. http://stackoverflow.com/questions/2144406/git-shallow-submodules
>>> gives a good overview of this problem.
>>>
>>> git fetches a branch and is shallow from that branch, which might be
>>> another sha1 than the one the submodule points to (as you say). This
>>> is/was one of the drawbacks of this method. However, since git 2.8,
>>> git will try to fetch the sha1 directly (and not the branch). So then it
>>> will work if(!) the server supports direct access to sha1. This was
>>> previously not allowed due to security concerns (if I recall correctly).
>>>
>>> So the answer is: yes, this will work if you have a recent version of git
>>> and support on the server side for doing this. Unfortunately I'm not
>>> sure which git version is needed on the server side for this to work.
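
To illustrate what "fetch the sha1 directly" means, here is a minimal sketch that reads the commit recorded in the superproject and asks the submodule's remote for exactly that commit. It assumes the submodule repository already exists at the given path and that its remote is called `origin`; the function names are hypothetical. Whether the server accepts such a request depends on its configuration (to my knowledge the `uploadpack.allowReachableSHA1InWant` setting is one way to allow it; that detail is not from this thread).

```python
import subprocess


def parse_gitlink(ls_tree_line):
    """Return the commit ID from a `git ls-tree` line for a submodule.

    Such a line has the form: 160000 commit <sha1><TAB><path>
    """
    meta, _, _path = ls_tree_line.rstrip("\n").partition("\t")
    mode, objtype, sha1 = meta.split()
    if mode != "160000" or objtype != "commit":
        raise ValueError("not a submodule entry: %r" % ls_tree_line)
    return sha1


def fetch_command(sha1, remote="origin", depth=1):
    """Build the argv for a shallow fetch of one specific commit."""
    return ["git", "fetch", "--depth", str(depth), remote, sha1]


def fetch_recorded_commit(submodule_path):
    """Fetch the commit the superproject records for `submodule_path`."""
    line = subprocess.run(
        ["git", "ls-tree", "HEAD", submodule_path],
        check=True, capture_output=True, text=True,
    ).stdout
    sha1 = parse_gitlink(line)
    # Run the fetch inside the submodule's own repository.
    subprocess.run(fetch_command(sha1), cwd=submodule_path, check=True)
    return sha1
```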
>>>
>>> --
>>> Fredrik Gustafsson
>>>
>>> phone: +46 733-608274
>>> e-mail: iveqy@xxxxxxxxx
>>> website: http://www.iveqy.com
>> --
>> To unsubscribe from this list: send the line "unsubscribe git" in
>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html