Re: in case you want a use-case with lots of submodules

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Jun 19, 2017 at 1:20 PM, Yaroslav Halchenko <yoh@xxxxxxxxxxxxxx> wrote:
>
> On Mon, 19 Jun 2017, Stefan Beller wrote:
>
>> On Mon, Jun 19, 2017 at 8:59 AM, Yaroslav Halchenko <yoh@xxxxxxxxxxxxxx> wrote:
>> > Hi All,
>
>> > On a recent trip I've listened to the git minutes podcast episode and
>> > got excited to hear  Stefan Beller (CCed just in case) describing
>> > ongoing work on submodules mechanism.  I got excited, since e.g.
>> > performance improvements would be of great benefit to us too.
>
>> If you're mostly interested in performance improvements of the status
>> quo (i.e. "make git-submodule fast"), then the work of Prathamesh
>> Chavan (cc'd) might be more interesting to you than what I do.
>> He is porting git-submodule (which is mostly a shell script nowadays)
>> to C, such that we can save a lot of process invocations and can do
>> processing within one process.
>
> ah -- cool.  I would be eager to test it out, thanks!  would be
> interesting to see if it positively affects our overall performance.
> Pointers to that development would be welcome!

The latest from today:
https://public-inbox.org/git/CAME+mvUQJFneV7b1G7zmAidP-5L=nimvY43V0ug-Gtesr83tzg@xxxxxxxxxxxxxx/


>
>> > http://datasets.datalad.org ATM provides quite a sizeable (ATM 370
>> > repositories, up to 4 levels deep) hierarchy of git/git-annex
>> > repositories all tied together via git submodules mechanism.  And as the
>> > collection grows, interactions with it become slower, so additional
>> > options (such as --ignore-submodules=dirty  to status) become our
>> > friends.
>
>> I am not as much concerned about the 370 number than about the
>> 4 layers of nesting. In my experience the nested submodule case
>> is a little bit error prone and the bug reports are not as frequent as
>> there are not as many users of nesting, yet(?)
>
> well -- part of the story here is that we are forced to use/have full
> blown .git/ directories (for git-annex symlinks to content files to
> work) within submodules instead of .git file with a reference under
> parent's .git/modules.   So we can 'slice' at any level and I
> guess that is why may be avoiding some possibly issues due to nesting
> and the "parent has all .git/modules" approach.

That sounds like you either want to configure to have the submodules
git dirs in-place or you want to convince git-annex to learn about the
gitdir pointer files.

>
>> In a neighboring thread on the mailing list we have a discussion
>> on the usefulness of being on branches than in detached HEAD
>> in the submodules.
>> https://public-inbox.org/git/0092CDD27C5F9D418B0F3E9B5D05BE08010287DF@xxxxxxxxxxxxxxxxxxxxxxxx/
>
>> This would not break non-ambiguously, rather it would add
>> ease of use.
>
> that is indeed a common caveat... I am not sure if any heuristic
> approach would provide a 'bullet proof' solution.  I might even prefer a
> hardcoded 'branch-name' to be listed/associated with each submodule
> within .gitmodules.

hardcoded as submodule.NAME.branch, maybe?
https://git-scm.com/docs/gitmodules

>  In the datalad case, detached HEAD is common

So you are accustomed to detached HEADs and would not
gain much from being back on a branch?  That's cool, too.


> whenever someone installs "outdated" (branch of which progressed
> forward) submodule.  In this case we just check if the branch after "git
> clone"  (but before git submodule update) includes the pointed by
> Subproject commit, and if so -- we announce that it must be the branch
> (so far it is always "master" branch anyways ;) )

heh, having just one branch. That is retro-style. :)



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux