Re: [FR] supporting submodules with alternate version control systems (new contributor)

On 10/05/2022 18:20, Jason Pyeron wrote:
-----Original Message-----
From: Junio C Hamano
Sent: Tuesday, May 10, 2022 1:01 PM
To: Addison Klinke <addison@xxxxxxxxx>

Addison Klinke <addison@xxxxxxxxx> writes:

> Is something along these lines feasible?

Offhand, I can think of only one thing that could make it fundamentally
infeasible.

When you bind an external repository (be it stored in Git or
somebody else's system) as a submodule, each commit in the
superproject records which exact commit in the submodule is used
with the rest of the superproject tree.  And that is done by
recording the object name of the commit in the submodule.

What does this mean for a foreign system that wants to "plug into" a
superproject in Git as a submodule?  It is required to do two
things:

  * At the time "git commit" is run at the superproject level, the
    foreign system has to be able to say "the version I have to be
    used in the context of this superproject commit is X", with X
    that somehow can be stored in the superproject's tree object
    (whose slot for it is 20 bytes wide in SHA-1 repositories and
    32 bytes wide in SHA-256 repositories).

  * At the time "git checkout" is run at the superproject level, the
    superproject will learn the above X (i.e. the version of the
    submodule that goes with the version of the superproject being
    checked out).  The foreign system has to be able to perform a
    "checkout" given that X.

If a foreign system cannot do the above two, then it fundamentally
would be incapable of participating in such a "superproject and
submodule" relationship.
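
The two requirements above can be sketched as a small interface. This is only an illustration under stated assumptions: `ForeignVCS` and `ToyVCS` are hypothetical names, nothing like them exists in Git, and the toy backend keeps its snapshots in memory.

```python
import hashlib
from typing import Dict, Protocol


class ForeignVCS(Protocol):
    """Hypothetical interface a non-Git submodule backend would need."""

    def current_version(self) -> bytes:
        """Return X, an identifier that fits the superproject's object-name slot."""
        ...

    def checkout(self, x: bytes) -> str:
        """Materialize the state named by X."""
        ...


class ToyVCS:
    """Toy backend: each version is a snapshot keyed by the SHA-1 of its content."""

    def __init__(self) -> None:
        self._snapshots: Dict[bytes, str] = {}
        self._head = b""

    def save(self, content: str) -> bytes:
        x = hashlib.sha1(content.encode()).digest()  # 20 raw bytes
        self._snapshots[x] = content
        self._head = x
        return x

    def current_version(self) -> bytes:
        return self._head

    def checkout(self, x: bytes) -> str:
        # Raises KeyError if X was never recorded (e.g. never published).
        return self._snapshots[x]


vcs = ToyVCS()
vcs.save("dataset v1")
recorded = vcs.current_version()   # what "git commit" would need to store
vcs.save("dataset v2")
restored = vcs.checkout(recorded)  # what "git checkout" would need to request
```

A backend that cannot produce a fixed-width `recorded` value, or cannot honour `checkout(recorded)` later, fails one of the two requirements.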

Sub-modules already have that problem if the user forgets to publish their sub-module (see notes in the docs ;-).
The submodule "type" could create an object (hashed and stored) that contains the needed "translation" details. The object would be hashed using SHA1 or SHA256 depending on the git config. The format of the object's contents would be defined by the submodule's "code".
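
As a rough sketch of the "translation object" idea: Git names an object by hashing a short header (`<type> <size>` followed by a NUL byte) together with the body, using SHA-1 or SHA-256 depending on the repository's object format. The JSON record below and its field names are assumptions made purely for illustration; the proposal leaves the object's contents to the submodule's own code.

```python
import hashlib
import json


def git_object_name(body: bytes, obj_type: str = "blob", algo: str = "sha1") -> str:
    """Hash a body the way Git names a loose object: header, NUL, then body."""
    header = f"{obj_type} {len(body)}".encode() + b"\0"
    return hashlib.new(algo, header + body).hexdigest()


# Hypothetical translation record pointing at a revision in some data VCS.
record = json.dumps(
    {"vcs": "example-data-vcs", "revision": "r42"}, sort_keys=True
).encode()

name_sha1 = git_object_name(record)                  # 40 hex digits
name_sha256 = git_object_name(record, algo="sha256")  # 64 hex digits
```

Storing the record as an ordinary blob is an assumption here; the point is only that any byte string can be given an object name of the width the superproject expects.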

Another way of looking at the issue is via a variant of Git-LFS with a smudge/clean style filter. I.e. the DataVCS would be treated as a 'file'.

Git LFS already uses .gitattributes to define a 'type', while submodules don't yet have that capability. There is just a single special type of "sub-module" entry within a tree object, namely a mode 160000 commit (see https://longair.net/blog/2010/06/02/git-submodules-explained/).
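
To make the tree-entry point concrete, here is a minimal sketch of how one entry is laid out inside a tree object: the mode in ASCII, a space, the path name, a NUL byte, then the raw object name. The path `data` and the object name used here are placeholders, not values from any real repository.

```python
GITLINK_MODE = "160000"  # mode Git uses for a submodule (commit) entry


def tree_entry(mode: str, name: str, object_name_hex: str) -> bytes:
    """Serialize one tree entry: '<mode> <name>' + NUL + raw object name."""
    return mode.encode() + b" " + name.encode() + b"\0" + bytes.fromhex(object_name_hex)


# Placeholder 20-byte (SHA-1 width) commit name for illustration only.
placeholder_commit = "0123456789abcdef0123456789abcdef01234567"
entry = tree_entry(GITLINK_MODE, "data", placeholder_commit)
```

Nothing in the entry says which kind of system the commit name belongs to, which is why a separate 'type' mechanism would be needed.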

One thought is to use a proper sub-module that itself contains a single 'large' file, Git-LFS style, which holds the hash reference for the data VCS (https://github.com/git-lfs/git-lfs/blob/main/docs/spec.md). The regular sub-module's .gitattributes file would then handle the data conversion.
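
For reference, a Git LFS pointer file as described in the linked spec is a few lines of text: a version line, an `oid` line carrying a SHA-256 of the real content, and a `size` line. The sketch below builds one for arbitrary bytes; how a data VCS revision would map onto the `oid` is an open assumption, not something the spec defines.

```python
import hashlib


def lfs_pointer(content: bytes) -> str:
    """Build a Git LFS style pointer for the given content."""
    oid = hashlib.sha256(content).hexdigest()
    return (
        "version https://git-lfs.github.com/spec/v1\n"
        f"oid sha256:{oid}\n"
        f"size {len(content)}\n"
    )


pointer = lfs_pointer(b"example payload")
```

In the arrangement suggested above, a clean filter would write such a pointer into the sub-module, and a smudge filter would ask the data VCS to restore the content it names.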

It may be converting an X-Y problem into an X-Y-Z solution, or just extending the problem.

--
Philip




