> -----Original Message-----
> From: Stefan Beller [mailto:sbeller@xxxxxxxxxx]
> Sent: Friday, October 07, 2016 2:56 PM
> To: David Turner
> Cc: git@xxxxxxxxxxxxxxx
> Subject: Re: Uninitialized submodules as symlinks
>
> On Fri, Oct 7, 2016 at 11:17 AM, David Turner <David.Turner@xxxxxxxxxxxx>
> wrote:
> > Presently, uninitialized submodules are materialized in the working tree
> > as empty directories.
>
> Right, there has to be something, to hint at the user that creating a file
> with that path is probably not what they want.
>
> > We would like to consider having them be symlinks. Specifically, we'd
> > like them to be symlinks into a FUSE filesystem which retrieves files on
> > demand.
> >
> > We've actually already got a FUSE filesystem written, but we use a
> > different (semi-manual) means to connect it to the initialized submodules.
>
> So you currently do a
>
>     git submodule init <pathspec>
>     custom-submodule make-symlink <pathspec>
>
> ?

We do something like

For each initialized submodule:
    symlink it into the right place in .../somedir
For each uninitialized submodule:
    symlink from the FUSE into the right place in .../somedir

So .../somedir has the structure of the git main repo, but is all
symlinks -- some into FUSE, some into the git repo.

This means that when we initialize (or deinitialize) a submodule, we need
to re-run the linking script.

> > We hope to release this FUSE filesystem as free software at some point
> > soon, but we do not yet have a fixed schedule for doing so. Having to run
> > a command to create the symlink-based "union" filesystem is not optimal
> > (since we have to re-run it every time we initialize or deinitialize a
> > submodule).
> >
> > But if the uninitialized submodules could be symlinks into the FUSE
> > filesystem, we wouldn't have this problem.
> > This solution isn't
> > necessarily FUSE-specific -- perhaps someone would want copies of the same
> > submodule in multiple repos, and would want to save disk space by having
> > all copies point to the same place. So the symlinks would be configured
> > by a per-submodule config variable.
>
> I'd imagine that you want both a per-submodule config variable as well as
> a global variable that is a default for all submodules?
>
>     git config submodule.trySymlinkDefault /mounted/fuse/
>     # any (new) submodule tries to be linked to /mounted/fuse/<path>
>     git config submodule.<name>.symlinked ~/my/private/symlinked
>     # The <name> submodule goes into another path.
>
> As you propose the FUSE filesystem fetches files on demand, you probably
> want to disable things that scan the whole submodule, e.g. look at
> submodule.<name>.ignore to suppress status looking at all files.

I would actually expect that git would detect that the symlink is
unmodified from the configured symlink and automatically decide not to
look there.

> When looking through the options, you could add the value "symlink" to
> submodule.<name>.update, which then respects the
> submodule.trySymlinkDefault if present, such that
>
>     git clone --recurse-submodules ...
>
> works and sets up the FUSE thing correctly.
>
> How does the FUSE system handle different versions, i.e.
> `git submodule update` to checkout another version of the submodule?
> (btw, I plan on working on integrating submodules to "git checkout", so
> "submodule update" would not need to be run there, but we'd hook it into
> checkout instead)

The FUSE filesystem has a (virtual) directory for each SHA of the main
repo, with each submodule mapped to the then-current version of the
submodule's code. Actually, it's a bit more complicated because the
uninitialized modules point to already-built binaries -- that is, the
symlink is to something like $fuse/$SHA/built/$submodule.
If you check out a new version of the main module, in our current setup,
you need to again update all of the submodule symlinks (as described
above). Under my proposal, I guess this would still need to happen. A
post-checkout hook could handle it either way. Despite this flaw,
switching a submodule between an initialized and deinitialized state
would still be more seamless with the symlinks.

> > Naturally, this would require some changes to code that examines the
> > working tree -- git status, git diff, etc. They would have to report
> > "unchanged" for submodules which were still symlinks to the configured
> > location. I have not yet looked at the implementation details beyond
> > this.
> >
> > Does this idea make any sense? If I were to implement it (probably in a
> > few months, but no official timeline yet), would patches be considered?
>
> I am happy to review patches.

Thanks.
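
For readers following the thread, the linking step described above
(initialized submodules linked into the git repo, uninitialized ones
linked into the FUSE tree) could be sketched roughly as follows. This is
only an illustration under stated assumptions, not the actual script: the
function names, the choice of `git submodule status` output as input, and
the example paths are all hypothetical. Only the
$fuse/$SHA/built/$submodule layout is taken from the message.

```python
from pathlib import Path


def parse_submodule_status(text):
    """Turn `git submodule status` output into (path, initialized) pairs.

    git prefixes each line with a one-character flag; '-' marks a
    submodule that is not initialized.
    """
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue
        flag, rest = line[0], line[1:]
        fields = rest.split()
        # fields[0] is the commit id, fields[1] the submodule path.
        # Paths containing whitespace are not handled in this sketch.
        entries.append((fields[1], flag != "-"))
    return entries


def plan_links(entries, repo_root, fuse_root, superproject_sha):
    """Map each submodule path to the place its symlink should point.

    Assumption: initialized submodules point at their working tree in
    the git repo; uninitialized ones point at
    <fuse_root>/<sha>/built/<path>.
    """
    plan = {}
    for path, initialized in entries:
        if initialized:
            target = Path(repo_root) / path
        else:
            target = Path(fuse_root) / superproject_sha / "built" / path
        plan[path] = target
    return plan


def apply_links(plan, dest_root):
    """Create (or refresh) the symlinks under dest_root (the 'somedir')."""
    for rel_path, target in plan.items():
        link = Path(dest_root) / rel_path
        link.parent.mkdir(parents=True, exist_ok=True)
        # Replace a stale link; a real file or directory at this path
        # is left alone and symlink_to() will raise FileExistsError.
        if link.is_symlink():
            link.unlink()
        link.symlink_to(target)
```

Re-running apply_links() after a submodule is initialized or
deinitialized, or from a post-checkout hook with the new main-repo SHA,
would refresh the links. That is the manual step the proposal aims to
make unnecessary.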