Re: [WIP/RFC 00/23] repository object

Duy Nguyen <pclouds@xxxxxxxxx> · Mon, 29 May 2017 17:36:37 +0700

On Tue, May 23, 2017 at 2:35 AM, Jeff King <peff@xxxxxxxx> wrote:
> On Thu, May 18, 2017 at 04:21:11PM -0700, Brandon Williams wrote:
>
>> When I first started working on the git project I found it very difficult to
>> understand parts of the code base because of the inherently global nature of
>> our code.  It also made working on submodules very difficult.  Since we can
>> only open up a single repository per process, you need to launch a child
>> process in order to process a submodule.  But you also need to be able to
>> communicate other stateful information to the children processes so that the
>> submodules know how best to format their output or match against a
>> pathspec...it ends up feeling like layering on hack after hack.  What I would
>> really like to do, is to have the ability to have a repository object so that I
>> can open a submodule in-process.
>
> We could always buy in fully to the multi-process model and just
> implement a generic RPC protocol between the parent and submodule gits.
> Does CORBA still exist?
>
> (No, I am not serious about any of that).

CORBA or not, submodule IPC is a real pain. That was my impression
when reading the super-prefix changes a few weeks ago. Some operations
might benefit from staying in the same process, but probably not all
of them (and we lose process protection, which is sometimes a good
thing).

>> This is still very much in a WIP state, though it does pass all tests.  What
>> I'm hoping for here is to get a discussion started about the feasibility of a
>> change like this and hopefully to get the ball rolling.  Is this a direction we
>> want to move in?  Is it worth the pain?
>
> I think the really painful part is going to be all of the system calls
> that rely on global state provided by the OS. Like, say, every
> filesystem call that expects to find working tree files without
> prepending the working tree path.
>
> That said, even if we never reached the point where we could handle all
> submodule requests in-process, I think sticking the repo-related global
> state in a struct certainly could not hurt general readability. So it's
> a good direction regardless of whether we take it all the way.

I doubt we would ever reach the point where libgit.a can handle all
submodule operations in-process either. That would put libgit.a in
direct competition with libgit2. I do hope, though, that having
clearer, more modular data structures will improve readability, not
hurt it (e.g. you see the data model and can largely guess how the
code interacts before digging in deep).
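
As a rough illustration of what "seeing the data model" could look
like, here is a minimal sketch. Every name in it (struct repo_sketch,
repo_sketch_worktree_path) is made up for this example and is not
taken from the series. It also touches on the filesystem point above:
paths are built from the struct rather than from the process-wide
current directory.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical: per-repository state that would otherwise be global. */
struct repo_sketch {
	const char *gitdir;	/* e.g. "/src/super/.git/modules/sub" */
	const char *worktree;	/* e.g. "/src/super/sub" */
};

/*
 * Write "<worktree>/<relpath>" into buf instead of assuming that the
 * current directory is the top of the working tree.
 * Returns 0 on success, -1 if the result does not fit in buflen bytes.
 */
static int repo_sketch_worktree_path(const struct repo_sketch *repo,
				     const char *relpath,
				     char *buf, size_t buflen)
{
	size_t wlen, rlen;

	assert(repo && repo->worktree && relpath && buf);

	wlen = strlen(repo->worktree);
	rlen = strlen(relpath);

	/* worktree + '/' + relpath + terminating NUL */
	if (wlen + 1 + rlen + 1 > buflen)
		return -1;

	memcpy(buf, repo->worktree, wlen);
	buf[wlen] = '/';
	memcpy(buf + wlen + 1, relpath, rlen + 1);	/* copies the NUL too */
	return 0;
}
```

With something along these lines, a caller holding two such structs
(superproject and submodule) could resolve "Makefile" in either one
without a chdir() or a child process. Whether every filesystem call
site can realistically be converted this way is exactly the open
question.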
-- 
Duy