Elijah Newren <newren@xxxxxxxxx> wrote on Wed, Sep 21, 2022 at 09:48:
>
> On Tue, Sep 20, 2022 at 5:42 AM ZheNing Hu <adlternative@xxxxxxxxx> wrote:
> >
> > Hey, guys,
> >
> > If two users of a git monorepo are working on different subprojects,
> > /project1 and /project2, via partial clone and sparse checkout, and
> > user one pushes first, then when user two wants to push too, he must
> > pull some blobs pushed by user one.
>
> This is not true. While user two must pull the new commit and any new
> trees pushed by user one (which will mean knowing the hashes of the
> new files), there is no need to download the actual content of the new
> files unless and until some git command is run that attempts to view
> the file's contents.
>

Yeah, now I understand that git fetch will not download blobs outside
the sparse-checkout pattern, but git merge will, so git pull will
download some missing blobs here.

> > The large number of interruptions in git push may be another
> > problem: if thousands of projects are in one monorepo, and
> > no one else has any code that would conflict with mine in any way,
> > do I still need to pull every time? Is there a way to make
> > improvements here?
>
> No, you only need to pull when attempting to push back to the server.
>
> Further, if you're worried that the second push will fail, you could
> easily script it and put "pull --rebase && push" in a loop until it
> succeeds (I mean, you did say no one would have any conflicts). In
> fact, you could just make that a common script distributed to your
> users and tell them to run that instead of "git push" if they don't
> want to worry about manually updating.
>

Ah, this method looks a little funny, but it may work. This issue may
also apply to some code review tools, which might need a similar
"pull --rebase && git cr" loop.
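The retry wrapper suggested above could be sketched roughly like this
(the function name and the retry cap are illustrative, not from the
thread; it assumes, as stated, that rebases never hit conflicts):

```shell
# push_with_retry: loop "git push" / "git pull --rebase" until the push
# lands, for the case where pushes race but never conflict.
push_with_retry() {
    max_tries=${1:-10}   # illustrative safety cap, not part of the suggestion
    i=1
    while [ "$i" -le "$max_tries" ]; do
        if git push; then
            return 0
        fi
        # Push was rejected (someone else pushed first): fetch their
        # commits and replay our local work on top, then try again.
        git pull --rebase || return 1
        i=$((i + 1))
    done
    echo "push still failing after $max_tries attempts" >&2
    return 1
}
```

Distributed as a common script, users would run it in place of a bare
"git push".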
> Now, if you have thousands of nearly fully independent subprojects and
> lots of developers for each subproject and they all commit & push
> *very* frequently, I guess you might be able to eventually get to the
> scale where you are worried there will be so much contention that the
> script will take too long. I'd be surprised if you got that far, but
> even if you did, you could easily adopt a lieutenant-like workflow
> (somewhat like the linux kernel, but even simpler given the
> independence of your projects). In such a workflow, you'd let people
> in subprojects push to their subproject fork (instead of to the "main"
> or "central" repository), and the lieutenants of the subprojects then
> periodically push work from that subproject to the main project in
> batches.
>

Makes sense. When this monorepo really reaches that kind of scale,
splitting the workflow might be the right thing to do.

> I don't really see much need here for improvements, myself.
>
> > Here's an example of how two users constrain each other when git push.
>
> Did you pay attention to warnings you got along the way? In particular...
>
> > git clone --bare mono-repo
>
> You missed the following command right after your clone:
>
>     git -C mono-repo.git config uploadpack.allowFilter true
>
> > # user1
> > rm -rf m1
> > git clone --filter="blob:none" --no-checkout --no-local ./mono-repo.git m1
>
> Since you forgot to set the important config I mentioned above, your
> command here generates the following line of output, among others:
>
>     warning: filtering not recognized by server, ignoring
>
> This warning means you weren't testing partial clones, but regular
> full clones. Perhaps that was the cause of your confusion?

Oh, sorry for forgetting to record this; I have configured them globally:

    uploadpack.allowanysha1inwant=true
    uploadpack.allowfilter=true

Thanks for the answer,
ZheNing Hu
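For reference, the setup under discussion can be reproduced end to end
with the server-side config in place; this is a minimal sketch in a
throwaway directory (the repository names mirror the example in the
thread):

```shell
# Run everything in a scratch directory.
cd "$(mktemp -d)"

# A tiny upstream repository with one commit to clone from.
git init -q mono-repo
git -C mono-repo -c user.name=test -c user.email=test@example.com \
    commit -q --allow-empty -m 'initial commit'
git clone -q --bare mono-repo mono-repo.git

# The server-side switches that were missing in the original test:
git -C mono-repo.git config uploadpack.allowFilter true
git -C mono-repo.git config uploadpack.allowAnySHA1InWant true

# With allowFilter set, the blob:none clone is a real partial clone:
# there is no "filtering not recognized by server, ignoring" warning,
# and the clone records the promisor remote and filter in its config.
git clone -q --filter=blob:none --no-checkout --no-local ./mono-repo.git m1
git -C m1 config remote.origin.partialclonefilter  # prints the recorded filter
```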