RE: Negative patterns in cone mode

"Brown, Chris" <chris.c.brown@xxxxxxxxxxx> · Sat, 8 Apr 2023 21:52:09 +0000

Thanks for the input. I came to a similar conclusion after poring over the docs. We are in a similar situation to yours: by excluding ~50 directories not required at build time we can avoid 18GB of files on disk and halve the file count. I ended up writing a Python script that uses git ls-tree and then converts a few negative patterns specified by the developer into a huge set of positive patterns for all directories in the tree *except* those that should be excluded. The performance is good: the Python script takes around 1s, and it then allows the sparse checkout to operate in cone mode, which completes in seconds. This is great compared to non-cone-mode processing, which takes several minutes to sparse-checkout the same directories expressed directly as negative patterns.
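
For illustration, the conversion described above could look roughly like the following Python sketch. It is not the script mentioned in this message; the function names (list_directories, cone_includes) and the directory names in the usage note are hypothetical. It assumes that `git ls-tree -r -d --name-only` lists every directory in the tree, and that a directory with nothing excluded beneath it can be covered by a single positive entry.

```python
import subprocess
from collections import defaultdict


def list_directories(rev="HEAD"):
    """Return every directory path in the tree at `rev` (runs git in the cwd)."""
    out = subprocess.run(
        ["git", "ls-tree", "-r", "-d", "--name-only", "-z", rev],
        check=True, capture_output=True, text=True,
    ).stdout
    return [p for p in out.split("\0") if p]


def cone_includes(all_dirs, excluded):
    """Expand excluded directories into a positive directory list for cone mode."""
    excluded = {d.strip("/") for d in excluded}

    # Map each directory to its immediate subdirectories ("" is the repo root).
    children = defaultdict(list)
    for d in all_dirs:
        parent = d.rpartition("/")[0]
        children[parent].append(d)

    def has_excluded_descendant(d):
        prefix = d + "/"
        return any(e.startswith(prefix) for e in excluded)

    result = []

    def visit(d):
        if d in excluded:
            return                  # drop this directory and everything below it
        if not has_excluded_descendant(d):
            result.append(d)        # nothing excluded below: one entry covers it
            return
        for child in sorted(children[d]):
            visit(child)            # split only where an exclusion forces it

    for top in sorted(children[""]):
        visit(top)
    return result
```

With directories app, app/src, app/assets, app/assets/raw, docs and tools, excluding app/assets/raw and docs yields app/src and tools; the resulting list could then be fed to `git sparse-checkout set --stdin`. One known gap in this sketch: a directory whose subdirectories are all excluded (app/assets here) is not emitted at all, so files directly inside it are lost too.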

This suggests to me that cone mode *could* be enhanced to natively support a restricted type of negative pattern (exclude this directory and all subdirs) without performance overhead.

The problem with my script is that it is quite complex, generates thousands of positive patterns, and I am not yet 100% convinced that the complexity is worth it over simply paying the cost to download the monorepo.

Chris

-----Original Message-----
From: Rudy Rigot <rudy.rigot@xxxxxxxxx> 
Sent: 08 April 2023 16:43
To: Brown, Chris (DI SW LCS CF) <chris.c.brown@xxxxxxxxxxx>
Cc: git@xxxxxxxxxxxxxxx
Subject: Re: Negative patterns in cone mode

Hi,

> I'm facing an issue with negative patterns in cone mode.
> I can't tell from the docs or git code if I misunderstand the usage, 
> am trying something not supported, or if there is a bug.

My understanding so far, and I would appreciate it if someone could correct me if I'm wrong, is that the point of cone mode is that there can't be negative patterns, and everything is a positive rule, so the match search can stop as soon as a positive rule is found.

My understanding has been that it was designed with large mono-repos in mind: repos made of several independent applications, of which a given developer only needs a few. For instance, if I am an iOS developer, I will configure my sparse checkout to have the back-end code and the iOS code, but not the front-end code and the Android code.

I don't know if that's accurate because I'm not as well-versed in it as I should be, so I would appreciate it if someone could correct my understanding. It is the chief reason we are sticking with non-cone mode for our massive monolith at Salesforce: it is not a mono-repo of independent applications, but one massive monolith in which only a few (very large) files are not needed by all devs.

Thanks in advance to anyone who may have insights.