On 2022-10-05 at 11:23:00, Alireza wrote:
> There are a few mechanisms already to improve perf in big repositories
> but they all need a change in usage flow. I had this idea for a while
> now and I'd appreciate your feedback on it.
>
> The "connected mode" essentially means to run all git commands on the
> server and only download relevant stuff locally. To demonstrate the
> usage flow:
>
> git clone --connected <url>        # new repo
> git config fetch.connected true    # existing repo
>
> From there, git is to decide whether or not a command should be sent
> to the server. For instance, if all required refs are present locally,
> it's run on the machine; otherwise it's sent to the server, collecting
> the result and possibly a minimum set of new objects. From the user's
> perspective, all commands are run on the latest revision without an
> explicit (possibly extensive) fetch.
>
> This would make a --connected clone implicitly shallow, but new data
> can be downloaded on demand. User flow is not changed in any other
> ways.

I think you may be interested in partial clone (e.g., a clone with
--filter=blob:none), which downloads the commits and trees but not the
blobs for a repository. The blobs are then automatically downloaded
from the remote on demand. There are other partial clone filters, but
that particular one is the most common and the best supported. Note
that in this case the history is complete and the repository is not
shallow; only the blobs are missing.
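As a rough sketch of the equivalent flow (the <url>, <repo>, and
<branch> placeholders are illustrative, mirroring the ones in your
example above):

    git clone --filter=blob:none <url>  # blobless clone: full history, no blobs
    cd <repo>
    git log --oneline                   # runs locally; commits and trees present
    git checkout <branch>               # missing blobs are fetched on demand

Read-only history commands like the log above never touch the network;
only operations that actually need file contents, such as the checkout,
trigger a fetch from the remote.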
As for performing the work on the server side, I feel like that is
unlikely to happen, since it's very hard to account for the security
concerns of running arbitrary commands, as well as the potentially
unbounded computational cost. Therefore, it's not likely that most
providers would implement such a feature even if it were added.
--
brian m. carlson (he/him or they/them)
Toronto, Ontario, CA