On Fri, Aug 23, 2019 at 6:59 PM <randall.s.becker@xxxxxxxxxx> wrote:
>
> Hi All,
>
> I'm trying to answer a question for a customer on clone performance. They
> are doing at least 2-3 clones a day, of repositories with about 2500 files
> and 10Gb of content. This is stressing the file system.

Can you go into a bit more detail about what "stress" means? Using too much
disk space? Too many IOPS reading/packing? Since you specifically called out
the filesystem, does that mean the CPU/memory usage is acceptable?

Depending on how well-packed the repository is, Git will reuse a lot of the
existing pack (and a "perfectly" packed repository can achieve complete
reuse, with no "Compressing objects" phase at all). Delta islands[1] can
help increase reuse and reduce the need for on-the-fly compression, if the
repository includes a lot of refs that aren't generally cloned.

Another relatively recent addition is uploadpack.packObjectsHook[2], which
can simplify caching of packfiles so they can be reused on subsequent
requests. Whether or not this will be beneficial is likely to be influenced
by how many times the exact same commits are cloned and how much extra disk
space is available for storing cached packs.

Not sure if any of this is helpful, but I hope it will be!

Bryan

[1] https://git-scm.com/docs/git-pack-objects#_delta_islands
[2] https://git-scm.com/docs/git-config#Documentation/git-config.txt-uploadpackpackObjectsHook

> I have tried to convince them that their process is not reasonable and
> that they should stick with existing clones, using branch checkout rather
> than re-cloning for each feature branch. Sadly, I have not been successful
> - not for a lack of trying. Is there any way to improve raw clone
> performance in a situation like this, where status really doesn't matter,
> because the clone's life span is under 48 hours?
>
> TIA,
> Randall
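To make those two suggestions a bit more concrete for the throwaway-clone
case: the delta-island setup is just server-side config plus a repack.
Something along these lines (the refs/heads/ and refs/tags/ patterns below
are only placeholders for whatever ref hierarchies their clients actually
clone; if everything under refs/ gets cloned, islands won't buy much):

    # in the server repository's config
    git config --add pack.island 'refs/heads/'
    git config --add pack.island 'refs/tags/'
    git config repack.useDeltaIslands true

    # repack once so the on-disk deltas respect the islands;
    # later repacks will keep doing so because of the config above
    git repack -ad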
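The uploadpack.packObjectsHook route needs a small wrapper script. Below is
an untested sketch of one way it could look; the paths are made up, cache
eviction is left out entirely, and it only pays off when byte-identical
requests repeat. Note that upload-pack ignores this variable if it is set in
the repository's own config, so it has to go into the system (or another
trusted) config:

    git config --system uploadpack.packObjectsHook /usr/local/bin/pack-objects-cache

And then /usr/local/bin/pack-objects-cache could be something like:

    #!/bin/sh
    # upload-pack passes the original "git pack-objects ..." command line
    # as "$@" and feeds the object request on stdin; whatever we write to
    # stdout goes back to the client as the pack.

    cachedir=/var/cache/git-packs   # must be writable by the server's git user
    mkdir -p "$cachedir"

    request=$(mktemp) || exit 1
    trap 'rm -f "$request"' EXIT
    cat >"$request"

    # key the cache on the arguments plus the full request, so only
    # byte-identical clones are served from the cache
    key=$( (echo "$*"; cat "$request") | sha1sum | cut -d' ' -f1)
    cache="$cachedir/$key.pack"

    if test -s "$cache"
    then
        cat "$cache"
    else
        # generate the pack into the cache first, then stream it out;
        # this buffers the whole pack (losing some streaming latency)
        # but avoids ever caching a truncated pack
        if "$@" <"$request" >"$cache.tmp"
        then
            mv "$cache.tmp" "$cache" && cat "$cache"
        else
            rm -f "$cache.tmp"
            exit 1
        fi
    fi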