On Tue, Jul 5, 2022 at 6:08 AM Dian Xu <dianxudev@xxxxxxxxx> wrote:
>
> Hi Elijah,

Hi Dian,

Please don't top post on this list. It'd also help to respond to the
relevant email instead of picking a different email in the thread to
put your answers in. Anyway, that aside...

> Please see answers below:
>
> 1. H: 2.27m; S: 7.7k; Total: 2.28m
>
> 2. Sure I will run 'reapply' after the sparse-checkout file has
> changed. Just curious, do I have to run 'reapply' if 'checkout' is the
> next immediate cmd? I thought 'checkout' does the updating index as
> well
>
> 3. I simply added one file only, 'git add' and 'git add --sparse'
> still hang. Let me know if you need me to send you any debug info from
> pathspec.c/dir.c
>
> 4. Good to know and we are investigating if we have a way out from --no-cone
>
> 5. I should've been clearer: The experiment done here uses 2.37.0

Thanks for providing these details. It was enough to at least get me
started, and from my experiments, it appears the arguments to `git
add` are important. In particular, I could not trigger this when
passing actual filenames that existed. I could when passing a fake
filename.

Here are the concrete steps I used to reproduce:

    git clone git@xxxxxxxxxx:newren/gvfs-like-git-bomb
    cd gvfs-like-git-bomb
    git init attempt
    cd attempt
    ../make-a-git-bomb.sh
    time git checkout bomb
    echo "/*" >.git/info/sparse-checkout
    echo '!/bomb/j/j/' >>.git/info/sparse-checkout
    for i in $(seq 1 10000); do
        printf '!some/random/file/path-%05d\n' $i
    done >>.git/info/sparse-checkout
    git config core.sparseCheckout true
    time git sparse-checkout reapply
    echo hello >world
    time git add --sparse world nonexistent
    time git rm --cached --sparse world nonexistent
    time git add world nonexistent
    time git rm --cached world nonexistent

This sequence of steps will (1) clone a repo with 2 files, (2) create
another repository in subdirectory 'attempt' that has 1000001 files
(but only two unique files, and only six or so unique trees) in a
branch called 'bomb', (3) check it out, (4) create 10002 patterns for
the sparse-checkout file (only the first 2 of which match anything)
which will leave ~99% of files still present (990001 files checked out
and 10000 files sparse) and turn on sparsity, (5) measure how long it
takes to add and remove a file from the index, both with and without
the --sparse flag, but always listing an extra path that won't match
anything.

The timings I see for the setup steps are:

    4m10.444s  checkout bomb
    1m0.380s   sparse-checkout reapply

And the timings for the add/rm steps are:

    4m43.353s  add --sparse world nonexistent
    9m25.666s  add world nonexistent
    0m0.129s   rm --cached --sparse world nonexistent
    9m23.601s  rm --cached world nonexistent

which shows that 'rm' also has a performance problem without the
'--sparse' flag (which seems like another bug).

Now, if I remove the 'nonexistent' argument from the commands, then
the timings drop to:

    0m0.236s   add --sparse world
    0m0.233s   add world
    0m0.175s   rm --cached --sparse world
    4m43.744s  rm --cached world

So, I can reproduce some slowness.
'rm' without --sparse seems buggily slow for either set, whereas 'add'
is only slow when given a fake path.

You never mentioned anything about the arguments you were passing to
`git add`, so I don't know whether you are using specific filenames
that just don't exist (like I did above), or globs that perhaps match
some files, or something else. That might be useful to know.

But there appears to be something here for both 'add' and 'rm' that we
could look into optimizing. I don't have time right now. I'm not sure
if someone else has some time to look into it; if no one else does,
I'll eventually try to get back to it.
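
For comparing a test repository against the "H: ...; S: ..." figures
quoted at the top, here is a minimal sketch of one way to get such
counts. It assumes those figures are tallies of the status tags printed
by `git ls-files -t` (H for ordinary cached entries, S for
skip-worktree entries); the thread does not say how they were actually
produced. The helper name count_tags is made up for illustration.

```shell
# Tally index entries by their `git ls-files -t` status tag.
# Reads `git ls-files -t` output on stdin (one "<tag> <path>" per line)
# and prints one "<tag>: <count>" line per tag, sorted by tag.
# count_tags is a hypothetical helper name, not a git command.
count_tags() {
    awk '{ n[$1]++ } END { for (t in n) printf "%s: %d\n", t, n[t] }' | sort
}

# Typical use, from inside a working tree:
#   git ls-files -t | count_tags
```

Only the first field of each line is counted, so paths containing
spaces do not affect the tally.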